Tags
| Term | Frequency | Example Contexts |
|---|---|---|
| machine translation | 12019 | |
| 2020.coling-main.210 While novel metrics are proposed every year, a few popular metrics remain as the de facto metrics to evaluate tasks such as image captioning and ***** machine translation *****, despite their known limitations. | ||
| 2018.iwslt-1.8 We propose a method to transfer knowledge across neural ***** machine translation ***** (NMT) models by means of a shared dynamic vocabulary. | ||
| 2006.amta-papers.28 Discriminative training methods have recently led to significant advances in the state of the art of ***** machine translation ***** (MT). | ||
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur ***** machine translation ***** system which consists of verbal suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| 2005.mtsummit-papers.29 Example-based ***** machine translation ***** (EBMT) systems, so far, rely on heuristic measures in retrieving translation examples. | ||
| language model | 6337 | |
| 2020.coling-main.581 Deep pre-trained ***** language model *****s tend to become ubiquitous in the field of Natural Language Processing (NLP). | ||
| 2020.emnlp-main.162 Trained with these contextually generated vokens, our visually-supervised ***** language model *****s show consistent improvements over self-supervised alternatives on multiple pure-language tasks such as GLUE, SQuAD, and SWAG. | ||
| C16-1061 Linguistic constraints are then used to weed out phonotactically ill-formed segmentations, thereby allowing the ***** language model ***** to select the best grammatical segmentation. | ||
| 2021.calcs-1.20 Multilingual ***** language model *****s have shown decent performance in multilingual and cross-lingual natural language understanding tasks. | ||
| 2021.eacl-main.191 The adaptation of pretrained *****language models***** to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. | ||
| corpus | 5444 | |
| 1963.earlymt-1.13 The basis of the study was a ***** corpus ***** of 180,000 running words of Russian physics text prepared for analysis by the Automatic Language Data Processing group at The Rand Corporation; for each sentence of text the syntactic dependency of each word had been previously coded. | ||
| L12-1173 We show the impact of this ***** corpus ***** in the performance of a state-of-the-art SMT system when translating questions. | ||
| L10-1345 As we show in this paper, such annotations are very rich linguistically, since apart from syntax they also incorporate semantics, which does not only ensure that the treebank is guaranteed to be a truly sharable, re-usable and multi-functional linguistic resource, but also calls for the necessity of a better disambiguation of the internal (syntactic) structure of larger units of words, such as compound nouns, since this has an impact on the representation of their meaning, which is of utmost interest if the linguistic annotation of a given ***** corpus ***** is to be further understood as the practice of adding interpretative linguistic information of the highest quality in order to give added value to the ***** corpus *****. | ||
| L16-1548 The ATR labs where this system was invented no longer exist, but the website has been preserved as a ***** corpus ***** containing 1537 samples of synthesised speech from that period (118 MB in aiff format) in 211 pages under various finely interrelated themes. The ***** corpus ***** can be accessed from www.speech-data.jp as well as www.tcd-fastnet.com, where the original code and samples are now being maintained. | ||
| L14-1591 The French and English sub-corpora had been pos-tagged from the onset, using TreeTagger (Schmid, 1994), but the ***** corpus ***** lacked, until now, a tagged version of the Serbian sub-***** corpus *****. | ||
| NLP | 4374 | |
| 2020.acl-main.443 Meanwhile, there is still a lack of fundamental ***** NLP ***** techniques for identifying code tokens or software-related named entities that appear within natural language sentences. | ||
| 2021.acl-long.431 Recent researches have shown that large natural language processing (***** NLP *****) models are vulnerable to a kind of security threat called the Backdoor Attack. | ||
| N19-1129 We hope our analyses will help better assess the usefulness of the rebuttal phase in ***** NLP ***** conferences. | ||
| L08-1582 The Linguistic Data Consortium (LDC) has supported research on statistical machine translations and other ***** NLP ***** applications by creating and distributing a large amount of parallel text resources for the research communities. | ||
| 2020.acl-srw.6 To this end, we started our research by implementing a novel multi-task learner with relaxed annotated data requirements and obtained a performance improvement on two ***** NLP ***** tasks | ||
| neural machine translation | 4112 | |
| 2018.iwslt-1.8 We propose a method to transfer knowledge across ***** neural machine translation ***** (NMT) models by means of a shared dynamic vocabulary. | ||
| P19-1555 In this paper, we present a novel data augmentation method for ***** neural machine translation *****. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. | ||
| 2021.americasnlp-1.27 Our ***** neural machine translation ***** system ranked first in Track two (development set not used for training) and third in Track one (training includes development data). | ||
| 2021.humeval-1.5 Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of ***** neural machine translation *****. | ||
| P17-1012 The prevalent approach to ***** neural machine translation ***** relies on bi-directional LSTMs to encode the source sentence. | ||
| annotation | 3789 | |
| 2020.latechclfl-1.10 The recently introduced Error Annotation Toolkit (ERRANT) tackled this problem by presenting a way to automatically annotate data that contain grammatical errors, while also providing a standardisation for ***** annotation *****. | ||
| L14-1011 This research describes the challenges therein, including the development of new ***** annotation ***** practices that walk the line between abstracting away from language-particular syntactic facts to explore deeper semantics, and maintaining the connection between semantics and syntactic structures that has proven to be very valuable for PropBank as a corpus of training data for Natural Language Processing applications. | ||
| L14-1362 By employing our powerful ***** annotation ***** tool Recon, annotators mark selected entities and relations (including events), coreference relations among these entities and events, and also terms that are semantically related to the relevant relations and events. | ||
| W16-5111 Following the collection of the data, ***** annotation ***** guidelines were created over several iterations, which detail important aspects of social media data ***** annotation ***** and can be used by future researchers for developing similar data sets. | ||
| L06-1128 We include the ***** annotation ***** guidelines, the event classes we categorized, the way we use normal distributions to model vague and implicit temporal information, and how we evaluate inter-annotator agreement | ||
| dataset | 3551 | |
| P17-2103 We create a counterfactual tweet ***** dataset ***** and explore approaches for detecting counterfactuals using rule-based and supervised statistical approaches. | ||
| 2020.coling-main.119 We evaluate our methods to detect cognates on a challenging ***** dataset ***** of twelve Indian languages, namely, Sanskrit, Hindi, Assamese, Oriya, Kannada, Gujarati, Tamil, Telugu, Punjabi, Bengali, Marathi, and Malayalam. | ||
| W17-2512 Because of the design of the ***** dataset *****, in which not all gold parallel sentence pairs are known, these are only minimum values. | ||
| 2020.blackboxnlp-1.17 When BERT is fine-tuned, relational knowledge is forgotten but the extent of forgetting is impacted by the fine-tuning objective but not the size of the ***** dataset *****. | ||
| 2020.coling-main.310 In this paper, we argue that if these SLU ***** dataset *****s are considered together, different knowledge from different ***** dataset *****s could be learned jointly, and there are high chances to promote the performance of each ***** dataset ***** | ||
| embeddings | 3393 | |
| W17-4906 Variations on dataset size and on the kinds of ***** embeddings ***** are also investigated. | ||
| 2020.lrec-1.252 By merging advances in word ***** embeddings ***** with traditional machine learning models and model ensembling, prediction accuracy is at an acceptable level to produce a large silver-standard corpus despite the small gold-standard corpus training set. | ||
| 2021.conll-1.12 Our grounded ***** embeddings ***** are publicly available here. | ||
| W18-6317 The quality of the resulting ***** embeddings ***** are evaluated on parallel corpus reconstruction and by assessing machine translation systems trained on gold vs. mined sentence pairs | ||
| 2020.findings-emnlp.250 Word-***** embeddings ***** are vital components of Natural Language Processing (NLP) models and have been extensively explored. | ||
| datasets | 3378 | |
| 2020.winlp-1.18 This work details the organisation of the AI4D - African Language Dataset Challenge, an effort to incentivize the creation, curation and uncovering of African language ***** datasets ***** through a competitive challenge, particularly ***** datasets ***** that are annotated or prepared for use in a downstream NLP task. | ||
| 2021.maiworkshop-1.10 However, in many real-world ***** datasets *****, additional modalities are included which the Transformer does not directly leverage. | ||
| 2021.emnlp-main.316 Extensive experiments show that CATE achieves better performance against state-of-the-art baselines on several benchmark ***** datasets *****. | ||
| 2021.emnlp-main.215 Experimental results on six benchmark ***** datasets ***** show that KIEMP outperforms the existing state-of-the-art keyphrase extraction approaches in most cases | ||
| 2020.coling-main.278 Our proposed LaAP-Net outperforms existing approaches on three benchmark ***** datasets ***** for the text VQA task by a noticeable margin. | ||
| corpora | 3125 | |
| 2021.sigtyp-1.3 Thus, validity and consistency of multilingual ***** corpora ***** should be tested through application tasks involving syntactic structures with PoS tags, dependency labels, and universal features. | ||
| 2020.conll-1.16 The CoNLL-2003 corpus for English-language named entity recognition (NER) is one of the most influential ***** corpora ***** for NER model research. | ||
| L10-1153 We investigate which distributional properties should be present in a tagset by examining different mappings of various current part-of-speech tagsets, looking at English, German, and Italian ***** corpora *****. | ||
| 2021.acl-demo.8 ParCourE can be set up for any parallel corpus and can thus be used for typological research on other ***** corpora ***** as well as for exploring their quality and properties | ||
| L14-1591 The French and English sub-***** corpora ***** had been pos-tagged from the onset, using TreeTagger (Schmid, 1994), but the corpus lacked, until now, a tagged version of the Serbian sub-corpus. | ||
| semantic | 2946 | |
| 2021.gwc-1.13 We thus employ a large-scale data-driven linguistically motivated analysis afforded by the rich derivational and morpho***** semantic ***** description in WordNet to the end of capturing finer regularities in the process of derivation as represented in the ***** semantic ***** properties of the words involved and as reflected in the structure of the lexicon. | ||
| L14-1300 Particular attention is drawn on the use of NLP deep ***** semantic ***** methods to help in data processing. | ||
| 2020.lrec-1.389 The Open Multilingual Wordnet has 34 languages (11 shared with TUFS) organized into synsets linked by ***** semantic ***** relations, with examples and definitions for some languages. | ||
| 2020.acl-main.291 In this paper, we proposed a Semantic-Emotion Knowledge Transferring (SEKT) model for cross-target stance detection, which uses the external knowledge (***** semantic ***** and emotion lexicons) as a bridge to enable knowledge transfer across different targets. | ||
| 2020.challengehml-1.8 This work utilizes the image captioning model to capture the ***** semantic *****s of the input image and a modular design to generate a probability distribution for ***** semantic ***** topics | ||
| natural language | 2766 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual ***** natural language ***** understanding tasks. | ||
| 2021.naacl-main.342 In the pursuit of ***** natural language ***** understanding, there has been a long standing interest in tracking state changes throughout narratives. | ||
| 1993.iwpt-1.7 Marcus demonstrated that it was possible to construct a deterministic grammar/interpreter for a subset of ***** natural language ***** [Marcus, 1980]. | ||
| D19-1384 We demonstrate that complex linguistic behavior observed in ***** natural language ***** can be reproduced in this simple setting: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. | ||
| 2021.semeval-1.44 There is currently a gap between the ***** natural language ***** expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. | ||
| pre-training | 2690 | |
| 2021.emnlp-main.75 *****Pre-trained***** Transformer language models (LM) have become go-to text representation encoders. | ||
| D17-1056 Therefore, this study proposes a word vector refinement model that can be applied to any *****pre-trained***** word vectors (e.g., Word2vec and GloVe). | ||
| W18-2404 Specifically, we employ *****pre-trained***** word embeddings to characterize the semantic relationship between utterances and labels. | ||
| 2020.pam-1.7 In this work we adapt and evaluate a few-shot learning approach, Matching Networks (Vinyals et al., 2016), to conversational strategies of a robot interacting with a human tutor in order to efficiently learn to categorise objects that are presented to it and also investigate to what degree transfer learning from *****pre-trained***** models on images from different contexts can improve its performance. | ||
| D19-1625 Our findings indicate that, despite the recent successes of large language models on tasks aimed to assess commonsense knowledge, these models do not greatly outperform simple word-level models based on *****pre-trained***** word embeddings. | ||
| lexical | 2622 | |
| 2021.naacl-main.371 Furthermore, ESCHER can nimbly combine data annotated with senses from different ***** lexical ***** resources, achieving performances that were previously out of everyone's reach. | ||
| L10-1043 We demonstrate how the structured data available in Encarta and the ***** lexical ***** semantic relations between words in MindNet can be used to enrich MindNet with semantic relations between entities. | ||
| 2020.starsem-1.3 In other words, instead of proving sentence-level inference relations with the help of ***** lexical ***** relations, the ***** lexical ***** relations are proved taking into account the sentence-level inference relations. | ||
| 2021.ranlp-1.3 The main objective of this paper is to see how one can benefit from the ***** lexical ***** similarity found in Indian languages in a multilingual scenario | ||
| W18-3805 on verbs that share the same ***** lexical ***** morpheme and are derived from other verbs via prefixation, suffixation and/or stem alternations. | ||
| question answering | 2371 | |
| 2021.emnlp-main.293 Information seeking is an essential step for open-domain ***** question answering ***** to efficiently gather evidence from a large corpus. | ||
| 2021.naacl-main.193 Comprehensive experiments on three video-and-language tasks (text-to-video retrieval, video captioning, and video ***** question answering *****) across five datasets demonstrate that our approach outperforms previous state-of-the-art methods. | ||
| C16-1191 Natural language generation (NLG) is an important component of ***** question answering ***** (QA) systems which has a significant impact on system quality. | ||
| N19-1403 We use a stand-alone ***** question answering ***** (QA) system to perform QA task and a Natural Language Inference (NLI) system to identify the relations between the choice pairs. | ||
| 2020.findings-emnlp.171 Finally, simply finetuning this pre-trained QA model into specialized models results in a new state of the art on 10 factoid and commonsense ***** question answering ***** datasets, establishing UNIFIEDQA as a strong starting point for building QA systems. | ||
| language pair | 2245 | |
| W19-5423 We have submitted systems for the Portuguese ↔ Spanish ***** language pair *****, in both directions. | ||
| L16-1004 While Edit Distance as such does not express cognitive effort or time spent editing machine translation suggestions, we found that it correlates strongly with the productivity tests we performed, for various ***** language pair *****s and domains. | ||
| 2021.naacl-main.311 Experimental results on several ***** language pair *****s show that the proposed methods substantially outperform conventional UNMT systems. | ||
| 2020.wmt-1.15 This paper describes Tilde's submission to the WMT2020 shared task on news translation for both directions of the English-Polish ***** language pair ***** in both the constrained and the unconstrained tracks. | ||
| Q18-1022 This is an instance of multitask learning, where individual tasks (***** language pair *****s) benefit from sharing knowledge with related tasks. | ||
| syntactic | 2206 | |
| L16-1050 Odin is an information extraction framework that applies cascades of finite state automata over both surface text and ***** syntactic ***** dependency graphs. | ||
| L06-1022 A noteworthy ***** syntactic ***** property is that some serial verb constructions tend to be used as if they were compound verbs. | ||
| 2020.ccl-1.92 Moreover, we propose an attention-based fine-tuning strategy that better selects relevant semantic and ***** syntactic ***** information from the pre-trained language model and uses those features on downstream text classification tasks. | ||
| 2021.cmcl-1.28 Taken together, these results suggest a role for propositional content and ***** syntactic ***** category information in incremental sentence processing | ||
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-***** syntactic ***** fixedness. | ||
| statistical machine translation | 2132 | |
| C16-1172 While most sentences are more accurate and fluent than translations by ***** statistical machine translation ***** (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. | ||
| 2012.iwslt-papers.9 Although ***** statistical machine translation ***** (SMT) has made great progress since it came into being, the translation of numerical and time expressions is still far from satisfactory. | ||
| 2014.amta-wptp.15 Such post-editing (e.g., PET [Aziz et al., 2012]) can be used practically for translation between European languages, which has a high performance in ***** statistical machine translation *****. | ||
| L14-1176 This paper presents a systematic human evaluation of translations of English support verb constructions produced by a rule-based machine translation (RBMT) system (OpenLogos) and a ***** statistical machine translation ***** (SMT) system (Google Translate) for five languages: French, German, Italian, Portuguese and Spanish. | ||
| W18-3604 Based on (Castro Ferreira et al., 2017), the approach works by first preprocessing an input dependency tree into an ordered linearized string, which is then realized using a *****statistical machine translation***** model. | ||
| natural language processing | 2038 | |
| 2020.acl-demos.5 We present a large improvement over classic search engine baseline on several standard QA datasets and provide the community a collaborative data collection tool to curate the first ***** natural language processing ***** research QA dataset via a community effort. | ||
| 2021.teachingnlp-1.16 Introducing biomedical informatics (BMI) students to ***** natural language processing ***** (NLP) requires balancing technical depth with practical know-how to address application-focused needs. | ||
| 2020.acl-main.264 Advanced machine learning techniques have boosted the performance of ***** natural language processing *****. | ||
| 2021.acl-short.127 For example, one may be an expert in the ***** natural language processing ***** (NLP) domain, but want to determine the best order in which to learn new concepts in an unfamiliar Computer Vision domain (CV). | ||
| 2020.emnlp-main.255 To demystify the “black box” property of deep neural networks for ***** natural language processing ***** (NLP), several methods have been proposed to interpret their predictions by measuring the change in prediction probability after erasing each token of an input. | ||
| MT | 1873 | |
| 2020.eamt-1.40 In this paper we present a new version of the tool called ***** MT *****3, which builds on and extends a joint effort undertaken by the Faculty of Languages of the University of Córdoba and Faculty of Translation and Interpreting of the University of Geneva to develop an open-source web platform to teach ***** MT ***** to translation students. | ||
| 2021.eacl-main.132 Pre-editing is the process of modifying the source text (ST) so that it can be translated by machine translation (***** MT *****) in a better quality. | ||
| 2014.iwslt-papers.3 In the past, this task has been treated separately in ASR or ***** MT ***** contexts and we propose here a joint estimation of word confidence for a spoken language translation (SLT) task involving both ASR and ***** MT *****. | ||
| 2000.amta-studies.1 This paper discusses an informal methodology for evaluating Machine Translation software documentation with reference to a case study, in which a number of currently available *****MT***** packages are evaluated. | ||
| 2010.amta-government.1 We describe a case study that presents a framework for examining whether Machine Translation (MT) output enables translation professionals to translate faster while at the same time producing better quality translations than without *****MT***** output. | ||
| fine-tuning | 1849 | |
| 2021.eacl-main.191 As these models are often *****fine-tuned*****, it becomes increasingly important to understand how the encoded knowledge evolves along the fine-tuning. | ||
| 2021.emnlp-main.75 Prior research *****fine-tunes***** deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient text comparison and retrieval. | ||
| 2020.inlg-1.8 We perform a comprehensive study on the validity of explicit discourse relations in GPT-2's outputs under both organic generation and *****fine-tuned***** scenarios. | ||
| 2020.starsem-1.13 Of the state-of-the-art approaches, *****fine-tuned***** transformer-based (Vaswani et al., 2017) | ||
| 2021.emnlp-main.88 We demonstrate that 1) full-blown conversational pretraining is not required, and that LMs can be quickly transformed into effective conversational encoders with much smaller amounts of unannotated data; 2) pretrained LMs can be *****fine-tuned***** into task-specialised sentence encoders, optimised for the fine-grained semantics of a particular task. | ||
| machine translation system | 1848 | |
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur ***** machine translation system ***** which consists of verbal suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| 2021.americasnlp-1.27 Our neural ***** machine translation system ***** ranked first in Track two (development set not used for training) and third in Track one (training includes development data). | ||
| D19-1100 Although over 100 languages are supported by strong off-the-shelf ***** machine translation system *****s, only a subset of them possess large annotated corpora for named entity recognition. | ||
| 2021.insights-1.10 In this work, we conduct a comprehensive investigation on one of the centerpieces of modern ***** machine translation system *****s: the encoder-decoder attention mechanism. | ||
| L12-1408 This paper describes the development of a statistical *****machine translation system***** between French and English for scientific papers. | ||
| Neural | 1791 | |
| N19-2027 *****Neural***** approaches to Natural Language Generation (NLG) have been promising for goal-oriented dialogue. | ||
| W18-6543 *****Neural***** approaches to data-to-text generation generally handle rare input items using either delexicalisation or a copy mechanism. | ||
| 2020.acl-main.361 *****Neural***** models have achieved great success on machine reading comprehension (MRC), many of which typically consist of two components: an evidence extractor and an answer predictor. | ||
| D18-1492 *****Neural***** networks with tree-based sentence encoders have shown better results on many downstream tasks. | ||
| W19-7601 This tutorial will provide an in-depth look at the experiments, jointly carried out by KantanMT and eBay during 2018, to determine which *****Neural***** Model delivers the best translation performance for eBay Customer Service content. | ||
| BERT | 1736 | |
| 2020.acl-main.207 Recent Transformer language models like ***** BERT ***** learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter-document relatedness, which limits their document-level representation power. | ||
| P19-1356 In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by ***** BERT *****. | ||
| 2020.sustainlp-1.11 With a slight modification, ***** BERT ***** becomes a model with multiple output paths, and each inference sample can exit early from these paths. | ||
| W19-2311 Then, a feature-based measure which is based on a recent highly deep model trained on a large text corpus called ***** BERT ***** is introduced. | ||
| 2020.coling-main.246 At last, we apply the ***** BERT ***** model to further improve the performance on both slot filling and intent detection | ||
| parsing | 1713 | |
| L06-1050 We discuss the methods used to enhance coverage and ***** parsing ***** quality and we present an evaluation on a gold standard, to our knowledge the first one for a deep grammar of German. | ||
| L16-1263 The recognition of multiword expressions (MWEs) in a sentence is important for such linguistic analyses as syntactic and semantic ***** parsing *****, because it is known that combining an MWE into a single token improves accuracy for various NLP tasks, such as dependency ***** parsing ***** and constituency ***** parsing *****. | ||
| 2020.ccl-1.75 This paper presents the first comprehensive study for self-training in cross-lingual dependency ***** parsing *****. | ||
| L12-1268 If tokens are ambiguous, lexical analysis must provide all possible sets of annotation for later (syntactic) disambiguation, be it tagging, or full ***** parsing *****. | ||
| 2020.emnlp-main.118 Meaning representation is an important component of semantic ***** parsing ***** | ||
| annotated | 1669 | |
| N19-1080 As a consequence, existing approaches required both ***** annotated ***** triggers and event types in training data. | ||
| 2021.ranlp-1.181 Our semi-supervised approach with only 20% of ***** annotated ***** data achieves similar performance compared with its supervised learning counterpart. | ||
| 2020.insights-1.17 We aim to collect ***** annotated ***** data for this phenomenon by reducing it to either of two known tasks: Explicit Completion and Natural Language Inference. | ||
| 2021.law-1.11 To ensure that the semantics of ***** annotated ***** time intervals remained unaltered despite our changes to the syntax of the annotation scheme, we applied several different techniques to validate our changes. | ||
| D18-1131 However, many forums do not have ***** annotated ***** data, i.e., questions labeled by experts as duplicates, and thus a promising solution is to use domain adaptation from another forum that has such annotations | ||
| nlp task | 1557 | |
| 2021.iwpt-1.3 The Reading Machine is a parsing framework that takes as input raw text and performs six standard ***** nlp task *****s: tokenization, pos tagging, morphological analysis, lemmatization, dependency parsing and sentence segmentation. | ||
| E17-1035 Neural attention models have achieved great success in different *****NLP tasks*****. | ||
| W16-4208 Two standard clinical *****NLP tasks***** (the i2b2 2010 concept and assertion tasks) are evaluated with commonly used deep learning models (recurrent neural networks and convolutional neural networks) using a set of six corpora ranging from the target i2b2 data to large open-domain datasets. | ||
| 2021.acl-long.154 Transfer learning has yielded state-of-the-art (SoTA) results in many supervised *****NLP tasks*****. | ||
| E17-2042 By developing this dataset, we also introduce a new *****NLP task***** for the automatic classification of Content Types. | ||
| attention mechanism | 1511 | |
| W19-4324 The most recent successes are predominantly due to the use of different variations of ***** attention mechanism *****s, but their cognitive plausibility is questionable. | ||
| 2020.iwdp-1.4 The experimental results show that with the increase of text length, the performance of NMT model using ***** attention mechanism ***** will gradually decline. | ||
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an ***** attention mechanism ***** to capture the interactive semantic relations in-between to enforce our framework to be attribute comprehensive. | ||
| 2021.acl-srw.8 As a lot of these models are based on Transformers, several studies on the ***** attention mechanism *****s used by the models to learn to associate phrases with their visual grounding in the image have been conducted. | ||
| Q18-1005 Specifically, we embed a differentiable non-projective parsing algorithm into a neural model and use ***** attention mechanism *****s to incorporate the structural biases. | ||
| language understanding | 1508 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual natural ***** language understanding ***** tasks. | ||
| C18-1105 In this paper, we study the problem of data augmentation for ***** language understanding ***** in task-oriented dialogue system. | ||
| 2021.naacl-main.342 In the pursuit of natural ***** language understanding *****, there has been a long standing interest in tracking state changes throughout narratives. | ||
| 2021.eacl-main.159 We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken ***** language understanding ***** tasks. | ||
| 2020.acl-main.82 Spelling error correction is an important yet challenging task because a satisfactory solution of it essentially needs human-level ***** language understanding ***** ability. | ||
| translation task | 1484 | |
| 2020.wmt-1.91 Our experiments show that systematic addition of the aforementioned techniques to the baseline yields an excellent performance in the English-to-Basque ***** translation task *****. | ||
| D19-5205 The main challenge is to provide a precise MT output. The multi-modal concept incorporates textual and visual features in the ***** translation task *****. | ||
| 2020.wmt-1.29 We present the results of our systems for the English–Inuktitut language pair for the WMT 2020 ***** translation task *****s. | ||
| 2021.emnlp-main.263 Experimental results show that BiT pushes the SOTA neural machine translation performance across 15 ***** translation task *****s on 8 language pairs (data sizes range from 160K to 38M) significantly higher. | ||
| R19-1140 We applied our model for the Turkish-Finnish language pair on the bilingual word ***** translation task *****. | ||
| language | 1471 | |
| I17-1068 However, the concepts represented by the relation ontology (e.g. ResidesIn, EmployeeOf) are ***** language ***** independent. | ||
| P19-1395 However, current representations in machine learning are ***** language ***** dependent. | ||
| 2020.lrec-1.132 People can extract precise, complex logical meanings from text in documents such as tax forms and game rules, but ***** language ***** processing systems lack adequate training and evaluation resources to do these kinds of tasks reliably. | ||
| L12-1534 Formulaic expressions are commonplace not only in every- day ***** language ***** but also in scientific writing. | ||
| L12-1095 Given that semantic frames are ***** language ***** independent to a fair degree (Boas 2005; Baker 2009), the labels attributed to each of the 76 identified frames (e.g. [Crime], [Regulations]) were used to group together 165 pairs of candidate equivalents | ||
| embedding | 1431 | |
| C18-1140 Thereby, the meta-***** embedding ***** space is enforced to capture complementary information in different source ***** embedding *****s via a coherent common ***** embedding ***** space. | ||
| I17-4016 The systems mainly utilize a multi-layer neural networks, with multiple features input such as word ***** embedding *****, part-of-speech-tagging (POST), word clustering, prefix type, character ***** embedding *****, cross sentiment input, and AdaBoost method for model training. | ||
| 2020.wmt-1.100 Our results show that YiSi-2's correlation with human direct assessment on translation quality is greatly improved by replacing multilingual BERT with XLM-RoBERTa and projecting the source ***** embedding *****s into the target ***** embedding ***** space using a cross-lingual linear projection (CLP) matrix learnt from a small development set. | ||
| 2020.repl4nlp-1.19 Furthermore, through syntactic probing of the principal ***** embedding ***** space, we show that the syntactic information captured by a principal component does not correlate with the amount of variance it explains. | ||
| 2020.coling-main.106 These properties manifest when querying the ***** embedding ***** space for the most similar vectors, and when used at the input layer of deep neural networks trained to solve downstream NLP problems | ||
| human evaluation | 1430 | |
| 2021.acl-demo.41 To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under ***** human evaluation *****. | ||
| P18-1153 Automatic and ***** human evaluation *****s show that our models are able to generate homographic puns of good readability and quality. | ||
| N18-1160 Automatic and ***** human evaluation ***** show that the encoder is able to reliably assign good labels for the movie's attributes, and the overviews provide descriptions of the movie's content which are informative and faithful. | ||
| 2021.humeval-1.8 Only two papers presented a ***** human evaluation ***** that was in line with what was modeled in the method. | ||
| 2021.acl-short.112 We perform both ***** human evaluation ***** and automatic evaluation of dialogs generated by our method. | ||
| neural | 1410 | |
| 2020.coling-main.338 In recent years, parsing performance is dramatically improved on in-domain texts thanks to the rapid progress of deep ***** neural ***** network models. | ||
| D19-5708 The NER model enumerates all possible spans as potential entity mentions and classify them into entity types or no entity with deep ***** neural ***** networks. | ||
| 2020.coling-main.468 Moreover, our model integrates a joint ***** neural ***** topic model (NTM) to discover latent topics, which can provide document-level features for sentence selection. | ||
| D19-1570 Many Data Augmentation (DA) methods have been proposed for ***** neural ***** machine translation. | ||
| 2021.gem-1.16 We aim to bridge this gap by applying and evaluating advances in decoding methods for ***** neural ***** response generation to ***** neural ***** narrative generation. | ||
| NMT | 1375 | |
| 2020.lrec-1.446 ***** NMT ***** systems have already outperformed traditional phrase-based statistical machine translation (PBSMT) systems for some pairs of languages. | ||
| K19-1031 One of the key differences of these ***** NMT ***** models is how the model handles position information which is essential to process sequential data. | ||
| 2020.wmt-1.65 For humans, the solution to the rare-word problem has long been dictionaries, but dictionaries cannot be straightforwardly incorporated into ***** NMT *****. | ||
| Q16-1027 Neural machine translation (***** NMT *****) aims at solving machine translation (MT) problems using neural networks and has exhibited promising results in recent years. | ||
| 2020.wat-1.18 The paper describes the development process of the University of Tokyo's ***** NMT ***** systems that were submitted to the WAT 2020 Document-level Business Scene Dialogue Translation sub-task. | ||
| annotations | 1298 | |
| 2021.emnlp-main.267 Training QE models require massive parallel data with hand-crafted quality ***** annotations *****, which are time-consuming and labor-intensive to obtain. | ||
| P18-2106 Notwithstanding the absence of parallel data, and the dissimilarity in ***** annotations ***** between languages, our approach results in improvement in parsing performance on several languages over a monolingual baseline. | ||
| W17-4107 Most NLP resources that offer ***** annotations ***** at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information. | ||
| 2020.emnlp-main.48 Each rule-of-thumb is further broken down with 12 different dimensions of people's judgments, including social judgments of good and bad, moral foundations, expected cultural pressure, and assumed legality, which together amount to over 4.5 million ***** annotations ***** of categorical labels and free-text descriptions. | ||
| W19-3512 This has been later emphasized through the comprehensive evaluation of the ***** annotations ***** as the annotation agreement metrics of Cohen's Kappa (k) and Krippendorff's alpha (α) indicated the consistency of the ***** annotations ***** | ||
| multilingual | 1293 | |
| L08-1254 For ***** multilingual ***** systems, accurate translation of named entities and their descriptors is critical. | ||
| 2020.lrec-1.793 As a common approach, ***** multilingual ***** training has been applied to achieve more context coverage and has shown better performance over the monolingual training (Heigold et al., 2013). | ||
| 2020.cl-1.3 Probabilistic topic modeling is a common first step in crosslingual tasks to enable knowledge transfer and extract ***** multilingual ***** features. | ||
| W16-3716 We propose a technique towards solving this problem with the help of ***** multilingual ***** word clusters obtained from ***** multilingual ***** word embeddings. | ||
| 2020.semeval-1.31 As important branches of lexical entailment, predicting ***** multilingual ***** and cross-lingual lexical entailment (LE) are two subtasks of SemEval2020 Task2 | ||
| shared task | 1291 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 ***** shared task ***** on multilingual named entity recognition. | ||
| 2020.fnp-1.1 FNS summarisation ***** shared task ***** is the first to target financial annual reports. | ||
| 2020.wnut-1.39 This paper presents our teamwork on WNUT 2020 ***** shared task *****-1: wet lab entity extract, that we conducted studies in several models, including a BiLSTM CRF model and a Bert case model which can be used to complete wet lab entity extraction. | ||
| 2020.wmt-1.15 This paper describes Tilde's submission to the WMT2020 ***** shared task ***** on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. | ||
| K18-2016 We introduce a complete neural pipeline system that takes raw text as input, and performs all tasks required by the ***** shared task *****, ranging from tokenization and sentence segmentation, to POS tagging and dependency parsing. | ||
| translation quality | 1284 | |
| W17-6005 The extensive experiments demonstrate that our proposed framework significantly improves the ***** translation quality ***** of terms and sentences. | ||
| 2020.wat-1.2 Our results confirm that applying the proposed optimization method on English-Persian translation can exceed ***** translation quality ***** compared to the English-Persian Statistical Machine Translation (SMT) paradigm. | ||
| 1997.mtsummit-papers.22 In this paper2 we present the criteria and the evaluation procedure for evaluating the ***** translation quality ***** of the VERBMOBIL prototype. | ||
| 2021.wat-1.21 We find that SP is the overall best choice for segmentation, and that larger dictionary sizes lead to higher ***** translation quality *****. | ||
| 2009.iwslt-evaluation.14 The evaluation results show that both strategies yield sizeable and consistent improvements in ***** translation quality *****. | ||
| language processing | 1258 | |
| 2020.winlp-1.17 In the following, we present a system for assisted typing in LS whose accuracy and speed is largely due to the deployment of real time natural-***** language processing ***** enabling efficient prediction and context-sensitive grammar support. | ||
| 2020.coling-main.187 Chinese word segmentation (CWS) and part-of-speech (POS) tagging are two fundamental tasks for Chinese ***** language processing *****. | ||
| 2020.acl-demos.5 We present a large improvement over classic search engine baseline on several standard QA datasets and provide the community a collaborative data collection tool to curate the first natural ***** language processing ***** research QA dataset via a community effort. | ||
| 2021.teachingnlp-1.16 Introducing biomedical informatics (BMI) students to natural ***** language processing ***** (NLP) requires balancing technical depth with practical know-how to address application-focused needs. | ||
| 2020.acl-main.264 Advanced machine learning techniques have boosted the performance of natural ***** language processing *****. | ||
| baselines | 1220 | |
| P19-1195 Our results show that the proposed model outperforms competitive ***** baselines ***** in automatic and human evaluation. | ||
| 2020.coling-main.504 Besides the detailed dataset description, we show the performance of several typical extractive summarization methods on TWEETSUM to establish ***** baselines *****. | ||
| 2021.acl-long.363 Experiment results on two datasets show that CoRI can significantly outperform the ***** baselines *****, improving AUC from .677 to .748 and from .716 to .780, respectively. | ||
| N19-1287 Experimental results show that our neural network model can outperform various ***** baselines ***** on the constructed corpus. | ||
| 2020.findings-emnlp.360 As a set of ***** baselines ***** for further studies, we evaluate the performance of existing cross-lingual abstractive summarization methods on our dataset | ||
| translation | 1170 | |
| 2020.wmt-1.48 We explored the use of different linguistic features like POS and Morph along with back ***** translation ***** for Hindi-Marathi and Marathi-Hindi machine ***** translation *****. | ||
| D18-1330 Our experiments on standard benchmarks show that our approach outperforms the state of the art on word ***** translation *****, with the biggest improvements observed for distant language pairs such as English-Chinese. | ||
| 2021.ranlp-1.47 In a post-editing scenario, user corrections of machine ***** translation ***** output are thus continuously incorporated into ***** translation ***** models, reducing or eliminating repetitive error editing and increasing the usefulness of automated ***** translation *****. | ||
| 2020.emnlp-main.178 To achieve high accuracy, the model usually needs to wait for more streaming text before ***** translation *****, which results in increased latency. | ||
| W18-6318 We also propose alignment pruning to speed up decoding in alignment-based neural machine ***** translation ***** (ANMT), which speeds up ***** translation ***** by a factor of 1.8 without loss in ***** translation ***** performance | ||
| BLEU | 1133 | |
| N19-1235 The performance of the model can be improved further using a high-precision, broad coverage grammar-based parser to generate a large silver training corpus, achieving a final ***** BLEU ***** score of 77.17 on the full test set, and 83.37 on the subset of test data most closely matching the silver data domain. | ||
| W18-6306 We show that NMT models taking advantage of context oracle signals can achieve considerable gains in ***** BLEU *****, of up to 7.02 ***** BLEU ***** for coreference and 1.89 ***** BLEU ***** for coherence on subtitles translation. | ||
| 2021.blackboxnlp-1.14 There is a positive correlation between the performance of the length prediction and the ***** BLEU ***** score. | ||
| 2021.mtsummit-research.3 In this paper, we show that initialising the embedding layer of UNMT models with cross-lingual embeddings leads to significant ***** BLEU ***** score improvements over existing UNMT models where the embedding layer weights are randomly initialized. | ||
| 2019.iwslt-1.20 While absolute and relative positional encoding perform equally strong overall, we show that relative positional encoding is vastly superior (4.4% to 11.9% ***** BLEU *****) when translating a sentence that is longer than any observed training sentence | ||
| Machine | 1133 | |
| 2020.sltu-1.40 *****Machine***** Translation is the inevitable technology to reduce communication barriers in today's world. | ||
| W19-4613 Segmentation serves as an integral part in many NLP applications including *****Machine***** Translation, Parsing, and Information Retrieval. | ||
| W19-5434 We describe the National Research Council Canada team's submissions to the parallel corpus filtering task at the Fourth Conference on *****Machine***** Translation. | ||
| 2020.emnlp-main.677 In recent years, there has been an increasing interest in the application of Artificial Intelligence and especially *****Machine***** Learning to the field of Sustainable Development (SD). | ||
| 2021.americasnlp-1.17 Low-resource polysynthetic languages pose many challenges in NLP tasks, such as morphological analysis and *****Machine***** Translation, due to available resources and tools, and the morphologically complex languages. | ||
| summarization | 1074 | |
| L14-1077 Although many ***** summarization ***** algorithms exist, there are few tools or infrastructures providing capabilities for developing ***** summarization ***** applications. | ||
| C18-1077 We evaluate the proposed metric in replicating the human assigned scores for ***** summarization ***** systems and summaries on data from query-focused and update ***** summarization ***** tasks in TAC 2008 and 2009. | ||
| D17-1175 Although sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text ***** summarization *****, simply applying this approach to transition-based dependency parsing cannot yield a comparable performance gain as in other state-of-the-art methods, such as stack-LSTM and head selection. | ||
| 2021.eacl-main.154 We compare our approach to standard and pre-trained language-model-based summarizers and report state-of-the-art results for long document ***** summarization ***** and comparable results for smaller document ***** summarization *****. | ||
| N18-1157 Our method is applicable to any monotone submodular objective function, including many functions well-suited for document ***** summarization ***** | ||
| language modeling | 1016 | |
| N18-1086 We find that our model formulation of latent dependencies with exact marginalization do not lead to better intrinsic ***** language modeling ***** performance than vanilla RNNs, and that parsing accuracy is not correlated with ***** language modeling ***** perplexity in stack-based models. | ||
| 2020.acl-main.327 Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation ***** language modeling ***** (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences. | ||
| W18-1202 We then investigate utilizing the mined subwords within the FastText embedding model and compare performance of the learned representations in a downstream ***** language modeling ***** task. | ||
| L10-1319 In this paper, we extend the method by (1) using neighboring context to index the target passage, and (2) applying a ***** language modeling ***** approach for document retrieval. | ||
| 2021.acl-long.201 Specifically, with a two-stream multi-modal Transformer encoder, LayoutLMv2 uses not only the existing masked visual-***** language modeling ***** task but also the new text-image alignment and text-image matching tasks, which make it better capture the cross-modality interaction in the pre-training stage. | ||
| downstream task | 1009 | |
| 2020.gebnlp-1.6 Furthermore, we analyze the effect of the debiasing techniques on ***** downstream task *****s which show a negligible impact on traditional embeddings and a 2% decrease in performance in contextualized embeddings. | ||
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, ***** downstream task *****s through normalization compared to models operating on raw, unprocessed, social media text. | ||
| 2021.naacl-main.293 We aim to denoise bias information while training on the ***** downstream task *****, rather than completely remove social bias and pursue static unbiased representations. | ||
| 2021.acl-long.243 We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual ***** downstream task *****s. | ||
| 2021.emnlp-main.749 Experiments on various ***** downstream task *****s in GLUE benchmark show that Child-Tuning consistently outperforms the vanilla fine-tuning by 1.5 8.6 average score among four different pretrained models, and surpasses the prior fine-tuning techniques by 0.6 1.3 points. | ||
| inference | 996 | |
| P17-1152 Based on this, we further show that by explicitly considering recursive architectures in both local ***** inference ***** modeling and ***** inference ***** composition, we achieve additional improvement. | ||
| 2021.acl-long.508 In this paper, we extend PoWER-BERT (Goyal et al., 2020) and propose Length-Adaptive Transformer that can be used for various ***** inference ***** scenarios after one-shot training. | ||
| 2021.emnlp-main.72 We also find, however, that performance is heavily influenced by word frequency, with experiments showing that both the absolute frequency of a verb form, as well as the frequency relative to the alternate inflection, are causally implicated in the predictions BERT makes at ***** inference ***** time. | ||
| 2020.coling-main.360 We noticed that the gold emotion labels of the context utterances can provide explicit and accurate emotion interaction, but it is impossible to input gold labels at ***** inference ***** time. | ||
| 2020.emnlp-main.39 State-of-the-art lifelong language learning methods store past examples in episodic memory and replay them at both training and ***** inference ***** time | ||
| Translation | 991 | |
| 2001.mtsummit-ebmt.4 *****Translation***** systems that automatically extract transfer mappings (rules or examples) from bilingual corpora have been hampered by the difficulty of achieving accurate alignment and acquiring high quality mappings. | ||
| 2020.latechclfl-1.20 TL-Explorer is a digital humanities tool for mapping and analyzing translated literature, encompassing the World Map and the *****Translation***** Dashboard. | ||
| 1998.amta-systems.3 *****Translation***** tools can be integrated with the translation process with the goal and result of increasing consistency, reusing previous translations, and decreasing the amount of time needed to put a product on the market. | ||
| 2020.eamt-1.19 Matching and retrieving previously translated segments from the *****Translation***** Memory is a key functionality in Translation Memories systems. | ||
| W18-3814 *****Translation***** relations, which distinguish literal translation from other translation techniques, constitute an important subject of study for human translators (Chuquet and Paillard, 1989). | ||
| parser | 988 | |
| 2021.adaptnlp-1.25 In this paper, we present a first effort toward building a weakly-supervised semantic ***** parser ***** to transform brief, multi-intent natural utterances into logical forms. | ||
| 2021.iwpt-1.2 In this paper, we present the first statistical ***** parser ***** for Lambek categorial grammar (LCG), a grammatical formalism for which the graphical proof method known as *proof nets* is applicable. | ||
| 1993.iwpt-1.8 Efficiency is also a concern, as tutoring applications typically run on personal computers, with the ***** parser ***** sharing memory with other components of the system. | ||
| 2020.coling-main.345 Current methods of cross-lingual ***** parser ***** transfer focus on predicting the best ***** parser ***** for a low-resource target language globally, that is, “at treebank level”. | ||
| C16-2019 Furthermore, by default, the ***** parser ***** handles collocations and other MWEs, as well as anaphora resolution (limited to 3rd person personal pronouns) | ||
| morphological | 988 | |
| N19-1155 Error analysis indicates that joint ***** morphological ***** tagging and lemmatization is especially helpful in low-resource lemmatization and languages that display a larger degree of ***** morphological ***** complexity. | ||
| E17-2018 We capitalize on this idea by training a tagger for English that uses syntactic features obtained by automatic parsing to recover complex ***** morphological ***** tags projected from Czech. | ||
| 2020.cl-2.4 To address this, we introduce 15 type-level probing tasks such as case marking, possession, word length, ***** morphological ***** tag count, and pseudoword identification for 24 languages. | ||
| 2021.nodalida-main.25 Lemmatization is often used with ***** morphological *****ly rich languages to address issues caused by ***** morphological ***** complexity, performed by grammar-based lemmatizers. | ||
| L10-1564 We show that joint functional and ***** morphological ***** information percolation improves both the recovery of trees as well as dependency results in the form of LFG f-structures | ||
| Furthermore | 979 | |
| 2020.crac-1.1 ***** Furthermore *****, our approach can be adopted by the majority of Transformer-based language models. | ||
| C18-1229 ***** Furthermore *****, our WSD system outperforms the state-of-the-art WSD systems in the Semeval-13 dataset. | ||
| 2020.acl-main.604 ***** Furthermore *****, an ensemble RikiNet obtains 76.1 F1 and 61.3 F1 on long-answer and short-answer tasks, achieving the best performance on the official NQ leaderboard. | ||
| 2021.latechclfl-1.2 ***** Furthermore *****, we present a case study concerning the annotation of olfactory situations in English historical travel writings describing trips to Italy. | ||
| 2000.iwpt-1.9 ***** Furthermore *****, we perform a preliminary investigation in smoothing these grammars by means of an external linguistic resource, namely, the tree families of an XTAG grammar, a hand built grammar of English | ||
| Semantic | 978 | |
| P18-2008 *****Semantic***** parsing requires training data that is expensive and slow to collect. | ||
| S19-2003 This paper presents Unsupervised Lexical Frame Induction, Task 2 of the International Workshop on *****Semantic***** Evaluation in 2019. | ||
| P18-1018 *****Semantic***** relations are often signaled with prepositional or possessive marking, but extreme polysemy bedevils their analysis and automatic interpretation. | ||
| Q14-1042 *****Semantic***** parsing is the task of translating natural language utterances into a machine-interpretable meaning representation. | ||
| P19-1473 *****Semantic***** parsing over multiple knowledge bases enables a parser to exploit structural similarities of programs across the multiple domains. | ||
| social media | 978 | |
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, unprocessed, ***** social media ***** text. | ||
| N18-4018 While some work has been done on code-mixed ***** social media ***** text and in emotion prediction separately, our work is the first attempt which aims at identifying the emotion associated with Hindi-English code-mixed ***** social media ***** text. | ||
| 2020.semeval-1.99 Information on ***** social media ***** comprises of various modalities such as textual, visual and audio. | ||
| 2020.wnut-1.52 Increasing usage of ***** social media ***** presents new non-traditional avenues for monitoring disease outbreaks, virus transmissions and disease progressions through user posts describing test results or disease symptoms. | ||
| P18-1185 In this paper, we explore the task of name tagging in multimodal ***** social media ***** posts. | ||
| Text | 977 | |
| E17-2070 Measuring topic quality is essential for scoring the learned topics and their subsequent use in Information Retrieval and ***** Text ***** classification. | ||
| 2020.lrec-1.185 With the tremendous success of deep learning models on computer vision tasks, there are various emerging works on the Natural Language Processing (NLP) task of *****Text***** Classification using parametric models. | ||
| 2021.emnlp-main.642 *****Text***** classification is a fundamental task with broad applications in natural language processing. | ||
| K19-1094 *****Text***** classification plays a crucial role for understanding natural language in a wide range of applications. | ||
| 2020.acl-main.243 *****Text***** generation often requires high-precision output that obeys task-specific rules. | ||
| word representation | 968 | |
| S19-2048 Our model extends the Recurrent Convolutional Neural Network (RCNN) by using external fine-tuned ***** word representation *****s and DeepMoji sentence representations. | ||
| N18-1082 In addition, recent studies aiming at solving prepositional attachment and preposition selection problems depend heavily on external linguistic resources and use dataset-specific ***** word representation *****s. | ||
| 2020.coling-main.338 To deal with this problem, we propose to improve the contextualized ***** word representation *****s via adversarial learning and fine-tuning BERT processes. | ||
| 2020.acl-demos.41 We fine-tune the contextualized ***** word representation *****s of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions. | ||
| 2020.lrec-1.592 We propose a new method that leverages contextual embeddings for the task of diachronic semantic shift detection by generating time specific ***** word representation *****s from BERT embeddings. | ||
| translation model | 962 | |
| 2008.amta-papers.19 We also build a cascaded ***** translation model ***** that dynamically shifts translation units from phrase level to word and morpheme phrase levels. | ||
| W17-3205 Classifier probabilities are used to weight sentences according to their domain similarity when updating the parameters of the neural ***** translation model *****. | ||
| L08-1579 Recently the LATL has undertaken the development of a multilingual translation system based on a symbolic parsing technology and on a transfer-based ***** translation model *****. | ||
| 2008.iwslt-evaluation.18 For the pivot task, we combined the translations generated by a pivot based statistical ***** translation model ***** and a statistical transfer ***** translation model ***** (firstly, translating from Chinese to English, and then from English to Spanish). | ||
| P17-2012 Our approach encourages the neural machine ***** translation model ***** to incorporate linguistic prior during training, and lets it translate on its own afterward. | ||
| encoder-decoder | 947 | |
| 2021.nlp4prog-1.2 We also introduce baselines based on transformer *****encoder-decoders*****, and study the effects of including syntactic information and context. | ||
| 2021.dialdoc-1.3 Most existing neural network based task-oriented dialog systems follow *****encoder-decoder***** paradigm, where the decoder purely depends on the source texts to generate a sequence of words, usually suffering from instability and poor readability. | ||
| P17-1106 In this work, we propose to use layer-wise relevance propagation (LRP) to compute the contribution of each contextual word to arbitrary hidden states in the attention-based *****encoder-decoder***** framework. | ||
| 2021.emnlp-main.195 Specifically, an attentional *****encoder-decoder***** with a retriever framework is utilized. | ||
| 2021.emnlp-main.2 Using this method, SixT significantly outperforms mBART, a pretrained multilingual *****encoder-decoder***** model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. | ||
| textual | 926 | |
| 2020.lrec-1.765 Swearing plays an ubiquitous role in everyday conversations among humans, both in oral and ***** textual ***** communication, and occurs frequently in social media texts, typically featured by informal language and spontaneous writing. | ||
| 2021.socialnlp-1.2 In this work we collect a dataset of 1.2M tweets related to this event, with particular interest to the ***** textual ***** content shared, and we design a hashtag-based semi-automatic approach to label them as Supporters or Against the referendum. | ||
| 2020.coling-main.207 We thus propose to additionally leverage references, which are selected from a large pool of texts labeled with one of the attributes, as ***** textual ***** information that enriches inductive biases of given attributes. | ||
| 2020.fnp-1.32 We present a novel approach to unsupervised information extraction by identifying and extracting relevant concept-value pairs from ***** textual ***** data. | ||
| 2021.lantern-1.2 Here, we leverage multimodal modeling for purely ***** textual ***** tasks (language modeling and classification) with the expectation that the multimodal pretraining provides a grounding that can improve text processing accuracy | ||
| bleu score | 924 | |
| 2021.dialdoc-1.14 Our best model for knowledge identification outperformed the baseline by 10.5+ f1-score on the test-dev split, and our best model for response generation outperformed the baseline by 11+ Sacre***** bleu score ***** on the test-dev split. | ||
| W19-5204 We apply our model to the output of existing NMT systems, and demonstrate that, while the human-judged quality improves in all cases, *****BLEU scores***** drop with forward-translated test sets. | ||
| W18-3604 Our approach shows promising results, with *****BLEU scores***** above 50 for 5 different languages (English, French, Italian, Portuguese and Spanish) and above 35 for the Dutch language. | ||
| 2021.nlp4prog-1.2 Overall, our models achieve a *****BLEU score***** of 38.2, while only generating unparsable code in 1.92% of cases. | ||
| 2011.iwslt-evaluation.22 When we considered the performance with suitable wide beams that ensured the ASR accuracy had converged we observed the language model weight had little influence on the SMT *****BLEU scores*****. | ||
| named entity recognition | 901 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual ***** named entity recognition *****. | ||
| N18-1131 Most ***** named entity recognition ***** (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. | ||
| D19-1100 Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for ***** named entity recognition *****. | ||
| Q14-1037 We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), ***** named entity recognition ***** (coarse semantic typing), and entity linking (matching to Wikipedia entities). | ||
| L08-1054 We discuss a *****named entity recognition***** system for Arabic, and show how we incorporated the information provided by MADA, a full morphological tagger which uses a morphological analyzer. | ||
| segmentation | 895 | |
| 2020.lrec-1.262 More specifically, given that lyrics encode an important part of the semantics of a song, we focus here on the description of the methods we proposed to extract relevant information from the lyrics, as their structure ***** segmentation *****, their topic, the explicitness of the lyrics content, the salient passages of a song and the emotions conveyed. | ||
| P19-1314 We benchmark neural word-based models which rely on word ***** segmentation ***** against neural char-based models which do not involve word ***** segmentation ***** in four end-to-end NLP benchmark tasks: language modeling, machine translation, sentence matching/paraphrase and text classification. | ||
| L10-1026 Obtained results also show up to 3F points improvement is achieved when the appropriate ***** segmentation ***** style is used. | ||
| 2020.lrec-1.861 Finally, we propose a double-annotation mode, for which Seshat computes automatically an associated inter-annotator agreement with the gamma measure taking into account the categorisation and ***** segmentation ***** discrepancies. | ||
| 2021.naacl-main.116 For every step in ***** segmentation *****, it recognizes the leftmost segment of the remaining sequence | ||
| linguistic | 894 | |
| 2020.lrec-1.408 Also, the optional language sub-tags compliant with BCP 47 do not offer a possibility fine-grained enough to represent ***** linguistic ***** variation. | ||
| D18-1467 Dissemination across many ***** linguistic ***** contexts is a predictor of success: words that appear in more ***** linguistic ***** contexts grow faster and survive longer. | ||
| 1994.bcs-1.16 This paper presents: - a contrastive view to knowledge based techniques in MAT, - mechanisms for mapping the “ordinary” ***** linguistic ***** lexicon and the terminological lexicon of two languages onto one knowledge base, - methods to access the domain knowledge in a flexible way without allowing completely free ***** linguistic ***** dialogues, - techniques to present the result of queries to the translator in restricted natural language, and - use of domain knowledge to solve specific translation difficulties. | ||
| 2000.amta-papers.6 In this paper we prototype a machine translation system from English to American Sign Language (ASL), taking into account not only ***** linguistic ***** but also visual and spatial information associated with ASL signs. | ||
| 2021.acl-long.52 While there is an abundance of advice to podcast creators on how to speak in ways that engage their listeners, there has been little data-driven analysis of podcasts that relates ***** linguistic ***** style with engagement | ||
| classifier | 883 | |
| P19-1284 We approach this problem by jointly training two neural network models: a latent model that selects a rationale (i.e. a short and informative part of the input text), and a ***** classifier ***** that learns from the words in the rationale alone. | ||
| D17-1053 Furthermore, the universal single ***** classifier ***** is compared with a few cross-language sentiment ***** classifier *****s relying on direct parallel data between the source and target languages, and the results show that the performance of our universal sentiment ***** classifier ***** is very promising compared to that of different cross-language ***** classifier *****s in multiple target languages. | ||
| L08-1492 We have previously reported on the use of a simple dialogue act ***** classifier ***** based on purely intra-utterance features - principally involving word n-gram cue phrases automatically generated from a training corpus. | ||
| D18-1383 With the difference between source and target minimized, we then exploit additional information from the target domain by consolidating the idea of semi-supervised learning, for which, we jointly employ two regularizations — entropy minimization and self-ensemble bootstrapping — to incorporate the unlabeled target data for ***** classifier ***** refinement. | ||
| D17-1126 The main advantage of our method is its simplicity, as it gets rid of the ***** classifier ***** or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work | ||
| Moreover | 882 | |
| R19-1075 ***** Moreover *****, this hierarchy should be compatible with forming phrases and sentences. | ||
| 2020.conll-1.35 ***** Moreover *****, historical variations can be present in aged documents, which can impact the performance of the NER process. | ||
| D19-1107 ***** Moreover *****, we show that often models do not generalize well to examples from annotators that did not contribute to the training set. | ||
| 2021.acl-long.237 ***** Moreover *****, when attacked by TextFooler with synonym replacement, SEQA demonstrates much less performance drops than baselines, thereby indicating stronger robustness. | ||
| 2021.acl-long.227 ***** Moreover *****, it outperformed competitive pretraining models by a large margin on most language understanding tasks, such as text classification and question answering | ||
| natural language inference | 877 | |
| 2020.acl-main.177 As a case study, we perform a series of experiments in the setting of ***** natural language inference ***** (NLI). | ||
| 2021.starsem-1.27 We show that examples that depend critically on a rarer word are more challenging for ***** natural language inference ***** models. | ||
| 2020.acl-main.645 In this paper, we investigate the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and ***** natural language inference ***** tasks. | ||
| D18-1007 We present a large-scale collection of diverse ***** natural language inference ***** (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. | ||
| S19-1027 Large crowdsourced datasets are widely used for training and evaluating neural models on ***** natural language inference ***** (NLI). | ||
| convolutional neural network | 841 | |
| 2018.gwc-1.27 In addition, we use ***** convolutional neural network ***** and piecewise max pooling ***** convolutional neural network ***** relation extraction models that efficiently grasp key features in sentences. | ||
| C16-1289 Upon the generated source and target phrase structures, we stack a ***** convolutional neural network ***** to integrate vector representations of linguistic units on the structures into bilingual phrase embeddings. | ||
| C18-1156 When used along with content-based feature extractors such as ***** convolutional neural network *****s, we see a significant boost in the classification performance on a large Reddit corpus. | ||
| D17-1191 In this paper, we design a novel ***** convolutional neural network ***** (CNN) with residual learning, and investigate its impacts on the task of distantly supervised noisy relation extraction. | ||
| W18-4930 For TRAPACC, the classifier consists of a data-independent dimension reduction and a ***** convolutional neural network ***** (CNN) for learning and labelling transitions. | ||
| reinforcement learning | 840 | |
| 2020.inlg-1.7 Recently ***** reinforcement learning ***** (RL) techniques have been adopted to train deep end-to-end systems to directly optimize sequence-level objectives. | ||
| 2021.acl-long.206 We follow strategies in ***** reinforcement learning ***** to optimize the parameters of the controller and compute the reward based on the accuracy of a task model, which is fed with the sampled concatenation as input and trained on a task dataset. | ||
| 2020.ccl-1.94 From the perspective of representation learning, ***** reinforcement learning ***** was combined with traditional deep learning methods. | ||
| N18-4010 The supervised training agent can further be improved via interacting with users and learning online from user demonstration and feedback with imitation and ***** reinforcement learning *****. | ||
| D17-1153 Machine translation is a natural candidate problem for ***** reinforcement learning ***** from human feedback: users provide quick, dirty ratings on candidate translations to guide a system to improve. | ||
| monolingual | 820 | |
| 2021.semeval-1.102 Our approach also achieves satisfactory results in other ***** monolingual ***** and cross-lingual language pairs as well. | ||
| 2016.iwslt-1.6 In addition, we point out a novel way to make use of ***** monolingual ***** data with Neural Machine Translation using the same approach with a 3.15-BLEU-score gain in IWSLT'16 English→German translation task. | ||
| 2021.wnut-1.47 In this low-resource scenario with data displaying a high level of variability, we compare the downstream performance of a character-based language model on part-of-speech tagging and dependency parsing to that of ***** monolingual ***** and multilingual models. | ||
| L12-1118 Progressing from ***** monolingual ***** to cross-lingual Entity Linking technologies, the 2011 cross-lingual NEL evaluation targeted multilingual capabilities. | ||
| 2020.wmt-1.51 Finally, we make use of additional ***** monolingual ***** data by creating synthetic parallel data through back-translation. | ||
| Sentiment | 815 | |
| D19-5227 ***** Sentiment ***** ambiguous lexicons refer to words where their polarity depends strongly on context. | ||
| 2020.semeval-1.183 This paper reports the zyy1510 team's work in the International Workshop on Semantic Evaluation (SemEval-2020) shared task on *****Sentiment***** analysis for Code-Mixed (Hindi-English, English-Spanish) Social Media Text. | ||
| 2020.semeval-1.172 This paper describes the participation of LIMSI_UPV team in SemEval-2020 Task 9: *****Sentiment***** Analysis for Code-Mixed Social Media Text. | ||
| 2020.semeval-1.170 *****Sentiment***** Analysis is a well-studied field of Natural Language Processing. | ||
| 2020.eamt-1.9 *****Sentiment***** analysis is a widely researched NLP problem with state-of-the-art solutions capable of attaining human-like accuracies for various languages. | ||
| coreference | 798 | |
| L16-1325 Despite the popularity of ***** coreference ***** resolution as a research topic, the overwhelming majority of the work in this area focused so far on single antecedence ***** coreference ***** only. | ||
| 2021.naacl-main.198 To complement these resources and enhance future research, we present Wikipedia Event Coreference (WEC), an efficient methodology for gathering a large-scale dataset for cross-document event ***** coreference ***** from Wikipedia, where ***** coreference ***** links are not restricted within predefined topics. | ||
| N19-1085 In this work, we propose a transfer learning framework for event ***** coreference ***** resolution that utilizes a large amount of unlabeled data to learn argument compatibility of event mentions. | ||
| L14-1646 The ECB corpus is one of the data sets used for evaluation of the task of event ***** coreference ***** resolution. | ||
| P18-1103 Human generates responses relying on semantic and functional dependencies, including ***** coreference ***** relation, among dialogue elements and their context. | ||
| self-attention | 795 | |
| W19-3508 We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of abusive language, which are sometimes overlapping (racism, sexism, hate speech, offensive language, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. *****self-attention*****) is better for abusive language detection using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task. | ||
| 2021.naacl-main.12 We conclude by discussing how the RBF kernel resembles BERT's *****self-attention***** layers and speculate that this resemblance leads to the RBF-based probe's stronger performance. | ||
| W18-6413 TenTrans is an improved NMT system based on Transformer *****self-attention***** mechanism. | ||
| 2021.naacl-main.137 We design a lattice position attention mechanism to exploit the lattice structures in *****self-attention***** layers. | ||
| 2021.emnlp-main.828 This transformation is carefully crafted so that the final output of *****self-attention***** is not affected by absolute positions of tokens. | ||
| WordNet | 785 | |
| L10-1481 Inspired by work on classification of word senses by polarity (e.g., Senti***** WordNet *****), and taking ***** WordNet ***** as a starting point, we build Q-***** WordNet *****. | ||
| C18-1023 In this paper, we study how we can improve a deep learning approach to textual entailment by incorporating lexical entailment relations from ***** WordNet *****. | ||
| 2021.gwc-1.6 We exploit the knowledge encoded within different off-the-shelf pre-trained Language Models and task formulations to infer the domain label of a particular ***** WordNet ***** definition. | ||
| L08-1341 New semantic and lexical relations have been included to maximize compatibility with new versions of the original Princeton ***** WordNet ***** and to include the whole range of relations from Euro***** WordNet *****. | ||
| 2020.lrec-1.368 We present the parallel creation of a *****WordNet***** resource for Swedish and Bulgarian which is tightly aligned with the Princeton WordNet. | ||
| multimodal | 772 | |
| W19-8625 In this paper, we focus on the generation of hypotheses from premises in a ***** multimodal ***** setting, to generate a sentence (hypothesis) given an image and/or its description (premise) as the input. | ||
| L14-1315 This paper introduces a ***** multimodal ***** discussion corpus for the study into head movement and turn-taking patterns in debates. | ||
| 2021.eacl-main.275 Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide “how” to combine a given set of ***** multimodal ***** features more effectively. | ||
| L12-1600 Additionally, we compare the gaze behavior of the human subjects to evaluate saliency regions in the ***** multimodal ***** and visual only conditions. | ||
| D19-6402 In this paper, we propose a new approach to learn ***** multimodal ***** multilingual embeddings for matching images and their relevant captions in two languages | ||
| hate speech | 749 | |
| 2021.woah-1.10 In ***** hate speech ***** detection, however, equalizing model predictions may ignore important differences among targeted social groups, as ***** hate speech ***** can contain stereotypical language specific to each SGT. | ||
| 2020.restup-1.1 We aim at identifying possible fake news spreaders as a first step towards preventing fake news from being propagated among online users (fake news aim to polarize the public opinion and may contain ***** hate speech *****). | ||
| 2020.osact-1.17 For that purpose, we develop an effective method for automatic data augmentation and show the utility of training both offensive and ***** hate speech ***** models off (i.e., by fine-tuning) previously trained affective models (i.e., sentiment and emotion). | ||
| 2020.lrec-1.626 This paper presents a novel scheme for the annotation of ***** hate speech ***** in corpora of Web 2.0 commentary. | ||
| 2021.acl-long.556 In other words, getting more affective features from other affective resources will significantly affect the performance of ***** hate speech ***** detection. | ||
| neural machine | 746 | |
| 2018.iwslt-1.8 We propose a method to transfer knowledge across ***** neural machine ***** translation (NMT) models by means of a shared dynamic vocabulary. | ||
| P19-1555 In this paper, we present a novel data augmentation method for ***** neural machine ***** translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. | ||
| 2021.americasnlp-1.27 Our ***** neural machine ***** translation system ranked first in Track two (development set not used for training) and third in Track one (training includes development data). | ||
| 2021.humeval-1.5 Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of ***** neural machine ***** translation. | ||
| P17-1012 The prevalent approach to ***** neural machine ***** translation relies on bi-directional LSTMs to encode the source sentence. | ||
| sentiment classification | 744 | |
| C16-1047 The sentiment augmented optimized vector obtained at the end is used for the training of SVM for ***** sentiment classification *****. | ||
| D19-1342 If a real-world ***** sentiment classification ***** system ignores the existence of conflict opinions when it is designed, it will incorrectly mixed conflict opinions into other sentiment polarity categories in action. | ||
| D19-1464 Based on it, a novel aspect-specific ***** sentiment classification ***** framework is raised. | ||
| P18-1089 Owing to these differences, cross-domain ***** sentiment classification ***** is still a challenging task. | ||
| 2020.emnlp-main.718 *****Sentiment classification***** on tweets often needs to deal with the problems of under-specificity, noise, and multilingual content. | ||
| graph | 728 | |
| P19-1431 State-of-the-art models for knowledge ***** graph ***** completion aim at learning a fixed embedding representation of entities in a multi-relational ***** graph ***** which can generalize to infer unseen entity relationships at test time. | ||
| 2020.iwpt-1.24 Unfortunately, we did not ensure a connected ***** graph ***** as part of our pipeline approach and our competition submission relied on a last-minute fix to pass the validation script which harmed our official evaluation scores significantly. | ||
| 2020.acl-demos.11 Our system, GAIA, enables seamless search of complex ***** graph ***** queries, and retrieves multimedia evidence including text, images and videos. | ||
| D18-1455 Building on recent advances in ***** graph ***** representation learning we propose a novel model, GRAFT-Net, for extracting answers from a question-specific sub***** graph ***** containing text and KB entities and relations. | ||
| 2021.emnlp-main.429 Our method outperforms prior state of the art, such as multi-scale learning and ***** graph ***** neural networks, by over 20 absolute F1 points | ||
| offensive language | 728 | |
| S19-2110 OffensEval addresses the problem of identifying and categorizing ***** offensive language ***** in social media in three subtasks; whether or not a content is offensive (subtask A), whether it is targeted (subtask B) towards an individual, a group, or other entities (subtask C). | ||
| S19-2099 Our results show 85.12% accuracy and 80.57% F1 scores in Subtask A (***** offensive language ***** identification), 87.92% accuracy and 50% F1 scores in Subtask B (categorization of offense types), and 69.95% accuracy and 50.47% F1 score in Subtask C (offense target identification). | ||
| W19-3508 We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of abusive language, which are sometimes overlapping (racism, sexism, hate speech, ***** offensive language *****, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. self-attention) is better for abusive language detection using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task. | ||
| 2021.acl-long.210 On social media platforms, hateful and ***** offensive language ***** negatively impact the mental well-being of users and the participation of people from diverse backgrounds. | ||
| 2020.osact-1.16 The use of social media platforms has become more prevalent, which has provided tremendous opportunities for people to connect but has also opened the door for misuse with the spread of hate speech and ***** offensive language *****. | ||
| Transformer | 715 | |
| 2020.sustainlp-1.20 However, ***** Transformer ***** models remain computationally challenging since they are not efficient at inference-time compared to traditional approaches. | ||
| 2020.emnlp-main.337 We start by training a conditional ***** Transformer ***** language model to generate a new product review given other available reviews of the product. | ||
| P19-2030 In this paper, we propose a multi-hop attention for the ***** Transformer *****. | ||
| 2020.sustainlp-1.7 Large ***** Transformer ***** models have achieved state-of-the-art results in neural machine translation and have become standard in the field | ||
| 2020.emnlp-main.19 *****Transformer***** models have advanced the state of the art in many Natural Language Processing (NLP) tasks. | ||
| word alignment | 712 | |
| 2004.amta-papers.29 However, these methods achieve unsatisfactory alignment results when performing ***** word alignment ***** on a small-scale domain-specific bilingual corpus without terminological lexicons. | ||
| W17-1713 We use ***** word alignment ***** variance as an indicator for the non-compositionality of German and English noun compounds. | ||
| L14-1418 We argue that automatic ***** word alignment ***** allows for major innovations in searching parallel corpora. | ||
| 2021.acl-long.531 Experimental results show that our proposed model outperforms all previous approaches for monolingual ***** word alignment ***** as well as a competitive QA-based baseline, which was previously only applied to bilingual data. | ||
| L12-1595 We present a method for improving ***** word alignment ***** quality for phrase-based statistical machine translation by reordering the source text according to the target word order suggested by an initial ***** word alignment *****. | ||
| similarity | 706 | |
| S18-1067 Instead of regarding it as a 20-class classification problem we regard it as a text ***** similarity ***** problem. | ||
| 2020.semeval-1.5 We propose an approach that relies on translation and multilingual language models in order to compute the contextual ***** similarity ***** between pairs of words. | ||
| P18-1190 Semantic hashing has become a powerful paradigm for fast ***** similarity ***** search in many information retrieval systems. | ||
| D19-1495 Experiments on three event-related tasks, i.e., event ***** similarity *****, script event prediction and stock market prediction, show that our model obtains much better event embeddings for the tasks, achieving 78% improvements on hard ***** similarity ***** task, yielding more precise inferences on subsequent events under given contexts, and better accuracies in predicting the volatilities of the stock market. | ||
| R17-1053 The dataset contains word pairs with hand annotated scores that indicate the semantic ***** similarity ***** and semantic relatedness of the words | ||
| vector | 704 | |
| 2021.semeval-1.166 Models like Logistic Regression, LSTM, MLP, CNN were used, and pre-trained models like DistilBert were introduced to generate accurate ***** vector ***** representation for textual data. | ||
| C16-1222 The paper presents the results of the experiments as well as a text-matching model where the query shapes the ***** vector ***** space, a document is modelled by two or three ***** vector *****s in this ***** vector ***** space, and the query-document similarity score depends on the length of the ***** vector *****s and the relationships between them. | ||
| N18-2028 In doing so, a direction ***** vector ***** is introduced for each word, whose embedding is thus learned by not only word co-occurrence patterns in its context, but also the directions of its contextual words. | ||
| 2020.acl-srw.14 A feature ***** vector ***** extracted from the image conveys visual information, but its ability to describe the image is limited. | ||
| P19-1442 We show that our curated dataset provides an excellent signal for learning ***** vector ***** representations of sentence meaning, representing relations that can only be determined when the meanings of two sentences are combined | ||
| sentiment analysis | 688 | |
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including ***** sentiment analysis *****, text classification, and Word Sense Disambiguation. | ||
| Q14-1024 Such evaluations can be analyzed separately using signed social networks and textual ***** sentiment analysis *****, but this misses the rich interactions between language and social context. | ||
| N19-1242 Since ReviewRC has limited training examples for RRC (and also for aspect-based ***** sentiment analysis *****), we then explore a novel post-training approach on the popular language model BERT to enhance the performance of fine-tuning of BERT for RRC. | ||
| D19-1467 In this paper, we propose constrained attention networks (CAN), a simple yet effective solution, to regularize the attention for multi-aspect ***** sentiment analysis *****, which alleviates the drawback of the attention mechanism. | ||
| 2021.emnlp-main.362 In the cases of aspect-based ***** sentiment analysis *****, violation of the above issues may change the aspect and sentiment polarity. | ||
| bert model | 681 | |
| 2021.ccl-1.82 Finally we construct a question-and-answer pair and use it as the input of the *****BERT model***** to complete emotion classification. | ||
| 2020.starsem-1.13 *****BERT models***** have shown great promise toward continued system performance improvements compared with approaches relying on surface-level cues alone that demonstrate performance saturation. | ||
| 2020.semeval-1.207 We propose different *****Bert models***** trained on several offensive language classification and profanity datasets, and combine their output predictions in an ensemble model. | ||
| 2020.semeval-1.44 Among others, we show that results can be improved by using a two-step fine-tuning process, in which the *****BERT model***** is first fine-tuned on the full training set, and then further specialized towards a target domain. | ||
| 2020.lrec-1.82 The two modules were implemented by fine-tuning a *****BERT model*****, which is a recent successful neural network model. | ||
| classification | 680 | |
| W18-5106 We experimented with 6 ***** classification ***** models and our CNN model on a 10 K-fold cross-validation gave the best result with the prediction accuracy of 73.2%. | ||
| 2021.acl-demo.17 Our approach maximizes the efficiency of manual efforts by targeting only those comments for which human intervention is needed, e.g. due to high ***** classification ***** uncertainty. | ||
| 2020.alw-1.22 Unfortunately, machine learning is vulnerable to unintended bias in training data, which could have severe consequences, such as a decrease in ***** classification ***** performance or unfair behavior (e.g., discriminating minorities). | ||
| S18-1006 The last LSTM layer will output the hidden representations of texts, and they will be used in three ***** classification ***** task. | ||
| W17-1318 Our best models achieve significantly above the baselines, with 67.93% and 69.37% accuracies for subjectivity and sentiment ***** classification ***** respectively | ||
| classifiers | 679 | |
| 2020.coling-main.557 Specifically, we observe an average 5% improvement for the hate class F1 scores across all state-of-the-art hate speech ***** classifiers *****. | ||
| D18-1522 Exploiting only minimal linguistic clues and the contextual usage of a concept as manifested in textual data, we train sufficiently powerful ***** classifiers *****, obtaining high correlation with human labels. | ||
| 2020.emnlp-main.24 These ***** classifiers ***** are thus likely to have undesirable properties. | ||
| R19-1106 This paper compares how different machine learning ***** classifiers ***** can be used together with simple string matching and named entity recognition to detect locations in texts. | ||
| 2021.argmining-1.1 We then train ***** classifiers ***** to determine the types of tweets, achieving the best performance of 71% F1 | ||
| NER | 669 | |
| 2021.emnlp-main.424 We propose Chem***** NER *****, an ontology-guided, distantly-supervised method for fine-grained chemistry ***** NER ***** to tackle these challenges. | ||
| 2021.acl-long.140 Through experiments on E-commerce query ***** NER ***** and Biomedical ***** NER *****, we demonstrate that NEEDLE can effectively suppress the noise of the weak labels and outperforms existing methods. | ||
| 2020.insights-1.15 We attempt to replicate a named entity recognition (***** NER *****) model implemented in a popular toolkit and discover that a critical barrier to doing so is the inconsistent evaluation of improper label sequences. | ||
| S17-2159 As for AMR parsing, we added ***** NER ***** extensions to our SemEval-2016 general-domain AMR parser to handle the biomedical genre, rich in organic compound names, achieving Smatch F1=54.0%. | ||
| I17-2017 We present Segment-level Neural CRF, which combines neural networks with a linear chain CRF for segment-level sequence modeling tasks such as named entity recognition (***** NER *****) and syntactic chunking | ||
| computational | 662 | |
| 2020.lrec-1.303 However, Brown clustering has high ***** computational ***** complexity and does not lend itself to parallel computation. | ||
| 2021.acl-long.386 Finally, we consider coping with limited ***** computational ***** resources, as real-life applications require eSPD on mobile devices. | ||
| 2021.eval4nlp-1.20 Text generation is a highly active area of research in the ***** computational ***** linguistic community. | ||
| N18-1119 However, while theoretically sound, existing approaches have ***** computational ***** complexities that are either linear (Hokamp and Liu, 2017) or exponential (Anderson et al., 2017) in the number of constraints. | ||
| 2021.emnlp-main.828 The experiments show that PermuteFormer uniformly improves the performance of Performer with almost no ***** computational ***** overhead and outperforms vanilla Transformer on most of the tasks | ||
| relation | 658 | |
| 2021.emnlp-main.435 To solve these issues, in this research, we conduct a comprehensive examination of different techniques to add medical knowledge into a pre-trained BERT model for clinical ***** relation ***** extraction. | ||
| 2021.emnlp-main.218 In this paper, we adapt the popular dependency parsing model, the biaffine parser, to this entity ***** relation ***** extraction task. | ||
| W17-4305 The structure of SA-LSTM changes according to dependency structure of each sentence, so that SA-LSTM can model the whole tree structure of dependency ***** relation ***** in an architecture engineering way. | ||
| W19-1903 The SVM-based system and a neural system obtain comparable results, with the SVM system doing better on concepts and the neural system performing better on ***** relation ***** extraction tasks. | ||
| 2021.emnlp-main.750 To this end, we represent each ***** relation ***** (edge) in a KG as a vector field on several manifolds | ||
| machine | 658 | |
| 2008.amta-srw.4 Feedback from translators reveals a variety of attitudes towards ***** machine ***** translation, with some supporting and others contradicting several points of conventional wisdom regarding the relationship between ***** machine ***** translation and human translators. | ||
| 2020.evalnlgeval-1.4 We present NUBIA, a methodology to build automatic evaluation metrics for text generation using only ***** machine ***** learning models as core components. | ||
| 2021.eacl-main.140 Experimental results show that even though these models can perform decently on the task, there remains a gap between ***** machine ***** and human performance, especially in out-of-domain settings. | ||
| 2020.acl-main.419 While attention mechanisms are claimed to achieve interpretability, little is known about the actual relationships between ***** machine ***** and human attention. | ||
| L10-1020 These two systems, acting complementarily, could bridge the gap between ***** machine ***** learning and rule-based approaches | ||
| QA | 657 | |
| 2020.emnlp-main.439 The experimental results indicate significant improvements in the domain adaptation of ***** QA ***** models outperforming current state-of-the-art methods. | ||
| R17-1018 Finally, we propose to use extrinsic evaluation with respect to a ***** QA ***** task as an automatic evaluation method for chatbot systems. | ||
| K17-1028 We find that there are two ingredients necessary for building a high-performing neural ***** QA ***** system: first, the awareness of question words while processing the context and second, a composition function that goes beyond simple bag-of-words modeling, such as recurrent neural networks. | ||
| E17-1036 Our RNN based model generates ***** QA ***** pairs with an accuracy of 33.61 percent and performs 110.47 percent (relative) better than a state-of-the-art template based method for generating natural language question from keywords. | ||
| I17-4033 Multi-choice question answering in exams is a typical *****QA***** task. | ||
| nmt model | 648 | |
| 2019.icon-1.7 We used a pre-trained *****NMT model***** to map a query in the source language into an equivalent query in the target language. | ||
| 2021.emnlp-main.477 In this paper, we introduce a modular framework for incorporating lemma constraints in neural MT (NMT) in which linguistic knowledge and diverse types of *****NMT models***** can be flexibly applied. | ||
| 2021.emnlp-main.2 However, it is under-explored that whether the MPE can help to facilitate the cross-lingual transferability of *****NMT model*****. | ||
| 2020.wmt-1.71 Although there exist various architectures and analyses, the effectiveness of different context-aware *****NMT models***** is not well explored yet. | ||
| D18-1037 We exploit a top-down tree-structured model called DRNN (Doubly-Recurrent Neural Networks) first proposed by Alvarez-Melis and Jaakola (2017) to create an *****NMT model***** called Seq2DRNN that combines a sequential encoder with tree-structured decoding augmented with a syntax-aware attention model. | ||
| Multilingual | 644 | |
| 2020.semeval-1.293 In this paper, we built several pre-trained models to participate SemEval-2020 Task 12: ***** Multilingual ***** Offensive Language Identification in Social Media. | ||
| N18-2096 ***** Multilingual ***** speakers mix languages they tweet to address a different audience, express certain feelings, or attract attention. | ||
| K18-2022 We present SParse, our Graph-Based Parsing model submitted for the CoNLL 2018 Shared Task: *****Multilingual***** Parsing from Raw Text to Universal Dependencies (Zeman et al., 2018). | ||
| 2021.calcs-1.19 *****Multilingual***** models have demonstrated impressive cross-lingual transfer performance. | ||
| K18-2004 This paper describes the ICS PAS system which took part in CoNLL 2018 shared task on *****Multilingual***** Parsing from Raw Text to Universal Dependencies. | ||
| abstractive summarization | 643 | |
| 2021.acl-long.470 In this work, we exploit large pre-trained transformer-based models and address long-span dependencies in ***** abstractive summarization ***** using two methods: local self-attention; and explicit content selection. | ||
| 2021.eacl-main.220 Despite advances in modeling techniques, ***** abstractive summarization ***** models still suffer from several key challenges: (i) layout bias: they overfit to the style of training corpora; (ii) limited abstractiveness: they are optimized to copying n-grams from the source rather than generating novel abstractive summaries; (iii) lack of transparency: they are not interpretable. | ||
| 2020.emnlp-main.506 However, ensuring the factual consistency of the generated summaries for ***** abstractive summarization ***** systems is a challenge. | ||
| 2020.acl-main.457 Sequence-to-sequence models for ***** abstractive summarization ***** have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive. | ||
| C18-1121 Considering a correct summary is semantically entailed by the source sentence, we incorporate entailment knowledge into ***** abstractive summarization ***** models | ||
| vectors | 640 | |
| S18-1166 These measures correspond to offset ***** vectors ***** which are concatenated as features, mainly to improve upon the F1score, with the best accuracy. | ||
| D17-1193 We propose a new relational similarity measure based on the combination of word2vec's CBOW input and output ***** vectors ***** which outperforms concurrent vector representations, when used for unsupervised clustering on SemEval 2010 Relation Classification data. | ||
| 2021.eacl-main.89 First, we combine the linguistic principles of hypernym transitivity and intersective modifier-noun composition to generate additional pairs of ***** vectors *****, such as “small dog - dog” or “small dog - animal”, for which a hypernymy relationship can be assumed. | ||
| D19-1576 In this work, we propose a novel universal grammar induction approach that represents language identities with continuous ***** vectors ***** and employs a neural network to predict grammar parameters based on the representation. | ||
| P18-1003 To this end, we first introduce a variant of GloVe, in which there is an explicit connection between word ***** vectors ***** and PMI weighted co-occurrence ***** vectors ***** | ||
| augmentation | 627 | |
| 2020.acl-main.529 Experiments on Chinese-English, English-French, and English-German translation benchmarks show that AdvAug achieves significant improvements over theTransformer (up to 4.9 BLEU points), and substantially outperforms other data ***** augmentation ***** techniques (e.g.back-translation) without using extra corpora. | ||
| 2020.acl-main.631 Unlike existing ***** augmentation ***** approaches, ours is controllable and allows to generate more diversified sentences. | ||
| 2020.textgraphs-1.2 Our paper proposes a novel graph-based approach to solve entity ***** augmentation *****. | ||
| 2020.smm4h-1.9 Data ***** augmentation ***** increased our training and validation corpora from 13,172 tweets to 28,094 tweets. | ||
| I17-2053 We also experiment with data ***** augmentation ***** techniques to further increase the amount of training data | ||
| utterances | 623 | |
| 2020.emnlp-main.597 Many recent ERC methods use graph-based neural networks to take the relationships between the ***** utterances ***** of the speakers into account. | ||
| L16-1119 These methods require a high number of natural language ***** utterances ***** to train the speech recognition engine and to assess the quality of the system. | ||
| Q13-1026 In most work on this topic, however, ***** utterances ***** in a conversation are treated independently and discourse structure information is largely ignored. | ||
| 2020.findings-emnlp.108 In this paper, we propose a novel Semantic Matching and Aggregation Network where semantic components are distilled from ***** utterances ***** via multi-head self-attention with additional dynamic regularization constraints. | ||
| 2021.eacl-main.248 We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic ***** utterances ***** by modeling common error modes | ||
| pretrained | 622 | |
| 2020.acl-main.745 Recent years have witnessed the burgeoning of ***** pretrained ***** language models (LMs) for text-based natural language (NL) understanding tasks. | ||
| N19-3012 In this study, we effectively combine these two approaches in the context of multimodal NMT and explore how we can take full advantage of ***** pretrained ***** word embeddings to better translate rare words. | ||
| 2021.emnlp-main.99 All code and ***** pretrained ***** models will be released as further steps towards larger reproducible benchmarks for African languages. | ||
| 2021.nlp4convai-1.13 In addition, the available ***** pretrained ***** models are trained on general domain language, creating a mismatch between the pretraining language and the downstream domain language. | ||
| D17-1098 Our method uses constrained beam search to force the inclusion of selected tag words in the output, and fixed, ***** pretrained ***** word embeddings to facilitate vocabulary expansion to previously unseen tag words | ||
| Spoken | 621 | |
| W17-2804 In this work, we present the empirical evaluation of an adaptive ***** Spoken ***** Language Understanding chain for robotic commands, that explicitly depends on the operational environment during both the learning and recognition stages. | ||
| 2020.nlp4convai-1.11 Slot Filling (SF) is one of the sub-tasks of ***** Spoken ***** Language Understanding (SLU) which aims to extract semantic constituents from a given natural language utterance. | ||
| 2020.findings-emnlp.442 *****Spoken***** languages are ever-changing, with new words entering them all the time. | ||
| L14-1183 The Database for Spoken German (Datenbank für Gesprochenes Deutsch, DGD2, http://dgd.ids-mannheim.de) is the central platform for publishing and disseminating spoken language corpora from the Archive of *****Spoken***** German (Archiv für Gesprochenes Deutsch, AGD, http://agd.ids-mannheim.de) at the Institute for the German Language in Mannheim. | ||
| 2004.amta-papers.7 *****Spoken***** Translation, Inc. (STI) of Berkeley, CA has developed a commercial system for interactive speech-to-speech machine translation designed for both high accuracy and broad linguistic and topical coverage. | ||
| ASR | 621 | |
| 2014.amta-researchers.20 In this paper, we outline a statistical framework for analyzing the impact of specific ***** ASR ***** error types on translation quality in a speech translation pipeline. | ||
| 2021.iwslt-1.21 We utilize state-of-the-art models combined with several data augmentation, multi-task and transfer learning approaches for the automatic speech recognition (***** ASR *****) and machine translation (MT) steps of our cascaded system. | ||
| L08-1496 This paper aims at studying the interest of morphosyntactic information as a useful resource for ***** ASR *****. | ||
| W17-2620 We explore transfer learning based on model adaptation as an approach for training ***** ASR ***** models under constrained GPU memory, throughput and training data. | ||
| 2020.sltu-1.7 We build the single ***** ASR ***** grapheme set via taking the union over each language-specific grapheme set, and we find such multilingual graphemic hybrid ***** ASR ***** model can perform language-independent recognition on all 7 languages, and substantially outperform each monolingual ***** ASR ***** model | ||
| dialog | 613 | |
| 2020.emnlp-main.274 We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented ***** dialog *****s, including linguistic features and underlying structured annotations, namely speaker information, ***** dialog ***** acts, and goals. | ||
| 2020.sigdial-1.40 Reinforcement learning (RL) methods have been widely used for learning ***** dialog ***** policies. | ||
| 2020.coling-main.170 As a conversational intelligence task, visual ***** dialog ***** entails answering a series of questions grounded in an image, using the ***** dialog ***** history as context. | ||
| 2020.nlpbt-1.9 We extend the MAC network architecture with Context-aware Attention and Memory (CAM), which attends over control states in past ***** dialog ***** turns to determine the necessary reasoning operations for the current question. | ||
| 2021.emnlp-main.621 Most reinforcement learning methods for ***** dialog ***** policy learning train a centralized agent that selects a predefined joint action concatenating domain name, intent type, and slot name | ||
| SMT | 612 | |
| 2013.iwslt-evaluation.10 This work describes the statistical machine translation (***** SMT *****) systems of RWTH Aachen University developed for the evaluation campaign International Workshop on Spoken Language Translation (IWSLT) 2013. | ||
| W19-5428 For both translation directions, we prepared state-of-the-art statistical (***** SMT *****) and neural (NMT) machine translation systems. | ||
| L10-1292 These results indicate that grammar checker techniques are a useful complement to ***** SMT *****. | ||
| 2010.iwslt-papers.10 We argue for the necessity to include crosssentence dependencies in ***** SMT ***** | ||
| 2010.amta-papers.21 A method is presented for incremental re-training of an *****SMT***** system, in which a local phrase table is created and incrementally updated as a file is translated and post-edited. | ||
| encoder | 612 | |
| 2020.emnlp-main.371 Character representations can easily be added in a sequence-to-sequence model in either one ***** encoder ***** or as a fully separate ***** encoder *****, with improvements that are robust to different language models, languages and data sets. | ||
| 2020.loresmt-1.11 We find that using a Transformer for the ***** encoder ***** and decoder performs best, improving accuracy by over 4 points compared to previous work. | ||
| D18-1110 Existing neural semantic parsers mainly utilize a sequence ***** encoder *****, i.e., a sequential LSTM, to extract word order features while neglecting other valuable syntactic information such as dependency or constituent trees. | ||
| C18-1265 The main goal targeted in this research is to provide richer information on the ***** encoder ***** side and redesign the decoder accordingly to benefit from such information. | ||
| W17-1002 In this paper, we propose decoupling the ***** encoder ***** and decoder networks, and training them separately | ||
| Statistical | 606 | |
| 2010.jec-1.7 This paper describes our work on building and employing ***** Statistical ***** Machine Translation systems for TV subtitles in Scandinavia. | ||
| 2020.acl-main.660 It has been exactly a decade since the first establishment of SPMRL, a research initiative unifying multiple research efforts to address the peculiar challenges of *****Statistical***** Parsing for Morphologically-Rich Languages (MRLs). | ||
| 1998.amta-papers.37 *****Statistical***** models have recently been applied to machine translation with interesting results. | ||
| 2003.mtsummit-papers.37 *****Statistical***** techniques for machine translation offer promise for rapid development in response to unexpected requirements, but realizing that potential requires rapid acquisition of required resources as well. | ||
| W19-8659 *****Statistical***** generators increasingly dominate the research in NLG. | ||
| subtask | 590 | |
| 2020.semeval-1.293 For the English ***** subtask ***** B, we adopted the method of adding Auxiliary Sentences (AS) to transform the single-sentence classification task into a relationship recognition task between sentences. | ||
| S18-1021 Our system, trained using transfer learning, achieves 0.776 and 0.763 respectively for Pearson correlation coefficient and weighted quadratic kappa metrics on the ***** subtask ***** evaluation dataset. | ||
| 2021.semeval-1.151 We utilize a fusion of logistic regression, decision tree, and fine-tuned DistilBERT for tackling ***** subtask ***** 1. | ||
| S17-2132 Our system ranked 14th out of 39 submissions in ***** subtask ***** A, 5th out of 24 submissions in ***** subtask ***** B, and 3rd out of 16 submissions in ***** subtask ***** D. | ||
| S17-2052 The results show that our methods have the great effectiveness for both ***** subtask ***** A and ***** subtask ***** C | ||
| sentence pair | 584 | |
| S17-2015 This paper describes FCICU team systems that participated in SemEval-2017 Semantic Textual Similarity task (Task1) for monolingual and cross-lingual ***** sentence pair *****s. | ||
| 2020.emnlp-main.207 With the segmenter and the two methods combined, we compile a high-quality Bengali-English parallel corpus comprising of 2.75 million ***** sentence pair *****s, more than 2 million of which were not available before. | ||
| N19-1302 However, for multi-hop QA tasks, which require reasoning with multiple sentences, it remains unclear how best to utilize entailment models pre-trained on large scale datasets such as SNLI, which are based on ***** sentence pair *****s. | ||
| 2012.amta-papers.7 In order to improve quality, corpus collection efforts often attempt to fix or remove misaligned ***** sentence pair *****s. | ||
| 2020.emnlp-main.376 Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for 5K distinct ***** sentence pair *****s in the English dative alternation. | ||
| word vector | 582 | |
| N19-1111 This point is illustrated by ***** word vector *****s retrofitted with novel treatments of the FrameNet data (Fillmore and Baker, 2010). | ||
| P18-1003 While we may similarly expect that co-occurrence statistics can be used to capture rich information about the relationships between different words, existing approaches for modeling such relationships are based on manipulating pre-trained ***** word vector *****s. | ||
| 2020.acl-main.337 This paper presents an investigation on the distribution of ***** word vector *****s belonging to a certain word class in a pre-trained ***** word vector ***** space. | ||
| 2020.emnlp-main.682 Through extensive experimentation under various settings with synthetic and real data we showcase the importance of sequential modelling of ***** word vector *****s through time for semantic change detection. | ||
| C16-1130 Recently, researchers have shown promising results using ***** word vector *****s extracted from a neural network language model as features in WSD algorithms. | ||
| model | 581 | |
| 2008.amta-papers.19 We also build a cascaded translation ***** model ***** that dynamically shifts translation units from phrase level to word and morpheme phrase levels. | ||
| P19-1624 Experimental results on the WMT14 English-German and English-French benchmarks show that our ***** model ***** consistently improves performance over the strong Transformer ***** model *****, demonstrating the necessity and effectiveness of exploiting sentential context for NMT. | ||
| K18-1001 However, the simple graphical ***** model ***** structure belies the often complex non-local constraints between output labels. | ||
| 2019.gwc-1.11 The second ***** model ***** can achieve near-perfect accuracy. | ||
| 2020.coling-main.581 Deep pre-trained language ***** model *****s tend to become ubiquitous in the field of Natural Language Processing (NLP). | ||
| semantics | 570 | |
| 2020.findings-emnlp.8 In-depth analysis indicates that our method is highly effective in composing sentence ***** semantics *****. | ||
| 2021.acl-long.337 We then introduce a joint embedding loss and a matching learning loss to model the matching relationship between the text ***** semantics ***** and the label ***** semantics *****. | ||
| 2015.lilt-10.4 In recent years rich type theories developed for the ***** semantics ***** of programming languages have become influential in the ***** semantics ***** of natural language. | ||
| 2020.emnlp-main.345 Our user studies confirm that the learned LEs are explainable and capture domain ***** semantics *****. | ||
| S18-1013 Then, tweets are converted in the corresponding vector representation and given as input to the neural network with the aim of learning the different ***** semantics ***** contained in each emotion taken into account by the SemEval task | ||
| textual entailment | 562 | |
| 2020.inlg-1.19 We use the NLI model to check ***** textual entailment ***** between the input data and the output text in both directions, allowing us to reveal omissions or hallucinations. | ||
| L10-1469 Many natural language processing tasks, including information extraction, question answering and recognizing ***** textual entailment *****, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-argument structure analysis. | ||
| I17-1011 We define a novel ***** textual entailment ***** task that requires inference over multiple premise sentences. | ||
| W18-5511 We show that this formulation leads to several advantages, including the ability to (i) perform zero-shot relation classification by exploiting relation descriptions, (ii) utilize existing ***** textual entailment ***** models, and (iii) leverage readily available ***** textual entailment ***** datasets, to enhance the performance of relation classification systems. | ||
| C16-1272 Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as ***** textual entailment ***** and semantic text similarity | ||
| word sense disambiguation | 559 | |
| W17-1915 This paper compares two approaches to ***** word sense disambiguation ***** using word embeddings trained on unambiguous synonyms. | ||
| 2020.lrec-1.362 The existing lexicons blur senses and frames of predicates, which needs to be refined to meet the tasks like ***** word sense disambiguation ***** and event extraction. | ||
| L12-1646 In this paper we investigate the role of multilingual features in improving ***** word sense disambiguation *****. | ||
| 2021.eacl-main.294 In this paper we use statistical measures such as entropy to give an updated analysis of the complexity of the NP-complete Most Probable Sentence problem for pCFGs, which can then be applied to ***** word sense disambiguation ***** and inference tasks. | ||
| L10-1015 Given the recent trend to evaluate the performance of *****word sense disambiguation***** systems in a more application-oriented set-up, we report on the construction of a multilingual benchmark data set for cross-lingual word sense disambiguation. | ||
| language models | 557 | |
| 2020.coling-main.581 Deep pre-trained ***** language models ***** tend to become ubiquitous in the field of Natural Language Processing (NLP). | ||
| 2020.emnlp-main.162 Trained with these contextually generated vokens, our visually-supervised ***** language models ***** show consistent improvements over self-supervised alternatives on multiple pure-language tasks such as GLUE, SQuAD, and SWAG. | ||
| 2021.calcs-1.20 Multilingual ***** language models ***** have shown decent performance in multilingual and cross-lingual natural language understanding tasks. | ||
| 2021.alta-1.26 Our empirical experiments reveal that these modern pretrained ***** language models ***** suffer from high variance, and the ensemble method can improve the model performance. | ||
| 2020.acl-demos.10 However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technical proficiency needed to train and deploy large-scale *****language models*****. | ||
| extraction | 551 | |
| P18-2094 To our knowledge, this paper is the first to report such double embeddings based CNN model for aspect ***** extraction ***** and achieve very good results. | ||
| Q16-1009 Answer sentence ranking and answer ***** extraction ***** are two key challenges in question answering that have traditionally been treated in isolation, i.e., as independent tasks. | ||
| 2020.ccl-1.87 We refer to tourism InfoBox for semi-structured knowledge ***** extraction ***** and leverage deep learning algorithms to extract entities and relations from unstructured travel notes, which are colloquial and high-noise, and then we fuse the extracted knowledge from two sources. | ||
| P19-1522 Traditional approaches to the task of ACE event ***** extraction ***** usually depend on manually annotated data, which is often laborious to create and limited in size | ||
| L12-1283 This work is part of a project for MWE ***** extraction ***** and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| Annotation | 544 | |
| 2020.lrec-1.872 *****Annotation***** tools are a valuable asset for the construction of labelled textual datasets. | ||
| L06-1392 *****Annotation***** projects dealing with complex semantic or pragmatic phenomena face the dilemma of creating annotation schemes that oversimplify the phenomena, or that capture distinctions conventional reliability metrics cannot measure adequately. | ||
| 2020.conll-1.28 *****Annotation***** styles express guidelines that direct human annotators in what rules to follow when creating gold standard annotations of text corpora. | ||
| L10-1037 *****Annotation***** Science, a discipline dedicated to developing and maturing methodology for the annotation of language resources, is playing a prominent role in the fields of computational and corpus linguistics. | ||
| P18-2071 *****Annotation***** corpus for discourse relations benefits NLP tasks such as machine translation and question answering. | ||
| Unsupervised | 539 | |
| N19-1255 ***** Unsupervised ***** document representation learning is an important task providing pre-trained features for NLP applications. | ||
| P19-1482 ***** Unsupervised ***** text style transfer aims to alter text styles while preserving the content, without aligned data for supervision. | ||
| 2020.coling-main.227 ***** Unsupervised ***** dependency parsing aims to learn a dependency parser from sentences that have no annotation of their correct parse trees | ||
| 2021.naacl-main.89 *****Unsupervised***** translation has reached impressive performance on resource-rich language pairs such as English-French and English-German. | ||
| 2021.wmt-1.106 This paper describes our submission for the shared task on *****Unsupervised***** MT and Very Low Resource Supervised MT at WMT 2021. | ||
| ontology | 537 | |
| L10-1437 We first construct our ***** ontology ***** manually by relating our concepts from Arabic Linguistics to the upper concepts of GOLD, furthermore an information extraction algorithm is implemented to automatically enrich the ***** ontology *****. | ||
| C18-1224 Because ontologies provide a way of structuring this information and making it accessible to agents and computational systems generally, efforts are underway to incorporate the extracted information to an ***** ontology ***** hub of Natural Language Processing semantic role labeling resources, the Rich Event Ontology. | ||
| L08-1272 FarsNet is an ***** ontology ***** whose elements are lexicalized in Persian. | ||
| I17-1068 Typically, relation extraction models are trained to extract instances of a relation ***** ontology ***** using only training data from a single language. | ||
| 2020.lrec-1.630 In addition to classical morphosyntax and dependency structure, the treebank was enriched with a lexical-semantic layer covering named entities, a semantic type ***** ontology ***** for nouns and adjectives and a framenet-inspired semantic classification of verbs | ||
| Lexical | 533 | |
| W16-5320 Moreover, we show how it combines to ***** Lexical ***** Model for Ontologies (lemon), for the transformation of lexical networks into the semantic web formats. | ||
| L06-1362 *****Lexical***** information for South African Bantu languages is not readily available in the form of machine-readable lexicons. | ||
| P17-1170 *****Lexical***** ambiguity can impede NLP systems from accurate understanding of semantics. | ||
| 2021.emnlp-main.803 *****Lexical***** disambiguation is a major challenge for machine translation systems, especially if some senses of a word are trained less often than others. | ||
| 2020.coling-main.107 *****Lexical***** substitution, i.e. | ||
| adversarial | 532 | |
| 2020.nlposs-1.18 TextAttack is an open-source Python toolkit for ***** adversarial ***** attacks, ***** adversarial ***** training, and data augmentation in NLP. | ||
| 2020.findings-emnlp.170 We introduce AISLe, which combines ***** adversarial ***** learning with importance sampling to strike a balance between precision and coverage. | ||
| 2021.starsem-1.30 The results show that ***** adversarial ***** training is effective universally, and PQAT further improves the performance. | ||
| K18-1007 Furthermore, we show that ***** adversarial ***** examples transfer among model architectures, and that the proposed ***** adversarial ***** training procedure improves the robustness of NLI models to ***** adversarial ***** examples. | ||
| 2020.conll-1.48 We also perform data augmentation including text swap, word substitution and paraphrase and prove its efficiency in combating various (though not all) ***** adversarial ***** attacks at the same time | ||
| LSTM | 528 | |
| 2020.trac-1.22 In this paper, we analyze and compare four explanation methods for different offensive language classifiers: an interpretable machine learning model (naive Bayes), a model-agnostic explanation method (LIME), a model-based explanation method (LRP), and a self-explanatory model (***** LSTM ***** with an attention mechanism). | ||
| 2020.semeval-1.131 In the shared task of assessing the funniness of edited news headlines, which is a part of the SemEval 2020 competition, we preprocess datasets by replacing abbreviation, stemming words, then merge three models including Light Gradient Boosting Machine (LightGBM), Long Short-Term Memory (***** LSTM *****), and Bidirectional Encoder Representation from Transformer (BERT) by taking the average to perform the best. | ||
| S18-1063 Our model combines CNN and ***** LSTM ***** layers to capture both local and long-range contextual information for tweet representation. | ||
| R19-1121 Recurrent Neural Network Language Models composed of *****LSTM***** units, especially those augmented with an external memory, have achieved state-of-the-art results in Language Modeling. | ||
| C16-1179 We experiment with different ways of training *****LSTM***** networks to predict RST discourse trees. | ||
| Previous | 524 | |
| 2020.findings-emnlp.193 ***** Previous ***** work either directly applies a discriminative source parser to the target language, ignoring unannotated target corpora, or employs an unsupervised generative parser that can leverage unannotated target data but has weaker representational power than discriminative parsers. | ||
| 2021.argmining-1.7 ***** Previous ***** work tackled sufficiency assessment as a standard text classification problem, not modeling the inherent relation of premises and conclusion. | ||
| N19-1184 ***** Previous ***** neural models addressed this problem using an attention mechanism that attends to sentences that are likely to express the relations. | ||
| D17-1231 ***** Previous ***** work on dialog act (DA) classification has investigated different methods, such as hidden Markov models, maximum entropy, conditional random fields, graphical models, and support vector machines. | ||
| 2021.emnlp-main.703 ***** Previous ***** work has shown that human evaluations in NLP are notoriously under-powered | ||
| Language | 524 | |
| 2021.eacl-demos.26 The only option to enable and to benefit from multilingualism is through ***** Language ***** Technologies (LT), i.e., Natural ***** Language ***** Processing and Speech Technologies. | ||
| 2020.emnlp-main.325 *****Language***** drift has been one of the major obstacles to train language models through interaction. | ||
| K17-1002 *****Language***** acquisition can be modeled as a statistical inference problem: children use sentences and sounds in their input to infer linguistic structure. | ||
| R19-1121 Recurrent Neural Network Language Models composed of LSTM units, especially those augmented with an external memory, have achieved state-of-the-art results in *****Language***** Modeling. | ||
| W19-1707 *****Language***** models have broad adoption in predictive typing tasks. | ||
| retrieval | 522 | |
| K19-1006 We also obtain human ratings on ***** retrieval ***** outputs to better assess the impact of incidentally matching image-caption pairs that were not associated in the data, finding that automatic evaluation substantially underestimates the quality of the retrieved results. | ||
| 2003.mtsummit-systems.10 In the current implementation, trilingual (J/E/K) patent ***** retrieval ***** is available. | ||
| D17-1096 Our embeddings prove useful in textual tasks requiring aural reasoning like text-based sound ***** retrieval ***** and discovering Foley sound effects (used in movies). | ||
| L08-1345 The GermaNet Explorer exhibits various ***** retrieval *****, sort, filter and visualization functions for words/synsets and also provides an insight into the modeling of GermaNets semantic relations as well as its representation as a graph | ||
| D18-1212 Through the joint exploitation of these constraints in an adversarial manner, the underlying cross-language semantics relevant to ***** retrieval ***** tasks are better preserved in the embedding space. | ||
| contextual | 520 | |
| 2021.naacl-main.370 We perform a series of experiments and set performance baselines on this dataset, using monolingual and multilingual ***** contextual ***** language models. | ||
| 2021.naacl-industry.1 We examine how additional ***** contextual ***** signals (from previous messages, time, and subject) affect the performance of a commercial text prediction model. | ||
| P16-5007 A growing number of approaches leverage external knowledge to address the issue of inadequate ***** contextual ***** information that accompanies the short texts. | ||
| D19-1348 Initial experimental results demonstrate a 23.9% absolute improvement in mean average precision over the baseline model by incorporating ***** contextual ***** features, and a processing speed 14x faster than a text-based technique. | ||
| P19-1174 In this paper, we propose a reordering mechanism to learn the reordering embedding of a word based on its ***** contextual ***** information | ||
| word embeddings | 519 | |
| S17-2031 The first stage deals with constructing neural ***** word embeddings *****, the components of sentence embeddings. | ||
| W17-1411 We investigate whether ***** word embeddings ***** offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. | ||
| C16-1121 We present a successful collaboration of ***** word embeddings ***** and co-training to tackle in the most difficult test case of semantic role labeling: predicting out-of-domain and unseen semantic frames. | ||
| N19-1061 Several recent works tackle this problem, and propose methods for significantly reducing this gender bias in ***** word embeddings *****, demonstrating convincing results. | ||
| D19-5621 Neural models that eliminate the softmax bottleneck by generating ***** word embeddings ***** (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters. | ||
| representations | 517 | |
| D19-1145 Experimental results on NIST Chinese-to-English and WMT14 English-to-German translation tasks show that the proposed approach consistently boosts performance over both the absolute and relative sequential position ***** representations *****. | ||
| P19-1298 Neural machine translation (NMT) takes deterministic sequences for source ***** representations *****. | ||
| N19-1419 Recent work has improved our ability to detect linguistic knowledge in word ***** representations *****. | ||
| W19-0502 In any case, the system works on surface forms rather than on ***** representations ***** of any kind. | ||
| 2021.acl-tutorials.2 This tutorial will provide audience with a systematic introduction of (i) knowledge ***** representations ***** of events, (ii) various methods for automated extraction, conceptualization and prediction of events and their relations, (iii) induction of event processes and properties, and (iv) a wide range of NLU and commonsense understanding tasks that benefit from aforementioned techniques | ||
| temporal | 513 | |
| R19-1092 However, ***** temporal ***** data may be too sparse to build robust word embeddings and to discriminate significant drifts from noise. | ||
| R17-1103 We focus on detecting after and before ***** temporal ***** relations and design a weakly supervised learning approach that extracts thousands of regular event pairs and learns a contextual ***** temporal ***** relation classifier simultaneously. | ||
| 2020.emnlp-main.51 In this process, one can induce event complexes that organize multi-granular events with ***** temporal ***** order and membership relations interweaving among them. | ||
| P17-2071 Here, I show that ***** temporal ***** word analogies (“word w_1 at time t_α is like word w_2 at time t_β”) can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. | ||
| W19-2906 To develop a better understanding, we propose a mechanistic model of perceptual decision making that interacts with a simulated task environment with ***** temporal ***** dynamics | ||
| generalization | 508 | |
| 2021.semeval-1.56 We show that self-training, active learning and data augmentation techniques can improve the ***** generalization ***** ability of the model on the unlabeled target domain data without accessing source domain data. | ||
| 2021.naacl-main.257 Deep Learning-based NLP systems can be sensitive to unseen tokens and hard to learn with high-dimensional inputs, which critically hinder learning ***** generalization *****. | ||
| 2021.alta-1.7 Our work shows that transformer-based models can improve text-pair classification by modifying the fine-tuning step to exploit shallow features while improving model ***** generalization *****, with only a slight reduction in efficiency. | ||
| 2020.acl-srw.23 We propose an interpretable approach for event extraction that mitigates the tension between ***** generalization ***** and interpretability by jointly training for the two goals. | ||
| W18-2503 The features make Texar particularly suitable for technique sharing and ***** generalization ***** across different text generation applications | ||
| information | 507 | |
| L10-1362 Question answering (QA) systems aim at retrieving precise ***** information ***** from a large collection of documents. | ||
| L06-1015 That is, the retrieved documents from both systems are shown to the judges without any ***** information ***** about the search techniques. | ||
| L14-1009 Our formalization is based on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for subjective ***** information ***** extraction. | ||
| 2020.lrec-1.74 We highlight how thinking aloud affects interpretation of dialogue acts in our setting and how to best capture that ***** information *****. | ||
| D18-1241 We present QuAC, a dataset for Question Answering in Context that contains 14K ***** information *****-seeking QA dialogs (100K questions in total). | ||
| response generation | 505 | |
| 2021.nlp4convai-1.18 In this work, we aim to construct a robust sentence representation learning model, that is specifically designed for dialogue ***** response generation *****, with Transformer-based encoder-decoder structure. | ||
| P18-1102 Empirical studies show that our model can significantly outperform the state-of-the-art ***** response generation ***** models under both automatic and human evaluations. | ||
| D19-1197 Since the paired data now is no longer enough to train a neural generation model, we consider leveraging the large scale of unpaired data that are much easier to obtain, and propose ***** response generation ***** with both paired and unpaired data. | ||
| 2021.acl-long.342 Experimental results on both dialogue understanding and ***** response generation ***** tasks show the superiority of our model. | ||
| 2021.inlg-1.37 In this paper, we propose a simple technique called Affective Decoding for empathetic ***** response generation *****. | ||
| Linguistic | 503 | |
| L14-1732 ***** Linguistic ***** configurations are mutable and can be refined and evolved over time as understanding of documentary needs improves. | ||
| 2021.emnlp-main.793 *****Linguistic***** typology generally divides synthetic languages into groups based on their morphological fusion. | ||
| 2020.lrec-1.696 With this paper, we provide an overview of ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: The CLARIN Concept Registry, LexInfo, Universal Parts of Speech, Universal Dependencies and UniMorph are linked with the Ontologies of *****Linguistic***** Annotation and through it with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes. | ||
| 2021.ranlp-1.166 *****Linguistic***** typology is an area of linguistics concerned with analysis of and comparison between natural languages of the world based on certain of their linguistic features. | ||
| 2010.amta-commercial.16 This paper describes PROMT system deployment at PayPal including: PayPal localization process challenges and requirements to a machine translation solution; Technical specifications of PROMT Translation Server Developer Edition; *****Linguistic***** customization performed by PROMT team for PayPal; Engineering Customization performed by PROMT team for PayPal; Additional customized development performed by PROMT team on behalf of PayPal; PROMT engine and PayPal productivity gains and cost savings. | ||
| social | 501 | |
| 2021.emnlp-main.123 Furthermore, these incorrectly gendered translations have the potential to reflect or amplify ***** social ***** biases. | ||
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational ***** social ***** science. | ||
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, unprocessed, ***** social ***** media text. | ||
| 2020.emnlp-main.355 We applied ALICE in two visual recognition tasks, bird species classification and ***** social ***** relationship classification. | ||
| N18-4018 While some work has been done on code-mixed ***** social ***** media text and in emotion prediction separately, our work is the first attempt which aims at identifying the emotion associated with Hindi-English code-mixed ***** social ***** media text. | ||
| architectures | 500 | |
| W19-3716 We believe using similar ***** architectures ***** for other languages can show interesting results. | ||
| 2020.sigdial-1.21 Second, we experiment with different ***** architectures ***** to model entities, Dialogue Acts and their combination and evaluate their performance in predicting human coherence ratings on SWBD-Coh. | ||
| 2021.naacl-industry.29 Thus we experiment with automatically optimizing the model ***** architectures ***** on the task at hand via neural architecture search (NAS). | ||
| C18-1273 To alleviate the need for human labor in generating hand-crafted features, methods that utilize neural ***** architectures ***** such as Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN) to automatically extract such features have been proposed and have shown great results. | ||
| P19-3007 The Transformer is a sequence model that forgoes traditional recurrent ***** architectures ***** in favor of a fully attention-based approach | ||
| annotators | 498 | |
| 2020.coling-main.107 Besides, we analyze the types of semantic relations between target words and their substitutes generated by different models or given by ***** annotators *****. | ||
| L12-1216 In the current version, PDT contains 306 expressions (within the total 43,955 of sentences) that were labeled by ***** annotators ***** as being an AltLex. | ||
| L16-1420 The ***** annotators ***** do not have medical training nor do they present specific medical problems. | ||
| 2020.aacl-srw.3 Using the human responses that come with the dev set of SNLI, we train both regression and classification models to predict how many ***** annotators ***** will answer a question correctly and then project the difficulty estimates onto the full SNLI train set to create the curriculum. | ||
| L10-1513 The web-interface connects administrators and ***** annotators ***** to a central repository for all data and simplifies many of the housekeeping tasks while keeping requirements at a minimum (that is, users only need an internet connection and a well-behaved browser) | ||
| distant supervision | 495 | |
| L14-1091 Here, we extend the ***** distant supervision ***** approach to template-based event extraction, focusing on the extraction of passenger counts, aircraft types, and other facts concerning airplane crash events. | ||
| C18-1193 The early work in this field (Keith etal., 2017) proposed a ***** distant supervision ***** framework based on Expectation Maximization (EM) to deal with the multiple appearances of the names in documents. | ||
| K17-1009 Here we propose an approach that uses answer ranking as ***** distant supervision ***** for learning how to select informative justifications, where justifications serve as inferential connections between the question and the correct answer while often containing little lexical overlap with either. | ||
| 2020.findings-emnlp.32 We also provide the RELX-Distant dataset, which includes hundreds of thousands of sentences with relations from Wikipedia and Wikidata collected by ***** distant supervision ***** for these languages. | ||
| D19-1039 Given two entities, ***** distant supervision ***** exploits sentences that directly mention them for predicting their semantic relation. | ||
| SemEval | 493 | |
| P19-1279 We also show that models initialized with our task agnostic representations, and then tuned on supervised relation extraction datasets, significantly outperform the previous methods on ***** SemEval ***** 2010 Task 8, KBP37, and TACRED | ||
| 2020.acl-main.629 The resulting system not only outperforms all existing transition-based models, but also matches the best fully-supervised accuracy to date on the ***** SemEval ***** 2015 Task 18 datasets among previous state-of-the-art graph-based parsers. | ||
| 2020.acl-main.582 To verify the performance of SDRN, we manually build three datasets based on ***** SemEval ***** 2014 and 2015 benchmarks. | ||
| S19-2159 In this paper, we present a news bias prediction system, which we developed as part of a ***** SemEval ***** 2019 task. | ||
| S18-1068 This paper describes our submissions to Task 2 in ***** SemEval ***** 2018, i.e., Multilingual Emoji Prediction | ||
| decoder | 489 | |
| 2014.iwslt-papers.8 In this paper we explore segmentation strategies for the stream ***** decoder ***** a method for decoding from a continuous stream of input tokens, rather than the traditional method of decoding from sentence segmented text. | ||
| P19-1200 In particular, a hierarchy of stochastic layers between the encoder and ***** decoder ***** networks is employed to abstract more informative and semantic-rich latent codes. | ||
| 2020.ngt-1.3 We propose a novel procedure for training multiple Transformers with tied parameters which compresses multiple models into one enabling the dynamic choice of the number of encoder and ***** decoder ***** layers during decoding. | ||
| W19-4115 Then, the model encourages the ***** decoder ***** to use the keywords for response generation. | ||
| 2021.emnlp-main.265 Non-autoregressive neural machine translation, which decomposes the dependence on previous target tokens from the inputs of the ***** decoder *****, has achieved impressive inference speedup but at the cost of inferior accuracy | ||
| decoding | 489 | |
| 2020.acl-main.103 However, such a ***** decoding ***** method ignores the intrinsic hierarchical compositionality existing in the keyphrase set of a document. | ||
| P19-1418 We instead propose an adaptive ***** decoding ***** method to avoid such intermediate representations. | ||
| 2021.inlg-1.41 In our evaluation, we find that increased informativeness through pragmatic ***** decoding ***** generally lowers quality and, somewhat counter-intuitively, increases repetitiveness in captions. | ||
| D18-1191 At ***** decoding ***** time, we greedily select higher scoring labeled spans. | ||
| P17-1048 This paper proposes joint ***** decoding ***** algorithm for end-to-end ASR with a hybrid CTC/attention architecture, which effectively utilizes both advantages in ***** decoding ***** | ||
| Additionally | 482 | |
| 2020.wmt-1.102 ***** Additionally *****, we focus on English to German and demonstrate how to combine BLEURT's predictions with those of YiSi and use alternative reference translations to enhance the performance. | ||
| D19-1097 ***** Additionally *****, we make the CAIS dataset publicly available for the research community. | ||
| 2020.emnlp-demos.1 ***** Additionally *****, the online system can extract information in various tasks, including relational triple extraction, slot & intent detection, event extraction, and so on. | ||
| L12-1600 ***** Additionally *****, we compare the gaze behavior of the human subjects to evaluate saliency regions in the multimodal and visual only conditions. | ||
| L12-1558 ***** Additionally *****, we have annotated a substantial part of the corpus (i.e. the Chatty subset) to provide a gold standard for the evaluation of future approaches to automatic (Flemish) chat language normalization | ||
| subtasks | 481 | |
| 2021.germeval-1.13 The task aims at extending the identification of offensive language, by including additional ***** subtasks ***** that identify comments which should be prioritized for fact-checking by moderators and community managers. | ||
| D19-1644 In this work, we propose a neural two-stage approach to recognizing discontiguous and overlapping entities by decomposing this problem into two ***** subtasks *****: 1) it first detects all the overlapping spans that either form entities on their own or present as segments of discontiguous entities, based on the representation of segmental hypergraph, 2) next it learns to combine these segments into discontiguous entities with a classifier, which filters out other incorrect combinations of segments. | ||
| 2021.wanlp-1.35 Our final approach achieved macro F1-scores of 0.216, 0.235, 0.054, and 0.043 in the four ***** subtasks *****, and we were ranked second in MSA identification ***** subtasks ***** and fourth in DA identification ***** subtasks *****. | ||
| 2021.inlg-1.11 Our experiments on the ***** subtasks ***** show that it is still challenging for a state-of-the-art vision encoder to capture useful information from videos to generate accurate commentaries. | ||
| 2020.semeval-1.97 This competition consists of three ***** subtasks ***** with different levels of granularity: (1) classification of sentences as definitional or non-definitional, (2) labeling of definitional sentences, and (3) relation classification | ||
| unsupervised | 481 | |
| D19-1161 We evaluate this approach empirically through ***** unsupervised ***** labeled constituency parsing. | ||
| 2020.challengehml-1.5 The proposed framework is evaluated through an interaction experiment between a human tutor and a robot, and compared to an existing ***** unsupervised ***** grounding framework. | ||
| K19-1027 In this paper, we alleviate the local optimality of back-translation by learning a policy (takes the form of an encoder-decoder and is defined by its parameters) with future rewarding under the reinforcement learning framework, which aims to optimize the global word predictions for ***** unsupervised ***** neural machine translation. | ||
| 2021.emnlp-main.840 There has been much progress in ***** unsupervised ***** learning of entailment graphs for this purpose. | ||
| 2021.sigmorphon-1.22 We analyze the distributions of different error classes using two ***** unsupervised ***** tasks as testbeds: converting informally romanized text into the native script of its language (for Russian, Arabic, and Kannada) and translating between a pair of closely related languages (Serbian and Bosnian) | ||
| neural networks | 476 | |
| W18-6230 This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple ***** neural networks *****. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph ***** neural networks ***** to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on deep ***** neural networks *****, which makes decisions about form and content in one go without explicit feature extraction. | ||
| Q18-1005 Drawing inspiration from recent efforts to empower ***** neural networks ***** with a structural bias (Cheng et al., 2016; Kim et al., 2017), we propose a model that can encode a document while automatically inducing rich structural dependencies. | ||
| C18-1156 When used along with content-based feature extractors such as convolutional ***** neural networks *****, we see a significant boost in the classification performance on a large Reddit corpus. | ||
| grammatical | 475 | |
| L08-1437 An entry of RoSyllabiDict, in both formats, contains information about unsyllabified word, its syllabified correspondent, ***** grammatical ***** information and/or type of syllabification, if it is the case. | ||
| 2021.acl-srw.33 It is reported that ***** grammatical ***** information is useful for machine translation (MT) task. | ||
| 2021.rocling-1.40 A fine-grained view on the ***** grammatical ***** behavior and political implications is attempted, too. | ||
| P19-3033 The method involves parsing the sentence, identifying ***** grammatical ***** elements, and ranking related elements to recommend a higher level of ***** grammatical ***** element | ||
| 2020.sltu-1.22 We also considered the interaction of adjectives with other ***** grammatical ***** means, especially other parts of speech, e.g. | ||
| Therefore | 474 | |
| D18-1385 ***** Therefore *****, finding an automatic approach to verify rumors with multimedia content is a pressing task. | ||
| 2021.emnlp-main.228 ***** Therefore *****, different word pairs from the contexts within and across n-grams are weighted in the model and facilitate RE accordingly. | ||
| L10-1108 ***** Therefore ***** we can apply many methods proposed in the data mining domain to our task. | ||
| D19-5226 ***** Therefore *****, the UCSY corpus for WAT 2019 is not identical to those used in WAT 2018. | ||
| 2021.emnlp-main.176 ***** Therefore *****, this paper explores Domain-Lifelong Learning for Dialogue State Tracking (DLL-DST), which aims to continually train a DST model on new data to learn incessantly emerging new domains while avoiding catastrophically forgetting old learned domains | ||
| grammatical error correction | 474 | |
| 2020.lrec-1.835 The lack of large-scale datasets has been a major hindrance to the development of NLP tasks such as spelling correction and ***** grammatical error correction ***** (GEC). | ||
| 2020.acl-srw.5 Recently, several studies have focused on improving the performance of ***** grammatical error correction ***** (GEC) tasks using pseudo data. | ||
| 2020.nlptea-1.11 In this study, we treat the grammar error diagnosis (GED) task as a ***** grammatical error correction ***** (GEC) problem and propose a method that incorporates a pre-trained model into an encoder-decoder model to solve this problem. | ||
| 2020.coling-main.200 The incorporation of data augmentation method in ***** grammatical error correction ***** task has attracted much attention. | ||
| P18-3016 The proposed method is applied to two tasks: machine translation and ***** grammatical error correction *****. | ||
| dependency parsing | 472 | |
| 2021.acl-long.494 However, the improvement is limited due to the inaccuracy of the ***** dependency parsing ***** results and the informal expressions and complexity of online reviews. | ||
| 2021.nodalida-main.32 In much previous research on ***** dependency parsing *****, related languages have successfully been used. | ||
| P19-1442 We show that with ***** dependency parsing ***** and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. | ||
| D19-5515 In this paper, we analyze the effect of manual as well as automatic lexical normalization for ***** dependency parsing *****. | ||
| Q16-1023 We present a simple and effective scheme for ***** dependency parsing ***** which is based on bidirectional-LSTMs (BiLSTMs) | ||
| adversarial training | 468 | |
| 2021.semeval-1.98 We solve the problem as a binary classification problem and also experiment with data augmentation and ***** adversarial training ***** techniques. | ||
| 2020.wnut-1.57 The ensemble of the models trained using ***** adversarial training ***** also produces similar result. | ||
| P19-1103 Performing ***** adversarial training ***** using our perturbed datasets improves the robustness of the models. | ||
| N19-1134 Finally, we employ ***** adversarial training ***** to improve the results further by leveraging the labeled data from synchronous domains and by explicitly modeling the distributional shift in two domains. | ||
| P19-1262 After ***** adversarial training *****, the baseline's performance improves but is still limited on the adversarial test. | ||
| models | 464 | |
| 2021.eacl-main.243 While the attention heatmaps produced by neural machine translation (NMT) ***** models ***** seem insightful, there is little evidence that they reflect a model's true internal reasoning. | ||
| 2020.acl-main.503 Abstention policies based solely on the model's softmax probabilities fare poorly, since ***** models ***** are overconfident on out-of-domain inputs. | ||
| 2021.unimplicit-1.6 we inspect the behavior of visually grounded and text-only ***** models *****, finding systematic divergences from human judgments even when a model's overall performance is high. | ||
| 2020.coling-main.207 This is a difficult subtask of natural language generation since ***** models ***** are limited to the given identifiers, without any specific descriptive information regarding the inputs, when generating the text. | ||
| D19-1072 To avoid this, ***** models ***** often resort to greedy search which only allows them to explore a limited portion of the latent space | ||
| treebank | 463 | |
| 2020.lrec-1.892 This paper describes Contemplata, an annotation platform that offers a generic solution for ***** treebank ***** building as well as ***** treebank ***** enrichment with relations between syntactic nodes. | ||
| 2020.lrec-1.639 In this paper, we describe the underlying `theory' dubbed ABC Grammar that is taken as a basis for our ***** treebank *****, outline the general construction of the corpus, and report on some preliminary results applying the ***** treebank ***** in a semantic parsing system for generating logical representations of sentences. | ||
| L08-1239 Therefore, we present some experiments that apply statistical subcategorization extraction methods, known in literature, on an Italian ***** treebank ***** that exploits a rich set of dependency relations that can be annotated at different degrees of specificity. | ||
| L16-1602 These modifications concern mainly (1) Enrichments of well identified features of the norm: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) Deeper modifications concerning the units or features annotated: clarification between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) A recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer (temporal annotation on a ***** treebank *****). | ||
| K19-1029 We assess recent approaches to multilingual contextual word representations (CWRs), and compare them for crosslingual transfer from a language with a large ***** treebank ***** to a language with a small or nonexistent ***** treebank *****, by sharing parameters between languages in the parser itself | ||
| generative | 462 | |
| D18-1160 In this work, we propose a novel ***** generative ***** model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured ***** generative ***** prior. | ||
| 2020.findings-emnlp.165 Furthermore, we demonstrate that the learned ***** generative ***** commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA (76.9% to 78.4% in dev accuracy) by generating additional context. | ||
| D18-1378 For discourse relations, Limbic adopts a ***** generative ***** process regularized by a Markov Random Field. | ||
| 2020.findings-emnlp.192 Experimental results on two datasets indicate that our model significantly outperforms several competitive ***** generative ***** models in terms of automatic and human evaluation. | ||
| 2020.nlptea-1.10 Our solution combined data augmentation methods, spelling check methods, and ***** generative ***** grammatical correction methods, and achieved the best recall score in the Top 1 Correction track | ||
| style transfer | 460 | |
| W19-2309 By measuring ***** style transfer ***** quality, meaning preservation, and the fluency of generated outputs, we demonstrate that our method is able both to produce high-quality output while maintaining the flexibility to suggest syntactically rich stylistic edits. | ||
| 2021.emnlp-main.730 In this paper, we explore Non-AutoRegressive (NAR) decoding for unsupervised text ***** style transfer *****. | ||
| 2020.aacl-main.33 In this paper, the task of aspect-level sentiment controllable ***** style transfer ***** is introduced, where each of the aspect-level sentiments can individually be controlled at the output. | ||
| D18-1430 Some work explored ***** style transfer ***** but suffered from expensive expert labeling of poem styles. | ||
| 2021.emnlp-main.599 Experiments show the uniformly designed metrics achieve stronger or comparable correlations with human judgement compared to state-of-the-art metrics in each of diverse tasks, including text summarization, ***** style transfer *****, and knowledge-grounded dialog. | ||
| discourse relation | 460 | |
| D17-1142 We observe that the evidence-conclusion ***** discourse relation *****s, also known as arguments, often appear in product reviews, and we hypothesise that some argument-based features, e.g. | ||
| P19-1065 We firstly propose a method to automatically extract the implicit ***** discourse relation ***** argument pairs and labels from a dataset of dialogic turns, resulting in a novel corpus of ***** discourse relation ***** pairs; the first of its kind to attempt to identify the ***** discourse relation *****s connecting the dialogic turns in open-domain discourse. | ||
| E17-1027 Inferring implicit ***** discourse relation *****s in natural language text is the most difficult subtask in discourse parsing. | ||
| L10-1070 Following the D-LTAG approach to discourse, we have developed a lexically anchored description of attribution, considering this relation, contrary to the approach in the PDTB, independently from other ***** discourse relation *****s. | ||
| W18-5040 The research described in this paper examines how to learn linguistic knowledge associated with ***** discourse relation *****s from unlabeled corpora. | ||
| relation classification | 458 | |
| C16-1138 However, existing neural networks for ***** relation classification ***** are usually of shallow architectures (e.g., one-layer convolutional neural networks or recurrent networks). | ||
| P18-1046 We adopt the generator to filter distant supervision training dataset and redistribute the false positive instances into the negative set, in which way to provide a cleaned dataset for ***** relation classification *****. | ||
| C16-1139 Recently, a neural network architecture has been proposed to automatically extract features for ***** relation classification *****. | ||
| 2021.ranlp-1.101 To adapt to these constraints, we experiment with an active-learning based relation extraction pipeline, consisting of a binary LSTM-based lightweight model for detecting the relations that do exist, and a state-of-the-art model for ***** relation classification ***** | ||
| 2020.findings-emnlp.121 Temporal ***** relation classification ***** is the pair-wise task for identifying the relation of a temporal link (TLINKs) between two mentions, i.e. | ||
| distributional | 456 | |
| 2016.gwc-1.58 On the other hand, ***** distributional ***** models provide much better coverage. | ||
| Q18-1048 We extract predicate-argument structures from multiple-source news corpora, and compute local ***** distributional ***** similarity scores to learn entailments between predicates with typed arguments (e.g., person contracted disease). | ||
| 1963.earlymt-1.8 How a more and more refined classification can eliminate one by one the ambiguity resulting from all possible constructions arising from juxtaposition of two ***** distributional ***** classes is discussed in detail. | ||
| P18-2057 Methods for unsupervised hypernym detection may broadly be categorized according to two paradigms: pattern-based and ***** distributional ***** methods. | ||
| 2020.emnlp-main.513 Existing approaches for this task are discriminative, combining ***** distributional ***** and lexical semantics in an implicit rather than direct way | ||
| text | 449 | |
| 2021.naacl-main.413 When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances interpretability, but also allows unknown, descriptive (i.e., *****text*****-based) risk factors to be identified. | ||
| 2021.acl-long.195 Recently, there has been significant progress in studying neural networks to translate *****text***** descriptions into SQL queries. | ||
| N19-1070 Most of the proposed supervised and unsupervised methods for keyphrase generation are unable to produce terms that are valuable but do not appear in the *****text*****. | ||
| 2021.privatenlp-1.2 Differentially-private mechanisms for *****text***** generation typically add carefully calibrated noise to input words and use the nearest neighbor to the noised input as the output word. | ||
| 2021.acl-short.116 Implicit discourse relation classification is a challenging task, in particular when the *****text***** domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s). | ||
| Existing | 445 | |
| 2021.naacl-main.270 ***** Existing ***** work on tabular representation-learning jointly models tables and associated text using self-supervised objective functions derived from pretrained language models such as BERT. | ||
| 2021.ranlp-1.97 ***** Existing ***** approaches employ supervised models which are domain-dependent and require a large dataset of questions of known difficulty for training. | ||
| 2021.conll-1.40 ***** Existing ***** efforts often deal with the two problems separately regardless of their close essential correlations. | ||
| 2020.coling-main.236 ***** Existing ***** related methods usually incorporate extra submodels to help filter noise before the noisy data is input to main models. | ||
| 2020.lrec-1.667 ***** Existing ***** machine reading comprehension models are reported to be brittle for adversarially perturbed questions when optimizing only for accuracy, which led to the creation of new reading comprehension benchmarks, such as SQuAD 2.0 which contains such type of questions | ||
| entity | 445 | |
| 2021.emnlp-main.601 Introducing such multi label examples at the cost of annotating fewer examples brings clear gains on natural language inference task and ***** entity ***** typing task, even when we simply first train with a single label data and then fine tune with multi label examples. | ||
| L08-1137 After describing how we constructed a new, multi-tiered answer type hierarchy from the set of ***** entity ***** types recognized by Language Computer Corporation's CICEROLITE named ***** entity ***** recognition system, we describe how we used this hierarchy to annotate a new corpus of more than 10,000 English factoid questions. | ||
| 2020.lrec-1.249 In specific, existing Chinese corpora for ***** entity ***** linking were mainly constructed from noisy short texts, such as microblogs and news headings, where long texts were largely overlooked, which yet constitute a wider spectrum of real-life scenarios. | ||
| D19-6219 In comparison to existing datasets, MedMentions contains a far greater number of ***** entity ***** types, and thus represents a more challenging but realistic scenario in a real-world setting. | ||
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual named ***** entity ***** recognition. | ||
| Understanding | 444 | |
| 2021.ranlp-1.156 *****Understanding***** idioms is important in NLP. | ||
| 2021.semeval-1.39 *****Understanding***** tables is an important and relevant task that involves understanding table structure as well as being able to compare and contrast information within cells. | ||
| D19-1243 *****Understanding***** narratives requires reading between the lines, which in turn, requires interpreting the likely causes and effects of events, even when they are not mentioned explicitly. | ||
| D19-1332 *****Understanding***** time is crucial for understanding events expressed in natural language. | ||
| D19-1161 *****Understanding***** text often requires identifying meaningful constituent spans such as noun phrases and verb phrases. | ||
| semantic parsing | 443 | |
| 2021.emnlp-main.310 We conduct extensive experiments to study the research problems involved in continual ***** semantic parsing ***** and demonstrate that a neural semantic parser trained with TotalRecall achieves superior performance to one trained directly with the SOTA continual learning algorithms and achieves a 3-6 times speedup compared to re-training from scratch. | ||
| D17-1127 Existing studies on ***** semantic parsing ***** mainly focus on the in-domain setting. | ||
| Q17-1009 This paper explores extending shallow ***** semantic parsing ***** beyond lexical-unit triggers, using causal relations as a test case. | ||
| C18-1280 Most approaches to Knowledge Base Question Answering are based on ***** semantic parsing *****. | ||
| P19-1007 This game between a primal model (***** semantic parsing *****) and a dual model (logical form to query) forces them to regularize each other, and can achieve feedback signals from some prior-knowledge | ||
| relation extraction | 443 | |
| L08-1453 The second form of annotation identifies the spans of each target and comparison set, which is of interest for ***** relation extraction *****. | ||
| D18-1248 Distant supervision is an effective method to generate large scale labeled data for ***** relation extraction *****, which assumes that if a pair of entities appears in some relation of a Knowledge Graph (KG), all sentences containing those entities in a large unlabeled corpus are then labeled with that relation to train a relation classifier. | ||
| D19-1118 Unresolved coreference is a bottleneck for ***** relation extraction *****, and high-quality coreference resolvers may produce an output that makes it a lot easier to extract knowledge triples. | ||
| 2021.acl-long.277 Distant supervision automatically generates plenty of training samples for ***** relation extraction *****. | ||
| D18-1243 To mitigate this problem, we propose a novel word-level distant supervised approach for ***** relation extraction ***** | ||
| Shared | 442 | |
| W19-3024 It contributes to ***** Shared ***** Task A in the 2019 CLPsych workshop by predicting users' suicide risk given posts in the Reddit subforum r/SuicideWatch. | ||
| W18-0605 The Computational Linguistics and Clinical Psychology (CLPsych) 2018 ***** Shared ***** Task asked teams to predict cross-sectional indices of anxiety and distress, and longitudinal indices of psychological distress from a subsample of the National Child Development Study, started in the United Kingdom in 1958. | ||
| 2021.wmt-1.82 This paper describes the Charles University submission for the Terminology translation ***** Shared ***** Task at WMT21. | ||
| 2020.wmt-1.81 In this paper, we describe the Bering Lab's submission to the WMT 2020 ***** Shared ***** Task on Automatic Post-Editing (APE) | ||
| W17-1608 *****Shared***** tasks are increasingly common in our field, and new challenges are suggested at almost every conference and workshop. | ||
| IWSLT | 442 | |
| 2013.iwslt-evaluation.6 Our results denoted a 13.5% word error rate on the ***** IWSLT ***** 2013 ASR English test data set. | ||
| 2020.acl-main.251 Our approach, which we call ENGINE (ENerGy-based Inference NEtworks), achieves state-of-the-art non-autoregressive results on the ***** IWSLT ***** 2014 DE-EN and WMT 2016 RO-EN datasets, approaching the performance of autoregressive models. | ||
| 2013.iwslt-evaluation.22 This paper describes the University of Edinburgh (UEDIN) English ASR system for the ***** IWSLT ***** 2013 Evaluation | ||
| 2011.iwslt-evaluation.8 This paper describes the system developed by the LIG laboratory for the 2011 *****IWSLT***** evaluation. | ||
| 2016.iwslt-1.14 This paper describes the speech recognition system of IOIT for *****IWSLT***** 2016. | ||
| knowledge | 440 | |
| 2021.naacl-main.313 Therefore, we propose to adopt multi-task learning to transfer the AT ***** knowledge ***** to NAT models through encoder sharing. | ||
| L10-1595 In contrast, MR will make ***** knowledge ***** contained in text available in forms that machines can use for automated processing. | ||
| 2020.acl-main.735 However, for many cases, the joint tagging needs not only modeling from context features but also ***** knowledge ***** attached to them (e.g., syntactic relations among words); limited efforts have been made by existing research to meet such needs. | ||
| D18-1535 We focus on filling these ***** knowledge ***** gaps in the Science Entailment task, by leveraging an external structured ***** knowledge ***** base (KB) of science facts. | ||
| 2020.findings-emnlp.437 However, along with the proactive manner introduced into a dialogue agent, an issue arises that, with too many ***** knowledge ***** facts to express, the agent starts to talk endlessly, and even completely ignores what the other expresses in dialogue sometimes, which greatly harms the interest of the other chatter to continue the conversation | ||
| knowledge distillation | 438 | |
| 2021.naacl-main.310 We consider a scenario where the training is comprised of multiple stages and propose a dynamic ***** knowledge distillation ***** technique to alleviate the problem of catastrophic forgetting systematically. | ||
| 2020.emnlp-tutorials.4 After establishing these foundations, we will cover a wide range of techniques for improving efficiency, including ***** knowledge distillation *****, quantization, pruning, more efficient architectures, along with case studies and practical implementation tricks. | ||
| 2020.wmt-1.33 BPE(CITATION)), baseline model training, iterative back-translation, model ensemble, ***** knowledge distillation ***** and multilingual pre-training. | ||
| 2021.iwslt-1.3 Our primary submission is based on wait-k neural machine translation with sequence-level ***** knowledge distillation ***** to encourage literal translation. | ||
| 2020.emnlp-main.494 On prototypical language generation tasks such as translation and summarization, our method consistently outperforms other distillation algorithms, such as sequence-level ***** knowledge distillation *****. | ||
| disambiguation | 434 | |
| 1999.mtsummit-1.62 A probabilistic method is used for word sense ***** disambiguation ***** where the features taken are the surrounding six words. | ||
| 2021.semeval-1.93 We fine-tune XLM-RoBERTa model to solve the task of word in context ***** disambiguation *****, i.e., to determine whether the target word in the two contexts contains the same meaning or not. | ||
| N18-2054 This task is a difficult ***** disambiguation ***** problem in which one article must be selected among several candidate articles with similar titles and contents. | ||
| C16-1178 The experiments on Chinese Discourse Treebank show that the F1 scores of 0.7506, 0.7693, 0.7458, and 0.3134 are achieved for discourse usage ***** disambiguation *****, linking ***** disambiguation *****, relation type ***** disambiguation *****, and argument boundary identification, respectively, in a pipelined Chinese discourse parser. | ||
| S17-2070 The Duluth systems participated in all three subtasks, and relied on methods that included word sense ***** disambiguation ***** and measures of semantic relatedness | ||
| lexicon | 428 | |
| L14-1114 We describe a method to automatically extract a German ***** lexicon ***** from Wiktionary that is compatible with the finite-state morphological grammar SMOR. | ||
| L16-1463 The experiments show that usage of the developed ***** lexicon ***** improves the results over both the baseline and the publicly available ***** lexicon *****. | ||
| 2021.ranlp-1.67 Then we calculate the confidence of unlabeled data with ***** lexicon ***** and add them into labeled dataset for the robust pseudo-labeling approach. | ||
| 2020.signlang-1.11 Each subject viewed a set of 20 signs from the newly compiled Ghanaian sign language ***** lexicon ***** and was asked to replicate the signs | ||
| L06-1322 From our experimental results, we found that the correspondence between a group of adjectives and their category name was more suitable in our method than in the EDR ***** lexicon *****. | ||
| Computational | 419 | |
| 2021.naacl-main.174 ***** Computational ***** approaches have largely focused on classifying the frame of a full news article while framing signals are often subtle and local. | ||
| 2021.law-1.2 ***** Computational ***** resources such as semantically annotated corpora can play an important role in enabling speakers of indigenous minority languages to participate in government, education, and other domains of public life in their own language. | ||
| L14-1256 *****Computational***** Narratology is an emerging field within the Digital Humanities. | ||
| W19-3011 *****Computational***** linguistics holds promise for improving scientific integrity in clinical psychology, and for reducing longstanding inequities in healthcare access and quality. | ||
| W16-4803 *****Computational***** approaches for dialectometry employed Levenshtein distance to compute an aggregate similarity between two dialects belonging to a single language group. | ||
| natural language generation | 419 | |
| 2020.coling-main.420 Most current state-of-the art systems for generating English text from Abstract Meaning Representation (AMR) have been evaluated only using automated metrics, such as BLEU, which are known to be problematic for ***** natural language generation *****. | ||
| 2021.naacl-main.416 However, existing report generation systems, despite achieving high performances on ***** natural language generation ***** metrics such as CIDEr or BLEU, still suffer from incomplete and inconsistent generations. | ||
| 2020.coling-main.462 Recent advances in neural ***** natural language generation ***** have made possible remarkable progress on the task of keyphrase generation, demonstrated through improvements on quality metrics such as F1-score. | ||
| 2020.emnlp-main.701 Lexically constrained generation requires the target sentence to satisfy some lexical constraints, such as containing some specific words or being the paraphrase to a given sentence, which is very important in many real-world ***** natural language generation ***** applications. | ||
| 2001.mtsummit-papers.2 This paper presents an overview of the broad-coverage, application-independent ***** natural language generation ***** component of the NLP system being developed at Microsoft Research. | ||
| tokens | 418 | |
| 2021.naacl-main.170 Especially, widely used n-gram similarity metrics often fail to discriminate the incorrect answers since they equally consider all of the ***** tokens *****. | ||
| D19-1503 Most relation extraction models restrict inferring relations between ***** tokens ***** within a few neighboring sentences, mainly to avoid high computational complexity. | ||
| 2020.acl-main.434 We apply these tasks to examine the popular BERT, ELMo and GPT contextual encoders, and find that each of our tested information types is indeed encoded as contextual information across ***** tokens *****, often with near-perfect recoverability—but the encoders vary in which features they distribute to which ***** tokens *****, how nuanced their distributions are, and how robust the encoding of each feature is to distance. | ||
| C18-1124 DHA enables dynamic control of the ratios at which source and target contexts contribute to the generation of target words, offering a way to weakly induce structure relations among both source and target ***** tokens *****. | ||
| 2021.acl-long.9 Pre-trained S2S models or a Copy Mechanism are trained to copy the surface ***** tokens ***** from encoders to decoders, but they cannot guarantee constraint satisfaction | ||
| parsers | 418 | |
| 2021.spnlp-1.3 While AM dependency ***** parsers ***** have been shown to be fast and accurate across several graphbanks, they require explicit annotations of the compositional tree structures for training. | ||
| 2021.emnlp-main.823 One approach has been previously used in dependency ***** parsers ***** in practice, but remains undocumented in the parsing literature, and is considered a heuristic. | ||
| 2021.adaptnlp-1.7 We perform a systematic set of experiments using two neural constituency ***** parsers ***** to examine how different ***** parsers ***** behave in combination with different BERT models with varying source and target genres in English and Swedish. | ||
| 1997.mtsummit-papers.19 SYSTRAN'S dictionaries, along with its ***** parsers *****, transfer modules, and generators, have been tested on huge amounts of text, and contain large terminology databases covering various domains and detailed linguistic rules. | ||
| 2021.starsem-1.11 Our results indicate that these three baseline models exhibit poorer performance on sentences with predicate-argument structure with more than one level of embedding; we used compchains to characterize the errors made by these ***** parsers ***** and present examples of erroneous parses produced by the parser that were identified using compchains | ||
| word | 416 | |
| D17-1120 Our extensive evaluation over standard benchmarks and in multiple languages shows that sequence learning enables more versatile all-***** word *****s models that consistently lead to state-of-the-art results, even against ***** word ***** experts with engineered features. | ||
| D19-5904 The target outputs of many NLP tasks are ***** word ***** sequences. | ||
| 2021.naacl-main.87 In this paper, we propose WordDP to achieve certified robustness against ***** word ***** substitution attacks | ||
| L16-1056 In fact, such relations may provide the basis for the construction of more complex structures such as taxonomies, or be used as effective background knowledge for many ***** word ***** understanding applications. | ||
| 2020.findings-emnlp.266 In this paper, we focus on robustness of text classification against ***** word ***** substitutions, aiming to provide guarantees that the model prediction does not change if a ***** word ***** is replaced with a plausible alternative, such as a synonym | ||
| entity recognition | 415 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual named ***** entity recognition *****. | ||
| N18-1131 Most named ***** entity recognition ***** (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. | ||
| D19-1100 Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for named ***** entity recognition *****. | ||
| Q14-1037 We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), named ***** entity recognition ***** (coarse semantic typing), and entity linking (matching to Wikipedia entities). | ||
| 2020.emnlp-main.304 The idea of using multi-task learning approaches to address the joint extraction of entity and relation is motivated by the relatedness between the *****entity recognition***** task and the relation classification task. | ||
| sentiment | 413 | |
| S18-1056 In this paper, we describe the first attempt to perform transfer learning from ***** sentiment ***** to emotions. | ||
| W17-0909 While previous research found no contribution from ***** sentiment ***** analysis to the accuracy on this task, we demonstrate that ***** sentiment ***** is an important aspect. | ||
| N19-1038 For this, we present a novel strategy for learning fully interpretable negation rules via weak supervision: we apply reinforcement learning to find a policy that reconstructs negation rules from ***** sentiment ***** predictions at document level. | ||
| 2020.wnut-1.13 The focus lays on two tasks: determining which ***** sentiment ***** a fragment contains (***** sentiment ***** analysis) and investigating which fundamental social rights (education, employment, legal aid, etc.) are addressed in the fragment. | ||
| 2021.ranlp-srw.23 Among other things, we investigate co-occurrence of different emotions in the dataset, and the relationship between ***** sentiment ***** and emotion of textual instances | ||
| discourse | 412 | |
| Q15-1024 Our solution computes distributed meaning representations for each ***** discourse ***** argument by composition up the syntactic parse tree. | ||
| N18-1013 We argue that semantic meanings of a sentence or clause cannot be interpreted independently from the rest of a paragraph, or independently from all ***** discourse ***** relations and the overall paragraph-level ***** discourse ***** structure. | ||
| D18-1099 The text in many web documents is organized into a hierarchy of section titles and corresponding prose content, a structure which provides potentially exploitable information on ***** discourse ***** structure and topicality. | ||
| 2021.emnlp-main.40 This prevents researchers from building on top of previous annotation work and results in the existence, in ***** discourse ***** learning in particular, of many small class-imbalanced datasets | ||
| E17-1062 This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the ***** discourse ***** context and variation. | ||
| word similarity | 412 | |
| 2020.repl4nlp-1.6 Empirical results on several ***** word similarity ***** and word analogy benchmarks illustrate the efficacy of the proposed framework. | ||
| 2020.findings-emnlp.235 We evaluated our methods and showed its effectiveness on four intrinsic and extrinsic tasks: ***** word similarity *****, embedding numeracy, numeral prediction, and sequence labeling. | ||
| 2020.aacl-main.76 In particular, the well-known analogy “man is to computer-programmer as woman is to homemaker” is due to ***** word similarity ***** rather than bias. | ||
| D18-1169 In order to fill this evaluation gap, we propose Cambridge Rare word Dataset (Card-660), an expert-annotated ***** word similarity ***** dataset which provides a highly reliable, yet challenging, benchmark for rare word representation techniques. | ||
| W17-0811 We also present baseline scores for word representation models using state-of-the-art techniques for Urdu, Telugu and Marathi by evaluating them on newly created ***** word similarity ***** datasets. | ||
| system | 411 | |
| 2020.acl-main.123 Most studies on abstractive summarization report ROUGE scores between ***** system ***** and reference summaries. | ||
| 2020.inlg-1.6 It is unfair to expect neural data-to-text to produce high quality output when there are gaps between ***** system ***** input data and information contained in the training text. | ||
| N18-3001 We address data imbalance issues by implementing two ***** system ***** architectures using convolutional neural networks and logistic regression models. | ||
| 2020.semeval-1.39 However, for Subtask C, there is still a considerable gap between ***** system ***** and human performance. | ||
| 2020.emnlp-main.151 The dependencies between ***** system ***** and user utterances in the same turn and across different turns are not fully considered in existing multidomain dialogue state tracking (MDST) models | ||
| clustering | 408 | |
| 2020.lrec-1.303 We present efficient implementations of Brown ***** clustering ***** and the alternative Exchange ***** clustering ***** as well as a number of methods to accelerate the computation of both hierarchical and flat clusters. | ||
| C18-1004 Depending on system choices, the affinity scores can be further used in ***** clustering ***** or mention ranking. | ||
| 2020.lrec-1.124 With a small alphabet, it can function as a proxy of phonemes, and as one of its main uses, we carry out dialect ***** clustering *****: cluster a dialect/sub-language mixed corpus into sub-groups and see if they coincide with the conventional boundaries of dialects and sub-languages. | ||
| 2021.repl4nlp-1.15 Existing supervised models for text ***** clustering ***** find it difficult to directly optimize for ***** clustering ***** results. | ||
| 2021.eacl-main.198 We show that the use of a suitable fine-tuning objective and external knowledge in pre-trained transformer models yields significant improvements in the effectiveness of contextual embeddings for ***** clustering ***** | ||
| information extraction | 406 | |
| L14-1009 Our formalization is based on the BDI model (Belief, Desire and Intetion) and constitues a first step toward a unifying model for subjective ***** information extraction *****. | ||
| L10-1469 Many natural language processing tasks, including ***** information extraction *****, question answering and recognizing textual entailment, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-argument structure analysis. | ||
| W19-1903 In this paper we describe an evaluation of the potential of classical ***** information extraction ***** methods to extract drug-related attributes, including adverse drug events, and compare to more recently developed neural methods. | ||
| 2020.emnlp-main.461 Extracting event temporal relations is a critical task for ***** information extraction ***** and plays an important role in natural language understanding. | ||
| W17-4417 In this paper, we describe the Lithium Natural Language Processing (NLP) system - a resource-constrained, high-throughput and language-agnostic system for ***** information extraction ***** from noisy user generated text on social media. | ||
| natural language understanding | 401 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual ***** natural language understanding ***** tasks. | ||
| 2021.naacl-main.342 In the pursuit of ***** natural language understanding *****, there has been a long standing interest in tracking state changes throughout narratives. | ||
| 2020.emnlp-main.549 Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both ***** natural language understanding ***** and arithmetic computation. | ||
| 2020.emnlp-main.461 Extracting event temporal relations is a critical task for information extraction and plays an important role in ***** natural language understanding *****. | ||
| D17-1216 Reasoning with commonsense knowledge is critical for ***** natural language understanding *****. | ||
| neural network | 400 | |
| P19-1516 In this paper, we propose a ***** neural network ***** inspired multi- task learning framework that can simultaneously extract ADRs from various sources. | ||
| W18-6230 This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple ***** neural network *****s. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph ***** neural network *****s to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on deep ***** neural network *****s, which makes decisions about form and content in one go without explicit feature extraction. | ||
| 2020.lrec-1.624 These include a linear classifier as well as a ***** neural network ***** trained using a transformer word embedding model (BERT), and fine-tuned on the parliamentary speeches. | ||
| lexicons | 399 | |
| W19-4508 We explore ***** lexicons ***** from various application scenarios such as sentiment analysis and emotion detection. | ||
| C18-1074 However, most existing attention models did not take full advantage of sentiment ***** lexicons *****, which provide rich sentiment information and play a critical role in sentiment analysis. | ||
| 2021.naacl-main.350 While usage pressures might assign short words to frequent meanings (Zipf's law of abbreviation), the need for a productive and open-ended vocabulary, local constraints on sequences of symbols, and various other factors all shape the ***** lexicons ***** of the world's languages. | ||
| D18-1026 While post-processing specialization methods are applicable to arbitrary distributional vectors, they are limited to updating only the vectors of words occurring in external ***** lexicons ***** (i.e., seen words), leaving the vectors of all other words unchanged | ||
| 2004.amta-papers.29 However, these methods achieve unsatisfactory alignment results when performing word alignment on a small-scale domain-specific bilingual corpus without terminological ***** lexicons *****. | ||
| named entity | 398 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual ***** named entity ***** recognition. | ||
| N18-1131 Most ***** named entity ***** recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. | ||
| P19-1068 We also present and analyze the most discriminative features of our best performing model, before and after ***** named entity ***** removal. | ||
| D19-1100 Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for ***** named entity ***** recognition. | ||
| 2020.emnlp-main.711 State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as *****named entity***** recognition, graph-based reasoning, and question decomposition. | ||
| ii | 397 | |
| D19-6502 With this, we sidestep two drawbacks of current document-level systems: (i) we do not modify the training process so there is no increment in training time, and (***** ii *****) we do not require document-level annotated data. | ||
| L10-1084 More specifically, LX-Parser is being made available (i) as a downloadable, stand-alone parsing tool that can be run locally by its users; (***** ii *****) as a Web service that exposes an interface that can be invoked remotely and transparently by client applications; and finally (iii) as an on-line parsing service, aimed at human users, that can be accessed through any common Web browser. | ||
| L06-1236 Lexical information is encoded in the form of features in a Conditional Random Field tagger providing improved performance in cases where: i) limited training data is made available ***** ii *****) the data is case-less and iii) the test data genre or domain is different than that of the training data. | ||
| L06-1326 This new resource has a two-fold objective: (i) to be an important research tool which supports the development of MW expressions typologies and their lexicographic treatment; (***** ii *****) to be of major help in developing and evaluating language processing tools able of dealing with MW expressions. | ||
| 2020.coling-main.64 To this end, we (i) discuss the formalization of probing tasks for embeddings of image-caption pairs, (***** ii *****) define three concrete probing tasks within our general framework, (iii) train classifiers to probe for those properties, and (iv) compare various state-of-the-art embeddings under the lens of the proposed probing tasks | ||
| dialogue | 397 | |
| L10-1611 In a second evaluation, the user's gaze direction was analysed in order to assess the difference in the user's (gazing) behaviour if interacting with the computer versus the other ***** dialogue ***** partner. | ||
| 2021.reinact-1.6 In this paper we will argue that the nature of dogwhistle communication is essentially dialogical, and that to account for dogwhistle meaning we must consider dialogical events in which ***** dialogue ***** partners can draw different conclusions based on communicative events. | ||
| 2020.lrec-1.80 Then, we present descriptive statistics of the annotation, particularly focusing on which ***** dialogue ***** acts often follow each other across speakers and which ***** dialogue ***** acts overlap with gestural behaviour. | ||
| W18-5020 We also propose to use CNN-based reranker for obtaining responses having semantic correspondence with input ***** dialogue ***** acts. | ||
| 2020.lrec-1.74 We highlight how thinking aloud affects interpretation of ***** dialogue ***** acts in our setting and how to best capture that information. | ||
| sentence | 395 | |
| P18-4012 Being optimized for relation extraction at ***** sentence ***** level, many annotation tools lack in facilitating the annotation of relational structures that are widely spread across the text. | ||
| 2021.starsem-1.1 The evaluation of these models was performed in comparison with SDM, a framework specifically designed to integrate events in ***** sentence ***** meaning representations, and we conducted a detailed error analysis to investigate which factors affect their behavior. | ||
| L12-1117 Parallel aligned treebanks (PAT) are linguistic corpora annotated with morphological and syntactic structures that are aligned at ***** sentence ***** as well as sub-***** sentence ***** levels. | ||
| W19-2905 In this paper, extending the computational models employed in ***** sentence ***** processing to morphological processing, we performed a computational simulation experiment where, given incremental surprisal as a linking hypothesis, five computational models with different representational assumptions were evaluated against human reaction times in visual lexical decision experiments available from the English Lexicon Project (ELP), a “shared task” in the morphological processing literature. | ||
| 2020.coling-main.492 However, it is unclear whether performing extraction at ***** sentence ***** level is the best solution | ||
| Dialogue | 394 | |
| 2021.emnlp-main.365 *****Dialogue***** summarization has drawn much attention recently. | ||
| P19-1541 *****Dialogue***** contexts are proven helpful in the spoken language understanding (SLU) system and they are typically encoded with explicit memory representations. | ||
| W19-4112 *****Dialogue***** systems and conversational agents are becoming increasingly popular in modern society. | ||
| 2021.emnlp-main.400 *****Dialogue***** disentanglement aims to group utterances in a long and multi-participant dialogue into threads. | ||
| W18-5004 *****Dialogue***** personalization is an important issue in the field of open-domain chat-oriented dialogue systems. | ||
| unlabeled | 392 | |
| 2020.emnlp-main.724 In this paper, we explore the potential of only using the label name of each class to train classification models on ***** unlabeled ***** data, without using any labeled documents. | ||
| P17-1156 Instead of the source domain sentiment classifiers, our approach adapts the general-purpose sentiment lexicons to target domain with the help of a small number of labeled samples which are selected and annotated in an active learning mode, as well as the domain-specific sentiment similarities among words mined from ***** unlabeled ***** samples of target domain. | ||
| 2021.emnlp-main.512 Our method further opens up the door to leverage weakly-labeled or ***** unlabeled ***** images in a principled way to enhance VQA models. | ||
| W17-4404 We also compare normalization to strategies that leverage large amounts of ***** unlabeled ***** data kept in its raw form. | ||
| 2021.emnlp-main.222 Among them, keyword-driven methods are the mainstream where user-provided keywords are exploited to generate pseudo-labels for ***** unlabeled ***** texts | ||
| nmt system | 389 | |
| W19-5204 We apply our model to the output of existing *****NMT systems*****, and demonstrate that, while the human-judged quality improves in all cases, BLEU scores drop with forward-translated test sets. | ||
| N18-2082 In applying this process to a representative *****NMT system*****, we find its encoder appears most suited to supporting inferences at the syntax-semantics interface, as compared to anaphora resolution requiring world knowledge. | ||
| 2021.iwslt-1.20 A pipeline approach was explored for the low-resource speech translation task, using a hybrid HMM/TDNN automatic speech recognition system fed by wav2vec features, coupled to an *****NMT system*****. | ||
| W18-6413 TenTrans is an improved *****NMT system***** based on Transformer self-attention mechanism. | ||
| D18-1509 Recent advances in Neural Machine Translation (NMT) show that adding syntactic information to *****NMT systems***** can improve the quality of their translations. | ||
| Discourse | 382 | |
| R19-1059 ***** Discourse ***** relations between sentences are often represented as a tree, and the tree structure provides important information for summarizers to create a short and coherent summary | ||
| 2020.emnlp-main.224 *****Discourse***** relations describe how two propositions relate to one another, and identifying them automatically is an integral part of natural language understanding. | ||
| P17-2037 *****Discourse***** segmentation is a crucial step in building end-to-end discourse parsers. | ||
| L08-1612 *****Discourse***** structure and coherence relations are one of the main inferential challenges addressed by computational pragmatics. | ||
| 2012.amta-caas14.1 *****Discourse***** connectives can often signal multiple discourse relations, depending on their context. | ||
| machine learning | 381 | |
| 2021.latechclfl-1.8 We have evaluated multiple traditional ***** machine learning ***** approaches as well as transformer-based models pretrained on historical and contemporary language for a single-label text sequence emotion classification for the different emotion categories. | ||
| 2020.acl-main.264 Advanced ***** machine learning ***** techniques have boosted the performance of natural language processing. | ||
| C16-2028 TextPro-AL is a web-based application integrating four components: a ***** machine learning ***** based NLP pipeline, an annotation editor for task definition and text annotations, an incremental re-training procedure based on active learning selection from a large pool of unannotated data, and a graphical visualization of the learning status of the system. | ||
| 2020.challengehml-1.6 We use these multimodal data sources to construct a composite representation, which is used for training ***** machine learning ***** classifiers to predict the class labels. | ||
| 2020.smm4h-1.13 Our approaches relied on a combination of traditional ***** machine learning ***** and deep learning models. | ||
| empirically | 380 | |
| N19-1423 BERT is conceptually simple and ***** empirically ***** powerful. | ||
| 2020.spnlp-1.8 We ***** empirically ***** validate our strategies on two sequence labeling tasks, showing easier paths to strong performance than prior work, as well as further improvements with global energy terms. | ||
| 2020.sigmorphon-1.24 Here we ***** empirically ***** examine the capacity of word-pieces to capture morphology by investigating the task of multi-tagging in Modern Hebrew, as a proxy to evaluate the underlying segmentation. | ||
| 2006.amta-papers.23 We present and ***** empirically ***** compare a range of novel probabilistic finite-state transducer (PFST) models targeted at two major natural language string transduction tasks, transliteration selection and cognate translation selection. | ||
| D17-1308 We show that this geometric concentration depends on the ratio of positive to negative examples, and that it is neither theoretically nor ***** empirically ***** inherent in related embedding algorithms | ||
| NLG | 379 | |
| 2020.emnlp-main.230 Neural Natural Language Generation (***** NLG *****) systems are well known for their unreliability. | ||
| 2020.inlg-1.23 We conclude that due to a pervasive lack of clarity in reports and extreme diversity in approaches, human evaluation in ***** NLG ***** presents as extremely confused in 2020, and that the field is in urgent need of standard methods and terminology. | ||
| W18-6560 This can be used to add morphology to any kind of ***** NLG ***** application (e.g., a multi-language chatbot), without requiring computational linguistic knowledge by the integrator. | ||
| D17-1238 The majority of ***** NLG ***** evaluation relies on automatic metrics, such as BLEU . | ||
| L08-1392 Evaluating the output of *****NLG***** systems is notoriously difficult, and performing assessments of text quality even more so. | ||
| speech | 376 | |
| L10-1463 Our evaluation is aimed at ***** speech ***** recognition consumers and potential consumers with limited experience with readily available recognizers. | ||
| 2011.iwslt-evaluation.3 Specifically, we focus on 1) ***** speech ***** recognition for lecture-like data, 2) cross-domain translation using MAP adaptation, and 3) improved Arabic morphology for MT preprocessing. | ||
| 2019.iwslt-1.29 In this paper, we propose a training architecture which aims at making a neural machine translation model more robust against ***** speech ***** recognition errors. | ||
| 2020.acl-demos.37 The results of an evaluation with professional translators suggest that pen and touch interaction are suitable for deletion and reordering tasks, while ***** speech ***** and multi-modal combinations of select & ***** speech ***** are considered suitable for replacements and insertions. | ||
| 2020.sltu-1.22 We also considered the interaction of adjectives with other grammatical means, especially other parts of ***** speech *****, e.g. | ||
| Evaluation | 373 | |
| 2001.mtsummit-papers.42 MT ***** Evaluation ***** framework represent two of the principal efforts in Machine Translation ***** Evaluation ***** (MTE) over the past decade | ||
| 2021.emnlp-main.701 *****Evaluation***** metrics are a key ingredient for progress of text generation systems. | ||
| 2021.eacl-main.59 Tasks, Datasets and *****Evaluation***** Metrics are important concepts for understanding experimental scientific papers. | ||
| 2015.jeptalnrecital-court.9 *****Evaluation***** approaches for unsupervised rank-based keyword assignment are nearly as numerous as are the existing systems. | ||
| 2020.ccl-1.90 *****Evaluation***** discrepancy and overcorrection phenomenon are two common problems in neural machine translation (NMT). | ||
| conversational | 373 | |
| 2021.acl-long.138 In this paper, we present a neural model for joint dropped pronoun recovery (DPR) and ***** conversational ***** discourse parsing (CDP) in Chinese ***** conversational ***** speech. | ||
| P18-5002 This tutorial surveys neural approaches to ***** conversational ***** AI that were developed in the last few years. | ||
| W19-5912 In this work, we present the first complete attempt at concurrently training ***** conversational ***** agents that communicate only via self-generated language and show that they outperform supervised and deep learning baselines. | ||
| W17-5407 This paper challenges a cross-genre document retrieval task, where the queries are in formal writing and the target documents are in ***** conversational ***** writing. | ||
| 2021.emnlp-main.86 To tackle this issue, we study a new task, named Speaker Persona Detection (SPD), which aims to detect speaker personas based on the plain ***** conversational ***** text | ||
| machine translation systems | 372 | |
| D19-1100 Although over 100 languages are supported by strong off-the-shelf ***** machine translation systems *****, only a subset of them possess large annotated corpora for named entity recognition. | ||
| 2021.insights-1.10 In this work, we conduct a comprehensive investigation on one of the centerpieces of modern ***** machine translation systems *****: the encoder-decoder attention mechanism. | ||
| P19-1177 Users of ***** machine translation systems ***** may desire to obtain multiple candidates translated in different ways. | ||
| L16-1270 In contrast to most alternative systems, ours does not rely on either parallel corpora or ***** machine translation systems *****, making it suitable for low-resource languages as the language to be learned. | ||
| 2021.americasnlp-1.25 Our best neural ***** machine translation systems ***** used multilingual pretraining, ensembling, finetuning, training on parts of the development data, and subword regularization. | ||
| convolutional | 370 | |
| 2020.sustainlp-1.21 We compare a classical CNN architecture for sequence classification involving several ***** convolutional ***** and max-pooling layers against a simple model based on weighted finite state automata (WFA). | ||
| 2020.acl-main.379 In addition, we introduce a new variation of a discriminative reranker based on graph ***** convolutional ***** networks (GCNs). | ||
| 2020.coling-main.266 First, we introduce graph ***** convolutional ***** networks to explicitly encode multiple heterogeneous dependency parse trees. | ||
| E17-1091 We contrast linear models, gradient boosted trees (GBTs) and ***** convolutional ***** neural networks (CNNs), and show that GBTs and CNNs yield the highest gains in error reduction. | ||
| 2019.iwslt-1.12 We used a slightly altered Transformer architecture with standard ***** convolutional ***** layer preparing the audio input to Transformer | ||
| generated | 370 | |
| D19-5809 The experiments using the CoQA dataset demonstrate that the quality of ***** generated ***** questions greatly improves if the question foci and the question patterns are correctly identified. | ||
| 2020.coling-main.452 Existing methods mainly focus on enhancing document representations, with little attention paid to the answer information, which may result in the ***** generated ***** question not matching the answer type and being answer-irrelevant. | ||
| 2020.wmt-1.96 Following the recent recommendations for a responsible use of GPUs for NLP research, we include an estimation of the ***** generated ***** CO2 emissions, based on the power consumed for training the MT systems. | ||
| 2020.acl-main.101 Most existing methods ignore the faithfulness between a ***** generated ***** text description and the original table, leading to ***** generated ***** information that goes beyond the content of the table. | ||
| 2020.nlpcss-1.7 A subjective evaluation with 80 participants demonstrated that the ***** generated ***** biased news is generally fluent, and a bias evaluation with 24 participants demonstrated that the bias (left or right) is usually evident in the ***** generated ***** articles and can be easily identified | ||
| syntax | 370 | |
| 1993.iwpt-1.27 In particular we adopt a systemic functional ***** syntax ***** as the basis for implementing a chart based probabilistic incremental parser for a non-trivial subset of English. | ||
| 2020.lrec-1.752 Predictably, performance was twice as good in tweets with standard orthography than in tweets with spelling/casing irregularities or lack of sentence separation, the effect being more marked for morphology than for ***** syntax *****. | ||
| D18-2002 TRANX uses a transition system based on the abstract ***** syntax ***** description language for the target MR, which gives it two major advantages: (1) it is highly accurate, using information from the ***** syntax ***** of the target MR to constrain the output space and model the information flow, and (2) it is highly generalizable, and can easily be applied to new types of MR by just writing a new abstract ***** syntax ***** description corresponding to the allowable structures in the MR. | ||
| P17-1105 The outputs are represented as abstract ***** syntax ***** trees (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. | ||
| L10-1335 POS tagging, ***** syntax *****, pragmatics) including a toolkit used to perform complex queries across speech and text labels. | ||
| dependency | 370 | |
| C18-1044 In this paper, we propose a novel text matching network (TMN) that encodes the discourse units and the paragraphs by combining Bi-LSTM and CNN to capture both global ***** dependency ***** information and local n-gram information. | ||
| 2000.iwpt-1.43 We describe a new model for ***** dependency ***** structure analysis. | ||
| L16-1452 The proposed methods incorporate the type information of the ***** dependency ***** relations for sentence similarity calculation. | ||
| P18-2069 However, most current approaches have difficulty scaling up with domains because of the ***** dependency ***** of the model parameters on the dialogue ontology. | ||
| 2021.ranlp-1.172 To address these issues, we propose global positional encoding for ***** dependency ***** tree, a new scheme that facilitates syntactic relation modeling between any two words with keeping exactness and without immediate neighbor constraint | ||
| commonsense | 369 | |
| D19-6002 An ablation study shows that language models and semantic similarity models are complementary approaches to ***** commonsense ***** reasoning, and HNN effectively combines the strengths of both. | ||
| D19-1625 Our dataset will serve as a useful testbed for future research in ***** commonsense ***** reasoning, especially as it relates to adjectives and objects | ||
| P19-1470 We posit that an important step toward automatic ***** commonsense ***** completion is the development of generative models of ***** commonsense ***** knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse ***** commonsense ***** descriptions in natural language. | ||
| 2020.findings-emnlp.327 The test suite consists of three test sets, covering lexical and contextless/contextual syntactic ambiguity that requires ***** commonsense ***** knowledge to resolve. | ||
| 2021.deelio-1.2 We make use of pre-trained language models which we refine by fine-tuning them on specifically prepared corpora that we enriched with implicit information, and by constraining them with relevant concepts and connecting ***** commonsense ***** knowledge paths | ||
| graphs | 368 | |
| 2020.coling-main.46 Experimental results on multiple benchmark knowledge ***** graphs ***** show that the proposed approach outperforms existing state-of-the-art models for link prediction. | ||
| N18-1037 We begin by introducing an alternative but equivalent edge-centric view of scene ***** graphs ***** that connect to dependency parses. | ||
| 2021.naacl-main.287 Thus, in our work, we address a new and challenging problem of generating multiple proof ***** graphs ***** for reasoning over natural language rule-bases. | ||
| 2021.dash-1.1 Our system generates knowledge ***** graphs ***** from the articles mentioned in the template, which we then process using Wikidata and machine learning algorithms. | ||
| D19-1194 Also, we propose a preliminary model that selects an output from two networks at each time step: a sequence-to-sequence model (Seq2Seq) and a multi-hop reasoning model, in order to support dynamic knowledge ***** graphs ***** | ||
| labeled | 366 | |
| W19-1908 We establish a new state-of-the-art result for the task, 0.684F for in-domain (0.055-point improvement) and 0.565F for cross-domain (0.018-point improvement), by fine-tuning BERT and pre-training domain-specific BERT models on sentence-agnostic temporal relation instances with WordPiece-compatible encodings, and augmenting the ***** labeled ***** data with automatically generated “silver” instances. | ||
| 2021.emnlp-main.727 In this paper, we consider the unsupervised cross-lingual transfer for the ABSA task, where only ***** labeled ***** data in the source language is available and we aim at transferring its knowledge to the target language having no ***** labeled ***** data. | ||
| 2020.iwpt-1.13 The parsing model trained on the revised corpus shows a significant improvement of 3.0% in ***** labeled ***** attachment score over the model trained on the previous corpus. | ||
| 2020.findings-emnlp.269 Detection of some types of toxic language is hampered by extreme scarcity of ***** labeled ***** training data. | ||
| 2020.acl-demos.42 Successfully training a deep neural network demands a huge corpus of ***** labeled ***** data | ||
| natural | 366 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual ***** natural ***** language understanding tasks. | ||
| 2020.repl4nlp-1.24 We highlight that on several tasks while such perturbations are ***** natural *****, state of the art trained models are surprisingly brittle. | ||
| 2020.winlp-1.17 In the following, we present a system for assisted typing in LS whose accuracy and speed is largely due to the deployment of real time ***** natural *****-language processing enabling efficient prediction and context-sensitive grammar support. | ||
| 2021.naacl-main.342 In the pursuit of ***** natural ***** language understanding, there has been a long standing interest in tracking state changes throughout narratives. | ||
| W89-0222 The probabilities provide a ***** natural ***** mechanism for exploring more common grammatical constructions first. | ||
| semantically | 361 | |
| D19-1238 We then manually validated PCG, finding that 67% of the causation semantic frame arguments present in the news corpus were directly connected in the PCG, the remaining being connected through a ***** semantically ***** relevant intermediate node. | ||
| 2021.emnlp-main.53 An NLG response is considered acceptable if it is both ***** semantically ***** correct and grammatical. | ||
| N19-1346 This paper introduces a new task – Chinese address parsing – the task of mapping Chinese addresses into ***** semantically ***** meaningful chunks. | ||
| N18-1136 Recognizing that even correct translations are not always ***** semantically ***** equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. | ||
| W19-8607 Given a pool of units for any unseen topic-stance pair, the model selects a set of unit types according to a basic rhetorical strategy (logos vs. pathos), arranges the structure of the types based on the units' argumentative roles, and finally “phrases” an argument by instantiating the structure with ***** semantically ***** coherent units from the pool | ||
| dependencies | 361 | |
| 2020.acl-main.599 In this way, we can model the ***** dependencies ***** between the two-grained answers to provide evidence for each other. | ||
| N19-1133 Although Transformer has achieved great successes on many NLP tasks, its heavy structure with fully-connected attention connections leads to ***** dependencies ***** on large training data. | ||
| P19-1228 In contrast to traditional formulations which learn a single stochastic grammar, our context-free rule probabilities are modulated by a per-sentence continuous latent variable, which induces marginal ***** dependencies ***** beyond the traditional context-free assumptions. | ||
| W18-3509 We formulate emotion detection in dialogues as a sequence labeling problem to capture the ***** dependencies ***** among labels. | ||
| L10-1387 Spanish FreeLing Dependency Grammar, named EsTxala, provides deep and robust parse trees, solving attachments for any structure and assigning syntactic functions to ***** dependencies ***** | ||
| extractive summarization | 361 | |
| W19-8909 An experimental evaluation on the MultiLing 2015 MSS dataset illustrates that semantic information can introduce benefits to the ***** extractive summarization ***** process in terms of F1, ROUGE-1 and ROUGE-2 scores, with LSA-based post-processing introducing the largest improvements. | ||
| 2021.sdp-1.11 The sentence labeling module of our method is based on SummaRuNNer, a neural sequence model for ***** extractive summarization *****. | ||
| 2021.newsum-1.14 To indicate SUBSUME's usefulness, we explore a collection of baseline algorithms for subjective ***** extractive summarization ***** and show that (i) as expected, example-based approaches better capture subjective intents than query-based ones, and (ii) there is ample scope for improving upon the baseline algorithms, thereby motivating further research on this challenging problem. | ||
| 2020.acl-main.552 Instead of following the commonly used framework of extracting sentences individually and modeling the relationship between sentences, we formulate the ***** extractive summarization ***** task as a semantic text matching problem, in which a source document and candidate summaries will be (extracted from the original text) matched in a semantic space. | ||
| W17-2307 We make use of ***** extractive summarization ***** techniques to address this task and experiment with different biomedical ontologies and various algorithms including agglomerative clustering, Maximum Marginal Relevance (MMR) and sentence compression | ||
| verbs | 358 | |
| 1998.amta-papers.38 We verify this by demonstrating that ***** verbs ***** with similar argument structure as encoded in Lexical Conceptual Structure (LCS) are rarely synonymous in WordNet. | ||
| 2019.gwc-1.23 This paper describes our project on Japanese compound ***** verbs *****. | ||
| L08-1597 AnCora-Verb-Es contains a total of 1,965 different ***** verbs ***** corresponding to 3,671 senses and AnCora-Verb-Ca contains 2,151 ***** verbs ***** and 4,513 senses. | ||
| 2020.lrec-1.384 These candidate ***** verbs ***** have been manually verified and annotation of their reflexive and reciprocal constructions has been integrated into the valency lexicon of Czech ***** verbs ***** VALLEX. | ||
| W19-6135 The exercise allows learners of the aforementioned languages to train their knowledge of particle ***** verbs ***** receiving clues from the exercise application | ||
| pretraining | 355 | |
| 2020.winlp-1.19 We define an experimental setup in which we analyze correlations between language model perplexity on specific clusters and downstream NLP task performances during ***** pretraining *****. | ||
| 2021.acl-long.243 We find that while the ***** pretraining ***** data size is an important factor, a designated monolingual tokenizer plays an equally important role in the downstream performance. | ||
| 2021.sustainlp-1.16 Recent work has explored approaches to adapt pretrained language models to new domains by incorporating additional ***** pretraining ***** on domain-specific corpora and task data. | ||
| 2020.emnlp-main.152 In-depth analysis shows that 1) ***** pretraining ***** schemes could further enhance our model; 2) the two-pass mechanism indeed remedies the uncoordinated slots. | ||
| 2021.emnlp-main.249 We further validate our methods using smaller models, showing that ***** pretraining ***** a model with 41% of the BERT-BASE's parameters, BERT-MEDIUM results in only a 1% drop in GLUE scores with our best objective | ||
| beam search | 354 | |
| 2020.findings-emnlp.406 Structured prediction is often approached by training a locally normalized model with maximum likelihood and decoding approximately with ***** beam search *****. | ||
| W18-6563 We observe that ***** beam search ***** heuristics for termination seem to override the model's knowledge of what a good stopping point is. | ||
| 2020.emnlp-main.738 It firstly estimates the input data's supportiveness for each target word with an estimator and then applies a supportiveness adaptor and a rebalanced ***** beam search ***** to harness the over-generation problem in the training and generation phases respectively. | ||
| 2021.acl-long.563 Experiments on four WMT directions show that our discriminative reranking approach is effective and complementary to existing generative reranking approaches, yielding improvements of up to 4 BLEU over the ***** beam search ***** output. | ||
| 2021.emnlp-main.662 Our aim is to establish whether ***** beam search ***** can be replaced by a more powerful metric-driven search technique. | ||
| encoding | 353 | |
| W19-8635 We further demonstrate that ***** encoding ***** both local and global prediction contexts yields another considerable performance boost. | ||
| 2021.naacl-main.83 However, pretrained models are underexplored in the existing work because they do not generate individual vector representations for text or labels, making it unintuitive to combine them with conventional graph ***** encoding ***** methods. | ||
| L06-1164 The grammatical formalism we make use of is Head-driven Phrase Structure Grammar, which offers one of the most comprehensive frames of ***** encoding ***** various types of linguistic information for lexical items. | ||
| K19-1016 We propose a method called reverse mapping bytepair ***** encoding *****, which maps named-entity information and other word-level linguistic features back to subwords during the ***** encoding ***** procedure of bytepair ***** encoding ***** (BPE). | ||
| C16-1234 Also, we hypothesize that ***** encoding ***** this linguistic prior in the Subword-LSTM architecture leads to the superior performance | ||
| outputs | 352 | |
| W18-6450 The collected scores were evaluated in terms of system-level correlation (how well each metric's scores correlate with WMT18 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in judging the quality of a particular sentence relative to alternate ***** outputs *****). | ||
| W19-5352 We have manually checked the ***** outputs ***** and identified types of translation errors that are relevant to document-level translation. | ||
| P19-1604 We report improvements obtained over the state-of-the-art on the SQuAD dataset according to automated metrics (BLEU, ROUGE), as well as qualitative human assessments of the system ***** outputs *****. | ||
| 2020.emnlp-main.446 To mitigate these vulnerabilities, we propose a defense that modifies translation ***** outputs ***** in order to misdirect the optimization of imitation models. | ||
| 2020.autosimtrans-1.2 In our method, the existing speech translation model is considered as a Generator to gain a target language output, and another neural Discriminator is used to guide the distinction between ***** outputs ***** of speech translation model and true target monolingual sentences | ||
| Syntactic | 351 | |
| 2020.acl-demos.7 ***** Syntactic ***** dependencies can be predicted with high accuracy, and are useful for both machine-learned and pattern-based information extraction tasks. | ||
| 2020.iwpt-1.6 ***** Syntactic ***** surprisal has been shown to have an effect on human sentence processing, and can be predicted from prefix probabilities of generative incremental parsers. | ||
| L14-1631 ***** Syntactic ***** parsing of speech transcriptions faces the problem of the presence of disfluencies that break the syntactic structure of the utterances. | ||
| W17-6310 ***** Syntactic ***** annotation is costly and not available for the vast majority of the world's languages. | ||
| 2021.acl-long.344 ***** Syntactic ***** information, especially dependency trees, has been widely used by existing studies to improve relation extraction with better semantic guidance for analyzing the context information associated with the given entities. | ||
| speech recognition | 350 | |
| 2020.coling-main.314 We introduce dual-decoder Transformer, a new model architecture that jointly performs automatic ***** speech recognition ***** (ASR) and multilingual speech translation (ST). | ||
| 2020.acl-main.215 Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via automatic ***** speech recognition *****. | ||
| L04-1263 These databases were used to train and test ***** speech recognition ***** systems applied in a multilingual telephone-based prototype hotel booking system. | ||
| 2021.winlp-1.1 We present a state-of-the-art automatic ***** speech recognition ***** (ASR) model for Fon, and a benchmark ASR model result for Igbo. | ||
| W19-3603 This work presents a ***** speech recognition ***** model for the Tigrinya language. A deep neural network is used to build the recognition model. | ||
| WMT | 349 | |
| C18-2019 This is the software used to run the yearly evaluation campaigns for shared tasks at the ***** WMT ***** Conference on Machine Translation. | ||
| 2020.wmt-1.47 This paper describes the participation of the NLP research team of the IPN Computer Research center in the ***** WMT ***** 2020 Similar Language Translation Task. | ||
| 2020.wmt-1.111 This paper describes the Alibaba Machine Translation Group submissions to the ***** WMT ***** 2020 Shared Task on Parallel Corpus Filtering and Alignment. | ||
| 2021.acl-long.507 Further, we evaluate on competitive translation benchmarks such as ***** WMT ***** and WAT | ||
| 2021.wmt-1.39 In this paper, we discuss the various techniques that we used to implement the Russian-Chinese machine translation system for the Triangular MT task at ***** WMT ***** 2021. | ||
| deep learning | 349 | |
| 2020.starsem-1.19 Obfuscation can, however, be thought of as the construction of adversarial examples to attack author identification, suggesting that the ***** deep learning ***** architectures used for adversarial attacks could have application here. | ||
| 2020.smm4h-1.13 Our approaches relied on a combination of traditional machine learning and ***** deep learning ***** models. | ||
| W18-6223 This work investigates the ability of ***** deep learning ***** architectures to build an accurate and robust model for suicidal ideation detection and compares their performance with standard baselines in text classification problems. | ||
| W18-6247 Our inference results indicate the feasibility of using ***** deep learning ***** based verbal content representation in inferring hirability scores from online conversational video resumes. | ||
| S18-1068 We first investigate several traditional Natural Language Processing (NLP) features, and then design several ***** deep learning ***** models. | ||
| estimation | 348 | |
| 2020.eamt-1.4 The Predictor-Estimator framework for quality ***** estimation ***** (QE) is commonly used for its strong performance. | ||
| 2021.acl-long.436 HMCEval includes a model confidence ***** estimation ***** module to estimate the confidence of the predicted sample assignment, and a human effort ***** estimation ***** module to estimate the human effort should the sample be assigned to human evaluation, as well as a sample assignment execution module that finds the optimum assignment solution based on the estimated confidence and effort. | ||
| L16-1473 In absence of information on previous shows, Twitter-based counts may be a viable alternative to classic ***** estimation ***** methods for TV ratings. | ||
| W19-5401 The task includes ***** estimation ***** at three granularity levels: word, sentence and document. | ||
| W16-4124 The article presents results of entropy rate ***** estimation ***** for human languages across six languages by using large, state-of-the-art corpora of up to 7.8 gigabytes | ||
| information retrieval | 348 | |
| 2021.humeval-1.9 Our contributions include the annotated dataset that we make publicly available and the proposal of Success Rate @k as an evaluation metric that is more appropriate than the traditional QA's and ***** information retrieval *****'s metrics. | ||
| 2020.wnut-1.15 We evaluate the approach by comparing the results to TF-IDF using the discounted cumulative gain metric with human annotations, finding our method outperforms TF-IDF on ***** information retrieval *****. | ||
| L16-1583 Ranking is used for a wide array of problems, most notably ***** information retrieval ***** (search). | ||
| L06-1475 It is a central component in the cross-lingual ***** information retrieval ***** (CLIR) system CINDOR (Conceptual INterlingua for DOcument Retrieval). | ||
| Q16-1003 Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, ***** information retrieval ***** or question answering. | ||
| contexts | 338 | |
| L14-1331 The typology concerns the forms, the lemmas and the POS involved in erroneous chunks, and in the surrounding ***** contexts *****. | ||
| S18-2008 (3) How does the entropy of concrete and abstract word ***** contexts ***** differ? | ||
| 2021.sigdial-1.29 Current TOIAs exist in niche ***** contexts ***** involving high production costs. | ||
| L14-1665 The word ***** contexts ***** are used as a basis for extracting multiword expressions and constructing thematic chains. | ||
| 2020.emnlp-main.20 Like BERT, it is a conditional generative model of tokens given their ***** contexts ***** | ||
| dialogue state | 337 | |
| W19-5910 To support this argument, the research presented in this paper is structured into three stages: (i) analyzing variable dependencies in dialogue data; (ii) applying an energy-based methodology to model ***** dialogue state ***** tracking as a structured prediction task; and (iii) evaluating the impact of inter-slot relationships on model performance. | ||
| 2021.emnlp-main.87 We test our approach on the cross-lingual ***** dialogue state ***** tracking task for the parallel MultiWoZ (English - Chinese, Chinese - English) and Multilingual WoZ (English - German, English - Italian) datasets. | ||
| W19-4109 To this end, we present an energy-based approach to ***** dialogue state ***** tracking as a structured classification task. | ||
| W19-5924 In order to overcome these limitations and to provide such an approach, we give a logical analysis of the “intent+slot” dialogue setting using a modal logic of intention and including a more expansive notion of “***** dialogue state *****”. | ||
| D19-1125 We find that by leveraging un-annotated data instead, the amount of turn-level annotations of ***** dialogue state ***** can be significantly reduced when building a neural dialogue system. | ||
| evaluation | 336 | |
| 2020.emnlp-main.751 We find that conclusions about ***** evaluation ***** metrics on older datasets do not necessarily hold on modern datasets and systems. | ||
| W17-2625 We empirically show that, our skip-thought neighbor model performs as well as the skip-thought model on ***** evaluation ***** tasks. | ||
| 2020.aespen-1.1 The workshop attracted research papers related to ***** evaluation ***** of machine learning methodologies, language resources, material conflict forecasting, and a shared task participation report in the scope of socio-political event information collection. | ||
| S17-2165 A total of 16 teams participated in ***** evaluation ***** scenario 1 (subtasks A, B, and C), with only 7 teams competing in all sub-tasks. | ||
| N18-1097 While robust ***** evaluation *****s are needed to drive further progress, so far it is unclear which ***** evaluation ***** approaches are suitable | ||
| languages | 334 | |
| L14-1520 The multilingual PPDB has over a billion paraphrase pairs in total, covering the following ***** languages *****: Arabic, Bulgarian, Chinese, Czech, Dutch, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, and Swedish. | ||
| 2021.wmt-1.49 This paper describes the submission of LMU Munich to the WMT 2021 multilingual machine translation task for small track #1, which studies translation between 6 ***** languages ***** (Croatian, Hungarian, Estonian, Serbian, Macedonian, English) in 30 directions. | ||
| 2020.coling-main.387 We focus on 6 ***** languages *****: French, Spanish, Italian, Portuguese, Romanian, and Turkish. | ||
| 2020.wat-1.4 New things are being created and new words are constantly being added to ***** languages ***** worldwide. | ||
| 2020.vardial-1.10 We explore English and German as source ***** languages *****, different sizes and types of training corpora, as well as bilingual and multilingual systems | ||
| translation performance | 333 | |
| W19-5319 We conduct an in-depth evaluation of the ***** translation performance ***** of different models, highlighting the trade-offs between methods of sharing decoder parameters. | ||
| 2021.wmt-1.53 Also, model averaging is used to further improve the ***** translation performance ***** based on these systems. | ||
| 2021.emnlp-main.263 Experimental results show that BiT pushes the SOTA neural machine ***** translation performance ***** across 15 translation tasks on 8 language pairs (data sizes range from 160K to 38M) significantly higher. | ||
| W17-4812 We then conduct evaluation experiments to test the translation of implicit connectives and whether representing implicit connectives explicitly in the source language can significantly improve the final ***** translation performance *****. | ||
| 2021.mtsummit-research.1 Low-resource Multilingual Neural Machine Translation (MNMT) is typically tasked with improving the ***** translation performance ***** on one or more language pairs with the aid of high-resource language pairs. | ||
| domains | 331 | |
| 2021.inlg-1.19 We show that our task is practical, feasible but challenging for state-of-the-art Transformer models, and that our methods can be readily deployed for various other datasets and ***** domains ***** with decent zero-shot performance. | ||
| 2020.emnlp-main.412 The concept of Dialogue Act (DA) is universal across different task-oriented dialogue ***** domains ***** - the act of “request” carries the same speaker intention whether it is for restaurant reservation or flight booking. | ||
| 2020.lrec-1.613 We report results on 20 ***** domains ***** (all possible pairs) using 11 similarity metrics. | ||
| D19-1165 Fine-tuning pre-trained Neural Machine Translation (NMT) models is the dominant approach for adapting to new languages and ***** domains *****. | ||
| D17-1038 We show the importance of complementing similarity with diversity, and that learned measures are, to some degree, transferable across models, ***** domains *****, and even tasks | ||
| interpretable | 327 | |
| 2020.acl-main.475 Though the model does not analyze their votes or political affiliations, the TBIP separates lawmakers by party, learns ***** interpretable ***** politicized topics, and infers ideal points close to the classical vote-based ideal points. | ||
| 2020.emnlp-main.236 Such alignment-based approaches are both intuitive and ***** interpretable *****; however, they are empirically inferior to the simple cosine similarity between general-purpose sentence vectors. | ||
| D19-3001 The system is ***** interpretable ***** and user friendly and does not require labeled training data, hence can be rapidly and cost-effectively used across different domains in applied setups. | ||
| 2021.emnlp-main.523 Recently, it has been argued that encoder-decoder models can be made more ***** interpretable ***** by replacing the softmax function in the attention with its sparse variants. | ||
| 2021.eacl-main.220 Despite advances in modeling techniques, abstractive summarization models still suffer from several key challenges: (i) layout bias: they overfit to the style of training corpora; (ii) limited abstractiveness: they are optimized to copying n-grams from the source rather than generating novel abstractive summaries; (iii) lack of transparency: they are not ***** interpretable ***** | ||
| Coreference | 324 | |
| 2021.acl-long.448 ***** Coreference ***** resolution is essential for natural language understanding and has long been studied in NLP. | ||
| 2021.ranlp-1.10 ***** Coreference ***** resolution is an NLP task to find out whether the set of referring expressions belongs to the same concept in discourse. | ||
| 2020.coling-main.507 ***** Coreference ***** resolution is the task of identifying all mentions in a text that refer to the same real-world entity. | ||
| 2021.crac-1.11 ***** Coreference ***** decisions among event mentions and among co-occurring entity mentions are highly interdependent, thus motivating joint inference. | ||
| 2020.emnlp-demos.27 ***** Coreference ***** annotation is an important, yet expensive and time-consuming, task, which often involves expert annotators trained on complex decision guidelines. | ||
| metadata | 322 | |
| L16-1072 In order to determine what aspects of an LR are useful for MT practitioners, a user study was made, providing a guide to the most relevant ***** metadata ***** and the most relevant quality criteria. | ||
| 2020.lrec-1.101 Our project aims to provide both access to the textual resources available on the web and the possibility of combining these resources with sources of ***** metadata ***** that can enrich the texts with useful information, lengthening the life and maintenance of the data itself. | ||
| C18-2006 Then, we extract Tweet Scholar Blocks indicating ***** metadata ***** of papers. | ||
| L08-1258 DrStorage has been integrated with an automatic language guesser and with an automatic keyword extractor: these ***** metadata ***** can be assigned automatically to documents, because the DrStorage server part has been modified so that ***** metadata ***** assignment takes place as documents are put in the repository. | ||
| L10-1179 All documents are available as UTF-8 encoded XML files with extensive ***** metadata ***** in Dublin Core standard | ||
| coreference resolution | 322 | |
| L14-1646 The ECB corpus is one of the data sets used for evaluation of the task of event ***** coreference resolution *****. | ||
| 2020.emnlp-main.686 This paper analyzes the impact of higher-order inference (HOI) on the task of ***** coreference resolution *****. | ||
| 2020.findings-emnlp.237 Applying this model to the Korean ***** coreference resolution *****, we significantly reduce the coreference linking search space. | ||
| L16-1027 Domain-general ***** coreference resolution ***** algorithms perform poorly on biomedical documents, because the cues they rely on such as gender are largely absent in this domain, and because they do not encode domain-specific knowledge such as the number and type of participants required in chemical reactions. | ||
| Q14-1037 We present a joint model of three core tasks in the entity analysis stack: ***** coreference resolution ***** (within-document clustering), named entity recognition (coarse semantic typing), and entity linking (matching to Wikipedia entities). | ||
| human | 322 | |
| 2021.acl-demo.41 To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under ***** human ***** evaluation. | ||
| P18-1153 Automatic and ***** human ***** evaluations show that our models are able to generate homographic puns of good readability and quality. | ||
| P18-1020 We describe a novel method for efficiently eliciting scalar annotations for dataset construction and system quality estimation by ***** human ***** judgments. | ||
| 2021.triton-1.5 Due to the wide-spread development of Machine Translation (MT) systems –especially Neural Machine Translation (NMT) systems– MT evaluation, both automatic and ***** human *****, has become more and more important as it helps us establish how MT systems perform. | ||
| C16-1095 In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing ***** human ***** linguist judgments. | ||
| multilingual bert | 322 | |
| 2020.pam-1.16 We introduce new methods both for bitext alignment, using optimal transport, and for direct cross-lingual projection, utilizing *****multilingual BERT*****. | ||
| 2021.bsnlp-1.11 Our system uses pre-trained *****multilingual BERT***** Language Model and is fine-tuned for six Slavic languages of this task on texts distributed by organizers. | ||
| 2021.eacl-main.264 In this study, we present decoding experiments for *****multilingual BERT***** across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. | ||
| 2020.lrec-1.223 In addition, we provide the scores of different popular models, including LSTM, ELMo, and *****multilingual BERT***** so that the NLP community can compare against state-of-the-art systems. | ||
| 2021.iwpt-1.9 The methodology consists in leveraging *****multilingual BERT***** self-attention model pretrained on large datasets to develop a multilingual multi-task model that can predict Universal Dependencies annotations for three African low-resource languages. | ||
| domain | 321 | |
| L12-1561 We present a methodology for analyzing cross-cultural similarities and differences using language as a medium, love as ***** domain *****, social media as a data source and 'Terms' and 'Topics' as cultural features. | ||
| L14-1485 This methodology can be applied to ***** domain ***** adaptation to deal with OOV problems. | ||
| 2020.coling-main.55 Due to privacy issues, dialog data is even scarcer in the health ***** domain *****. | ||
| P19-1408 Our experiments show that using minimum spans is in particular important in cross-dataset coreference evaluation, in which detected mention boundaries are noisier due to ***** domain ***** shift. | ||
| 2021.acl-long.432 In this way, we find that crowdsourcing could be highly similar to ***** domain ***** adaptation, and then the recent advances of cross-***** domain ***** methods can be almost directly applied to crowdsourcing | ||
| semantic textual similarity | 321 | |
| P17-2099 Such an unsupervised representation is empirically validated via ***** semantic textual similarity ***** tasks on 19 different datasets, where it outperforms the sophisticated neural network models, including skip-thought vectors, by 15% on average. | ||
| 2020.findings-emnlp.39 Natural language inference (NLI) and ***** semantic textual similarity ***** (STS) are key tasks in natural language understanding (NLU). | ||
| 2020.lrec-1.847 Monolingual phrase alignment is a fundamental problem in natural language understanding and also a crucial technique in various applications such as natural language inference and ***** semantic textual similarity ***** assessment. | ||
| W19-4606 We evaluate the various selection strategies extrinsically on several downstream applications: neural machine translation, part-of-speech tagging, and ***** semantic textual similarity *****. | ||
| S17-2001 *****Semantic Textual Similarity***** (STS) measures the meaning similarity of sentences. | ||
| namely | 320 | |
| W17-1317 In this paper, we devise a recipe for building large-scale Speech Corpora by harnessing Web resources, ***** namely ***** YouTube, other Social Media, Online Radio and TV. | ||
| W18-5612 We evaluated our model over two tasks, ***** namely *****, identifying section boundaries and identifying section types and orders. | ||
| W17-1806 We applied the proposed framework and the guidelines built on top of it to the annotation of written texts, ***** namely ***** news articles and tweets, thus producing annotated data for a total of over 36,000 tokens. | ||
| 2020.acl-main.732 Thus, to compensate for this loss, we investigate the use of multi-task learning to jointly optimize diacritic restoration with related NLP problems ***** namely ***** word segmentation, part-of-speech tagging, and syntactic diacritization. | ||
| 2020.parlaclarin-1.11 We present a case study focusing on lexical items associated with political parties in two diachronic corpora of Austrian German, ***** namely ***** a diachronic media corpus (AMC) and a corpus of parliamentary records (ParlAT), and measure the cross-temporal stability of lexical usage over a period of 20 years | ||
| hierarchical | 316 | |
| 2011.iwslt-evaluation.21 The rule set is used as a core module in our ***** hierarchical ***** model together with two other modules, namely, a basic reordering module and an optional gap phrase module. | ||
| C18-1203 The experimental results on two datasets demonstrate the effectiveness of the proposed ***** hierarchical ***** attention neural model. | ||
| J17-3001 Technically, hybrid grammars are related to synchronous grammars, where one grammar component generates linear structures and another generates ***** hierarchical ***** structures. | ||
| 1963.earlymt-1.28 Various relations lead to ***** hierarchical ***** systems of linguistic description | ||
| 2020.emnlp-main.52 Specifically, we devise two components, prototype enhanced retrospection and ***** hierarchical ***** distillation, to mitigate the adverse effects of semantic ambiguity and class imbalance, respectively. | ||
| semantic role labeling | 316 | |
| C16-1121 We present a successful collaboration of word embeddings and co-training to tackle in the most difficult test case of ***** semantic role labeling *****: predicting out-of-domain and unseen semantic frames. | ||
| W18-0530 We present a novel rule-based system for automatic generation of factual questions from sentences, using ***** semantic role labeling ***** (SRL) as the main form of text analysis. | ||
| 2021.newsum-1.11 Using keyphrase extraction and ***** semantic role labeling ***** (SRL), we find that SRL captures relevant information without overwhelming the model architecture. | ||
| Q19-1022 The auxiliary tasks provide syntactic information that is specific to ***** semantic role labeling ***** and are learned from training data (dependency annotations) without relying on existing dependency parsers, which can be noisy (e.g., on out-of-domain data or infrequent constructions). | ||
| 2020.acl-main.192 We perform extensive experiments to test this insight on 10 disparate tasks spanning dependency parsing (syntax), ***** semantic role labeling ***** (semantics), relation extraction (information content), aspect based sentiment analysis (sentiment), and many others, achieving performance comparable to state-of-the-art specialized models. | ||
| POS | 313 | |
| L14-1619 Second, ***** POS ***** information is transferred from the resourced language along translation pairs to the non-resourced language and used for tagging the corpus. | ||
| 2020.emnlp-main.488 For the supervised settings, we conduct extensive experiments on named entity recognition (NER), part of speech (***** POS *****) tagging and end-to-end target based sentiment analysis (E2E-TBSA) tasks. | ||
| 2020.lrec-1.482 To this end, we propose a new tagging scheme (with 36 ***** POS ***** tags) consisting of exclusive tags for special phenomena of conversational words, develop the annotation guideline, and manually annotate 16.310K sentences using this guideline | ||
| K17-3014 We present a novel neural network model that learns ***** POS ***** tagging and graph-based dependency parsing jointly. | ||
| L06-1132 This paper presents the results (1st phase) of the ongoing research in the Computational Linguistics Laboratory at Autónoma University of Madrid (LLI-UAM), aiming at the development of a multilingual parallel corpus (Arabic-Spanish-English) aligned at the sentence level and tagged at the ***** POS ***** level. | ||
| transformer | 310 | |
| 2020.emnlp-main.120 To that end, we propose a new approach to encourage learning of a contextualized sentence-level representation by shuffling the sequence of input sentences and training a hierarchical ***** transformer ***** model to reconstruct the original ordering. | ||
| 2020.conll-1.32 We compared both ***** transformer ***** and long short-term memory LMs to find that, contrary to humans, implicit causality only influences LM behavior for reference, not syntax, despite model representations that encode the necessary discourse information. | ||
| 2021.emnlp-main.137 We therefore explore the construction of a sentence-level autoencoder from a pretrained, frozen ***** transformer ***** language model. | ||
| 2021.starsem-1.10 In particular, ***** transformer ***** based state-of-the-art models achieve F1-scores of only 39.0. | ||
| 2020.acl-main.588 The idea is to allow the dependency graph to guide the representation learning of the ***** transformer ***** encoder and vice versa. | ||
| RNN | 308 | |
| 2020.winlp-1.16 We also compare the use of ***** RNN ***** and classic machine learning approaches for text classification, exploring the most used methods in the field. | ||
| D19-1376 Building on an ***** RNN ***** language model, PaLM adds an attention layer over text spans in the left context. | ||
| D18-1319 We also generalize the model for vocabulary sparsification to filter out unnecessary words and compress the ***** RNN ***** even further. | ||
| R19-1085 Since the Tunisian dialect is an under-resourced language of MSA and as there are a lot of resemblance between both languages, we suggest to investigate a recurrent neural network (***** RNN *****) for this dialect diacritization problem. | ||
| D18-1460 To construct the equivalence class, similar target hidden states are combined, leading to less ***** RNN ***** expansion operations on the target side and less softmax operations over the large target vocabulary | ||
| reading comprehension | 308 | |
| W19-5932 We formulate dialog state tracking as a ***** reading comprehension ***** task to answer the question what is the state of the current dialog? | ||
| D18-1238 This paper presents a new compositional encoder for ***** reading comprehension ***** (RC). | ||
| 2020.coling-main.235 The novel framework shows an interesting perspective on machine ***** reading comprehension ***** and cognitive science. | ||
| 2020.coling-main.329 Machine ***** reading comprehension ***** (MRC) is the task that asks a machine to answer questions based on a given context. | ||
| 2020.emnlp-main.549 Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine ***** reading comprehension ***** task, since it requires both natural language understanding and arithmetic computation. | ||
| sarcasm detection | 308 | |
| W17-5201 This talk will describe the approach, datasets and challenges in ***** sarcasm detection ***** using different forms of incongruity. | ||
| 2020.figlang-1.1 As the community working on computational approaches for ***** sarcasm detection ***** is growing, it is imperative to conduct benchmarking studies to analyze the current state-of-the-art, facilitating progress in this area. | ||
| D17-1169 Through emoji prediction on a dataset of 1246 million tweets containing one of 64 common emojis we obtain state-of-the-art performance on 8 benchmark datasets within emotion, sentiment and ***** sarcasm detection ***** using a single pretrained model. | ||
| 2020.findings-emnlp.124 Existing multi-modal ***** sarcasm detection ***** methods either simply concatenate the features from multi modalities or fuse the multi modalities information in a designed manner. | ||
| D17-3002 The tutorial is motivated by our continually evolving survey paper of ***** sarcasm detection *****, that is available on arXiv at: Joshi, Aditya, Pushpak Bhattacharyya, and Mark James Carman. | ||
| relational | 306 | |
| L06-1197 It was originally developed for the annotation of semantic roles in the frame semantics paradigm, but can be used for graphical annotation of treebanks with general ***** relational ***** information in a simple drag-and-drop fashion. | ||
| 2021.naacl-main.370 Additionally, we introduce a new binary classification task for English scalar adjective identification which examines the models' ability to distinguish scalar from ***** relational ***** adjectives. | ||
| 2020.acl-main.255 However, they make limited use of the vast amount of ***** relational ***** information encoded in Lexical Knowledge Bases (LKB). | ||
| I17-1086 Previous open Relation Extraction (open RE) approaches mainly rely on linguistic patterns and constraints to extract important ***** relational ***** triples from large-scale corpora. | ||
| P18-1047 Existing methods mainly focus on Normal class and fail to extract ***** relational ***** triplets precisely | ||
| Representation | 305 | |
| W18-1304 The Graphical Knowledge ***** Representation ***** which is output by the parser is inspired by the Abstract Knowledge ***** Representation *****, which separates out conceptual and contextual levels of representation that deal respectively with the subject matter of a sentence and its existential commitments. | ||
| C18-1173 *****Representation***** learning is a key issue for most Natural Language Processing (NLP) tasks. | ||
| D18-1002 Recent advances in *****Representation***** Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation. | ||
| 2021.naacl-main.293 *****Representation***** learning is widely used in NLP for a vast range of tasks. | ||
| 2020.acl-main.207 *****Representation***** learning is a critical ingredient for natural language processing systems. | ||
| text classification | 303 | |
| 2021.emnlp-main.643 Here, we introduce the application of balancing loss functions for multi-label ***** text classification *****. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, ***** text classification *****, and Word Sense Disambiguation. | ||
| 2020.findings-emnlp.185 In addition, we propose a weakly supervised pretraining, where labels for ***** text classification ***** are obtained automatically from an existing approach. | ||
| 2020.findings-emnlp.130 Based on our results we encourage using data balancing prior to training for ***** text classification ***** tasks. | ||
| W18-6223 This work investigates the ability of deep learning architectures to build an accurate and robust model for suicidal ideation detection and compares their performance with standard baselines in ***** text classification ***** problems. | ||
| abusive language | 303 | |
| 2020.acl-main.380 For example, texts containing some demographic identity-terms (e.g., “gay”, “black”) are more likely to be abusive in existing ***** abusive language ***** detection datasets. | ||
| W19-3508 We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of ***** abusive language *****, which are sometimes overlapping (racism, sexism, hate speech, offensive language, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. self-attention) is better for ***** abusive language ***** detection using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task. | ||
| 2021.socialnlp-1.10 In this paper, we investigate the effectiveness of several Unsupervised Domain Adaptation (UDA) approaches for the task of cross-corpora ***** abusive language ***** detection. | ||
| W19-4112 The agent also needs to detect and respond to ***** abusive language *****, sensitive topics and trolling behaviour of the users. | ||
| 2021.emnlp-demo.9 Furthermore, most of previous studies mainly focus on the detection of ***** abusive language *****, disregarding implicit offensiveness and underestimating a different degree of intensity. | ||
| paraphrase | 302 | |
| D19-5820 A lot of diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple ***** paraphrase ***** matching and entity typing to entity tracking and understanding the implications of the context. | ||
| 2021.wnut-1.32 We present new state-of-the-art benchmarks for ***** paraphrase ***** detection on all six languages in the Opusparcus sentential ***** paraphrase ***** corpus: English, Finnish, French, German, Russian, and Swedish. | ||
| W18-0906 Our model is trained for a binary classification of ***** paraphrase ***** candidates, and then used to predict graded ***** paraphrase ***** acceptability. | ||
| W17-1914 Our sense inventory is constructed using a clustering method which generates ***** paraphrase ***** clusters that are congruent with lexical substitution annotations in a development set. | ||
| W19-8709 These findings confirm the ability of NMT to produce correct ***** paraphrase *****s, which could also explain why BLEU is often considered as an inadequate metric to evaluate the performance of NMT systems. | ||
| statistical machine | 301 | |
| C16-1172 While most sentences are more accurate and fluent than translations by ***** statistical machine ***** translation (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. | ||
| 2012.iwslt-papers.9 Although ***** statistical machine ***** translation (SMT) has made great progress since it came into being, the translation of numerical and time expressions is still far from satisfactory. | ||
| 2014.amta-wptp.15 Such post-editing (e.g., PET [Aziz et al., 2012]) can be used practically for translation between European languages, which has a high performance in ***** statistical machine ***** translation. | ||
| L14-1176 This paper presents a systematic human evaluation of translations of English support verb constructions produced by a rule-based machine translation (RBMT) system (OpenLogos) and a ***** statistical machine ***** translation (SMT) system (Google Translate) for five languages: French, German, Italian, Portuguese and Spanish. | ||
| 2012.iwslt-papers.3 We present a novel approach for continuous space language models in *****statistical machine***** translation by using Restricted Boltzmann Machines (RBMs). | ||
| ie | 298 | |
| 2020.acl-main.461 While the majority of previous work has focused on the extractive setting, i.e., selecting fragments from input reviews to produce a summary, we let the model generate novel sentences and hence produce abstractive summaries. | ||
| L16-1313 This interest is due to the opportunity of using it in the framework of Ambient Assisted Living both for home automation (vocal command) and for call for help in case of distress situations, i.e. after a fall. | ||
| 2020.findings-emnlp.330 Attending to broader context at test time provides complementary information to pretraining (Gururangan et al., 2020), yields strong gains over equivalently parameterized models lacking such context, and performs best at recognizing entities with high TF-IDF scores (i.e., those that are important within a document). | ||
| N19-1082 In this paper, we introduce GraphIE, a framework that operates over a graph representing a broad set of dependencies between textual units (i.e. words or sentences). | ||
| W17-3535 Many data-to-text NLG systems work with data sets which are incomplete, ***** ie ***** some of the data is missing | ||
| dictionaries | 297 | |
| C16-3002 The goal of this tutorial is to introduce the proposed sentiment analysis technologies and datasets in the literature, and give the audience the opportunities to use resources and tools to process Chinese texts from the very basic preprocessing, i.e., word segmentation and part of speech tagging, to sentiment analysis, i.e., applying sentiment ***** dictionaries ***** and obtaining sentiment scores, through step-by-step instructions and a hand-on practice. | ||
| L06-1356 Market surveys have pointed out translators demand for integrated specialist ***** dictionaries ***** in translation memory tools which they could use in addition to their own compiled ***** dictionaries ***** or stored translated parts of text. | ||
| 1999.mtsummit-1.28 However, building ***** dictionaries ***** needs time and labor. | ||
| 2014.amta-researchers.7 Since naïve application of the system for N languages would require N(N - 1) ***** dictionaries *****, it is also evaluated using a pivot language, where only 2(N - 1) ***** dictionaries ***** would be required, with surprisingly similar performance. | ||
| 2020.vardial-1.6 However, these approaches require cross-lingual information such as seed ***** dictionaries ***** to train the model and find a linear transformation between the word embedding spaces. | ||
| detection task | 295 | |
| 2021.ranlp-1.88 In addition, considering emoji position can further improve the performance for the irony ***** detection task ***** compared to the emoji label prediction. | ||
| 2021.naacl-main.303 The stance ***** detection task ***** aims at detecting the stance of a tweet or a text for a target. | ||
| 2021.semeval-1.26 Semeval-2021, Task 5 - Toxic Spans Detection is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset and is the first language toxicity ***** detection task ***** dedicated to identifying the toxicity-level spans. | ||
| S18-1025 We test their performance on twitter affect ***** detection task ***** to determine which features produce the most informative representation of a sentence. | ||
| 2020.acl-main.279 We propose a Semi-supervIsed GeNerative Active Learning (SIGNAL) model to address the imbalance, efficiency, and text camouflage problems of Chinese text spam ***** detection task *****. | ||
| utterance | 292 | |
| D19-1097 Through stacking multiple CM-blocks, our CM-Net is able to alternately perform information exchange among specific memories, local contexts and the global ***** utterance *****, and thus incrementally enriches each other. | ||
| 2017.jeptalnrecital-recital.12 Using syntax/semantic interface of categorial grammar, this work can be used for deriving possible semantic readings of an incomplete ***** utterance *****. | ||
| 2021.inlg-1.11 We divide the task into two subtasks: ***** utterance ***** timing identification and ***** utterance ***** generation. | ||
| D19-5535 If an action can be separated into subactions, the reaction time of the systems can be improved through incremental processing of the user ***** utterance ***** and starting subactions while the ***** utterance ***** is still being uttered. | ||
| S19-2043 This paper describes our system for SemEval-2019 Task 3: EmoContext, which aims to predict the emotion of the third ***** utterance ***** considering two preceding ***** utterance *****s in a dialogue | ||
| polarity | 291 | |
| D17-1059 The results are enhanced further with a new dictionary-based technique and a novel ***** polarity ***** classification technique. | ||
| 2020.mmw-1.1 Based on these clusters, we propose a new semi-automatic model for SUMO attributes and their mapping to WordNet, which also includes ***** polarity ***** information. | ||
| 2020.lrec-1.616 Many ***** polarity ***** shifters can affect both positive and negative polar expressions, shifting them towards the opposing ***** polarity *****. | ||
| S18-1194 The experiments show that the first framework is more effective and sentiment ***** polarity ***** is useful. | ||
| W19-3601 It returned an average ***** polarity ***** agreement of 95% with other general purpose sentiment lexicons | ||
| algorithm | 291 | |
| 2020.iwpt-1.21 To accomplish the shared task on dependency parsing we explore the use of a linear transition-based neural dependency parser as well as a combination of three of them by means of a linear tree combination ***** algorithm *****. | ||
| 2020.ccl-1.107 The final knowledge representation is obtained by a weight-based disease prediction ***** algorithm *****, and it is fused with the text representation through a linear weighting method. | ||
| L10-1429 In the process of implementing a basic CCGbank conversion ***** algorithm *****, we reveal properties of Arabic grammar that interfere with conversion, such as subject topicalization, genitive constructions, relative clauses, and optional pronominal subjects. | ||
| I17-1092 We show that the results of our ***** algorithm ***** are mostly perceived similarly to human generated elaborateness and indirectness and can be used to adapt a conversation to the current user and situation. | ||
| 2021.acl-long.415 Once the explanation ***** algorithm ***** is distilled into an explainer network, it can be used to explain new instances | ||
| parse | 289 | |
| P19-1283 We then use our methods to correlate neural representations of English sentences with their constituency ***** parse ***** trees. | ||
| D19-1393 One novel characteristic of GSP is that it constructs a ***** parse ***** graph incrementally in a top-down fashion. | ||
| 2009.iwslt-evaluation.15 To integrate deep syntactic information, we propose the use of ***** parse ***** trees and semantic dependencies on English sentences described respectively by Head-driven Phrase Structure Grammar and Predicate-Argument Structures. | ||
| 2020.coling-main.325 As case studies, we investigate the degree to which a verb embedding encodes the verb's subject, a pronoun embedding encodes the pronoun's antecedent, and a full-sentence representation encodes the sentence's head word (as determined by a dependency ***** parse *****). | ||
| W19-1102 To match natural language stories with existing schemas, we first ***** parse ***** the stories into an underspecified variant of the logical form used by the schemas, which is suitable for most concrete stories | ||
| probabilistic | 289 | |
| D18-1174 Unlike previous ***** probabilistic ***** models for learning multi-sense word embeddings, disambiguated skip-gram is end-to-end differentiable and can be interpreted as a simple feed-forward neural network. | ||
| 2021.semspace-1.5 Using the framework of Contextuality-by-Default (CbD), we explore the ***** probabilistic ***** variants of these and show that CbD-contextuality is also possible. | ||
| 2006.amta-papers.2 The Joint Probability Model proposed by Marcu and Wong (2002) provides a ***** probabilistic ***** framework for modeling phrase-based statistical machine transla- tion (SMT). | ||
| W19-8605 Following an earlier work on abbreviations in English (Mahowald et al, 2013), we bring a ***** probabilistic ***** perspective to these questions, using both a behavioral and a corpus-based approach. | ||
| P19-1428 This paper proposes a novel AutoML strategy based on ***** probabilistic ***** grammatical evolution, which is evaluated on the health domain by facing the knowledge discovery challenge in Spanish text documents | ||
| latent | 289 | |
| 2021.emnlp-main.78 Sparse representations, either in symbolic or ***** latent ***** form, are more efficient with an inverted index. | ||
| D18-1108 Using the recently proposed SparseMAP inference, which retrieves a sparse distribution over ***** latent ***** structures, we propose a novel approach for end-to-end learning of ***** latent ***** structure predictors jointly with a downstream predictor. | ||
| S19-2021 One could consider the emotion of each dialogue turn to be independent, but in this paper, we introduce a hierarchical approach to classify emotion, hypothesizing that the current emotional state depends on previous ***** latent ***** emotions. | ||
| L12-1516 In order to handle the increasing amount of textual information today available on the web and exploit the knowledge ***** latent ***** in this mass of unstructured data, a wide variety of linguistic knowledge and resources (Language Identification, Morphological Analysis, Entity Extraction, etc.). | ||
| 2020.emnlp-main.73 As each refinement step only involves computation in the ***** latent ***** space of low dimensionality (we use 8 in our experiments), we avoid computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space | ||
| Dependencies | 287 | |
| N18-2109 The resulting parser outperforms the original version and achieves the best accuracy on the Stanford ***** Dependencies ***** conversion of the Penn Treebank among greedy transition-based parsers. | ||
| W19-6102 This approach is motivated specifically in the context of Universal ***** Dependencies *****, an effort to develop uniform and cross-lingually consistent treebanks across multiple languages. | ||
| 2021.disrpt-1.4 Key features of the presented approach are the formulation as a clause-level classification task, a language-independent feature inventory based on Universal ***** Dependencies ***** grammar, and composite-verb-form analysis. | ||
| 2020.emnlp-main.422 We apply our framework to all languages included in the Universal ***** Dependencies ***** project, with promising results | ||
| 2020.udw-1.15 We provide a linguistic discussion around decisions on how to appropriately label Irish MWEs using the compound, flat and fixed dependency relation labels within the framework of the Universal *****Dependencies***** annotation guidelines. | ||
| alignment | 286 | |
| P19-1148 We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns ***** alignment ***** jointly. | ||
| P19-1044 State-of-the-art models of lexical semantic change detection suffer from noise stemming from vector space ***** alignment *****. | ||
| L14-1123 We report on an evaluation of the tool (Web)MAUS (Kisler, 2012) on several language documentation corpora and discuss practical issues in the application of forced ***** alignment *****. | ||
| 2012.amta-papers.21 However, parallel data is quite inhomogeneous in many practical applications with respect to several factors like data source, ***** alignment ***** quality, appropriateness to the task, etc. | ||
| 2020.wmt-1.111 In the ***** alignment *****-filtering task, the extraction pipeline of bilingual sentence pairs includes the following steps: bilingual lexicon mining, language identification, sentence segmentation and sentence ***** alignment ***** | ||
| FrameNet | 285 | |
| L12-1241 We present the first results on semantic role labeling using the Swedish ***** FrameNet *****, which is a lexical resource currently in development. | ||
| 2020.lrec-1.306 Our work introduces 11 new manually crafted frames along with 9 existing ***** FrameNet ***** frames, all of which have been selected with fact-checking in mind. | ||
| L10-1598 Especially, the method of linking ***** FrameNet ***** frame elements with VerbaLex semantic roles is built using the information provided by the ontology of semantic types in ***** FrameNet *****. | ||
| L08-1262 We present an experiment in extracting collocations from the *****FrameNet***** corpus, specifically, support verbs such as direct in Environmentalists directed strong criticism at world leaders. | ||
| 2021.eacl-demos.19 Given a text document as input, our core system identifies spans of textual entity and event mentions with a *****FrameNet***** (Baker et al., 1998) parser. | ||
| crowdsourcing | 284 | |
| L12-1066 Our findings are based on lessons learned while developing and deploying Sentiment Quiz, a ***** crowdsourcing ***** application for creating sentiment lexicons (an essential component of most sentiment detection algorithms). | ||
| N18-3005 In this paper, we present a study of ***** crowdsourcing ***** methods for a user intent classification task in our deployed dialogue system. | ||
| D18-1367 First, we introduce HYPO, a corpus containing overstatements (or hyperboles) collected on the web and validated via ***** crowdsourcing *****. | ||
| L12-1452 We present a framework for the acquisition of sentential paraphrases based on ***** crowdsourcing *****. | ||
| N19-1150 For specialized domains, expert annotations may be prohibitively expensive; the alternative is to rely on ***** crowdsourcing ***** to reduce costs at the risk of introducing noise | ||
| summaries | 284 | |
| 2021.naacl-main.395 In this work we introduce a new corpus of parallel texts in English comprising technical and lay ***** summaries ***** of all published evidence pertaining to different clinical topics. | ||
| L06-1392 Annotations of Summary Content Units (SCUs) generate models referred to as pyramids which can be used to evaluate unseen human ***** summaries ***** or machine ***** summaries *****. | ||
| 2020.acl-main.457 Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated ***** summaries ***** commonly suffer from fabricated content, and are often found to be near-extractive. | ||
| 2020.coling-main.502 In this work, we present a model to generate e-commerce product ***** summaries *****. | ||
| L10-1062 We have collected 932 model ***** summaries ***** in English from existing image descriptions and machine translated these ***** summaries ***** into German | ||
| dialogues | 283 | |
| 2020.acl-main.133 The results of our experiments show that our model substantially (more than 20 accuracy points) outperforms its strong competitors on the DailyDialogue corpus, and performs on par with them on the SwitchBoard corpus for ranking ***** dialogues ***** concerning their coherence. | ||
| 1999.mtsummit-1.32 TDMT, a machine translation model, was developed by ATR-ITL to deal with ***** dialogues ***** in the travel domain. | ||
| 2021.acl-long.436 We assess the performance of HMCEval on the task of evaluating malevolence in ***** dialogues *****. | ||
| L14-1253 In this paper, we focus particularly on data collected during a dialogue to discuss the application of conversation analysis (CA) to signed ***** dialogues ***** and signed conversations. | ||
| D19-1508 In this paper, we first construct a dialogue symptom diagnosis dataset based on an online medical forum with a large amount of ***** dialogues ***** between patients and doctors | ||
| learning | 281 | |
| D17-2003 Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and ***** learning ***** of linguistics. | ||
| D19-1212 Multi-view ***** learning ***** algorithms are powerful representation ***** learning ***** tools, often exploited in the context of multimodal problems. | ||
| 2020.aacl-main.29 To resolve the cold start problem in training, we propose a method using a pseudo data generator which generates pseudo texts and KB triples for ***** learning ***** an initial model. | ||
| P19-1516 In this paper, we propose a neural network inspired multi- task ***** learning ***** framework that can simultaneously extract ADRs from various sources. | ||
| 2020.acl-main.588 The idea is to allow the dependency graph to guide the representation ***** learning ***** of the transformer encoder and vice versa. | ||
| alignments | 280 | |
| 2021.emnlp-main.665 With the advent of end-to-end deep learning approaches in machine translation, interest in word ***** alignments ***** initially decreased; however, they have again become a focus of research more recently. | ||
| L04-1166 This paper investigates a number of factors that may contribute to highly accurate forced ***** alignments ***** to support the rapid production of these multimodal corpora including the acoustic model, the match between the speech used for training the system and that to be force aligned, the amount of data used to train the ASR system, the availability of speaker adaptation, and the duration of alignment segments. | ||
| W16-4803 In contrast to the alignment-based methods, our method does not require explicit ***** alignments *****. | ||
| D19-1453 Finally, by incorporating IBM model ***** alignments ***** into our multi-task training, we report significantly better alignment accuracies compared to GIZA++ on three publicly available data sets. | ||
| P18-1037 We introduce a neural parser which treats ***** alignments ***** as latent variables within a joint probabilistic model of concepts, relations and ***** alignments ***** | ||
| simplification | 280 | |
| 2020.lrec-1.404 In this paper, we present a corpus for use in automatic readability assessment and automatic text ***** simplification ***** for German, the first of its kind for this language. | ||
| 2020.readi-1.2 Parallel monolingual resources are imperative for data-driven sentence ***** simplification ***** research. | ||
| 2020.aacl-srw.22 Previous studies in text ***** simplification ***** employ the weighted sum of sub-rewards from three perspectives: grammaticality, meaning preservation, and simplicity. | ||
| 2021.emnlp-main.500 An important task in NLP applications such as sentence ***** simplification ***** is the ability to take a long, complex sentence and split it into shorter sentences, rephrasing as necessary | ||
| 2021.wnut-1.1 In this work, we investigate the effect of text ***** simplification ***** in the task of question-answering using a comprehension context. | ||
| adversarial attack | 279 | |
| 2020.deelio-1.3 On average, the proposed defense improved the classification accuracy of the CNN and Bi-LSTM models by 41.30% and 55.66%, respectively, when tested under ***** adversarial attack *****s. | ||
| 2020.starsem-1.19 Obfuscation can, however, be thought of as the construction of adversarial examples to attack author identification, suggesting that the deep learning architectures used for ***** adversarial attack *****s could have application here. | ||
| 2021.calcs-1.19 Inspired by this phenomenon, we present two strong black-box ***** adversarial attack *****s (one word-level, one phrase-level) for multilingual models that push their ability to handle code-mixed sentences to the limit. | ||
| 2021.alta-1.12 In this work, we present an investigation of Grover's susceptibility to ***** adversarial attack *****s such as character-level and word-level perturbations. | ||
| 2020.challengehml-1.9 Further, We use these insights to craft ***** adversarial attack *****s which inflict significant damage to these systems with negligible change in meaning of the input questions. | ||
| morphology | 278 | |
| N18-1006 In our architecture, an additional ***** morphology ***** table is plugged into the model. | ||
| 2009.iwslt-evaluation.11 Specifically, we focus on 1) Cross-domain translation using MAP adaptation and unsupervised training, 2) Turkish morphological processing and translation, 3) improved Arabic ***** morphology ***** for MT preprocessing, and 4) system combination methods for machine translation. | ||
| 2020.lrec-1.879 With the high cost of manually labeling data for ***** morphology ***** and the increasing interest in low-resource languages, unsupervised morphological segmentation has become essential for processing a typologically diverse set of languages, whether high-resource or low-resource. | ||
| L06-1240 Previous modifications and enhancements attempted to capture more elegantly and concisely different aspects of the complex ***** morphology ***** of Arabic, finding theoretical grounding in Lexeme-Based Morphology. | ||
| 2020.lrec-1.752 Predictably, performance was twice as good in tweets with standard orthography than in tweets with spelling/casing irregularities or lack of sentence separation, the effect being more marked for ***** morphology ***** than for syntax. | ||
| language generation | 278 | |
| C16-1191 Natural ***** language generation ***** (NLG) is an important component of question answering (QA) systems which has a significant impact on system quality. | ||
| 2020.coling-main.420 Most current state-of-the art systems for generating English text from Abstract Meaning Representation (AMR) have been evaluated only using automated metrics, such as BLEU, which are known to be problematic for natural ***** language generation *****. | ||
| 2021.naacl-main.416 However, existing report generation systems, despite achieving high performances on natural ***** language generation ***** metrics such as CIDEr or BLEU, still suffer from incomplete and inconsistent generations. | ||
| 2020.acl-main.705 However, it remains an open question how to utilize BERT for ***** language generation *****. | ||
| N18-5020 The system architecture consists of several components including spoken language processing, dialogue management, ***** language generation *****, and content management, with emphasis on user-centric and content-driven design. | ||
| generation | 276 | |
| 2021.wanlp-1.17 By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response ***** generation *****. | ||
| 2020.emnlp-main.68 We experimentally confirm that training data filtered by the proposed method improves the quality of neural dialogue agents in response ***** generation *****. | ||
| 2021.alta-1.12 Grover is a model for both ***** generation ***** and detection of neural fake news. | ||
| 2021.naacl-main.476 (2) Word unit prediction constrains the word usage to impose strong lexical control during ***** generation *****. | ||
| P19-1538 To incorporate meta-words into ***** generation *****, we propose a novel goal-tracking memory network that formalizes meta-word expression as a goal in response ***** generation ***** and manages the ***** generation ***** process to achieve the goal with a state memory panel and a state controller | ||
| dialogue systems | 274 | |
| P17-1120 Recently emerged intelligent assistants on smartphones and home electronics (e.g., Siri and Alexa) can be seen as novel hybrids of domain-specific task-oriented spoken ***** dialogue systems ***** and open-domain non-task-oriented ones. | ||
| W18-1401 The challenge for computational models of spatial descriptions for situated ***** dialogue systems ***** is the integration of information from different modalities. | ||
| W17-5503 We test state of the art ***** dialogue systems ***** for their behaviour in response to user-initiated sub-dialogues, i.e. | ||
| 2020.acl-main.765 A narrative plays a different role than the context (i.e., previous utterances), which is generally used in current ***** dialogue systems *****. | ||
| 2021.codi-main.4 We describe a graphical Discourse-Driven Integrated Dialogue Development Environment (DD-IDDE) for spoken open-domain ***** dialogue systems *****. | ||
| abstract meaning representation | 273 | |
| 2020.emnlp-main.196 In the literature, the research on ***** abstract meaning representation ***** (AMR) parsing is much restricted by the size of human-curated dataset which is critical to build an AMR parser with good performance. | ||
| 2020.conll-shared.8 Among the five frameworks, we address only the ***** abstract meaning representation ***** framework and propose a joint state model for the graph-sequence iterative inference of (Cai and Lam, 2020) for a simplified graph-sequence inference. | ||
| W19-3320 Existing approaches such as semantic role labeling (SRL) and ***** abstract meaning representation ***** (AMR) still have features related to the peculiarities of the particular language. | ||
| 2021.eacl-main.129 We show that besides well-known issues from which such metrics suffer, an additional problem arises when applying these metrics for AMR-to-text evaluation, since an ***** abstract meaning representation ***** allows for numerous surface realizations. | ||
| Q19-1002 In this work, we study the usefulness of AMR (***** abstract meaning representation *****) on NMT. | ||
| stance detection | 271 | |
| 2021.naacl-main.303 The ***** stance detection ***** task aims at detecting the stance of a tweet or a text for a target. | ||
| 2020.socialnlp-1.5 We investigate whether pre-trained bidirectional transformers with sentiment and emotion information improve ***** stance detection ***** in long discussions of contemporary issues. | ||
| 2020.acl-main.291 In this paper, we proposed a Semantic-Emotion Knowledge Transferring (SEKT) model for cross-target ***** stance detection *****, which uses the external knowledge (semantic and emotion lexicons) as a bridge to enable knowledge transfer across different targets. | ||
| 2020.nlp4if-1.3 Given that ***** stance detection ***** can significantly aid in veracity prediction, this work focuses on boosting automated ***** stance detection *****, a task on which pre-trained models have been extremely successful on, as on several other tasks. | ||
| N19-4014 We present FAKTA which is a unified framework that integrates various components of a fact-checking process: document retrieval from media sources with various types of reliability, *****stance detection***** of documents with respect to given claims, evidence extraction, and linguistic analysis. | ||
| question | 270 | |
| 2021.naacl-main.160 We propose an interview assistant system to automatically, and in an objective manner, select an optimal set of technical ***** question *****s (from ***** question ***** banks) personalized for a candidate. | ||
| 2021.acl-long.318 Extensive experiments on four ***** question ***** answering datasets show that our method significantly outperforms previous learning methods in terms of task performance and is more effective in training models to produce correct solutions. | ||
| L08-1178 In the EU-funded project, QALL-ME, a domain-specific ontology was developed and applied for ***** question ***** answering in the domain of tourism, along with the assistance of two upper ontologies for concept expansion and reasoning. | ||
| N19-4013 In contrast to most ***** question ***** answering and reading comprehension models today, which operate over small amounts of input text, our system integrates best practices from IR with a BERT-based reader to identify answers from a large corpus of Wikipedia articles in an end-to-end fashion. | ||
| 2021.eacl-main.253 We also present an effective method to combine text and table-based predictions for ***** question ***** answering from full documents, obtaining significant improvements on the Natural Questions dataset (Kwiatkowski et al., 2019) | ||
| Speech | 269 | |
| 2020.lrec-1.813 We address transparency and fairness in spoken language systems by proposing a study about gender representation in speech resources available through the Open ***** Speech ***** and Language Resource platform. | ||
| L06-1289 ***** Speech ***** synthesis or text-to-speech (TTS) systems are currently available for a number of the world's major languages, but for thousands of other, unsupported, languages no such technology is available. | ||
| L08-1528 ***** Speech ***** synthesis by unit selection requires the segmentation of a large single speaker high quality recording. | ||
| 1999.mtsummit-1.16 ***** Speech ***** communication includes many important issues on natural language processing and they are related with desirable advanced speech translation systems. | ||
| 2020.lrec-1.778 ***** Speech ***** recognition has seen dramatic improvements in the last decade, though those improvements have focused primarily on adult speech. | ||
| memory network | 268 | |
| Q19-1012 On one of the hardest class of programs (comparative reasoning) with 5–10 steps, CIPITR outperforms NSM by a factor of 89 and ***** memory network *****s by 9 times. | ||
| D19-5811 We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a ***** memory network ***** to reason and predict an answer. | ||
| 2021.naacl-main.157 We then propose a ***** memory network ***** to generate personalized responses in dialogue that utilizes a novel mechanism of splitting memories: one for user profile meta attributes and the other for user-generated information like comment histories. | ||
| 2020.acl-main.313 To this end, we introduce a novel embedding model, named R-MeN, that explores a relational ***** memory network ***** to encode potential dependencies in relationship triples. | ||
| 2020.semeval-1.281 I present the system based on the architecture of bidirectional long short-term ***** memory network *****s (BiLSTM) concatenated with lexicon-based features and a social-network specific feature and then followed by two fully connected dense layers for detecting Turkish offensive tweets. | ||
| Domain | 266 | |
| W19-5920 ***** Domain ***** adaptation in natural language generation (NLG) remains challenging because of the high complexity of input semantics across domains and limited data of a target domain. | ||
| 2010.amta-papers.16 This paper presents a set of experiments on ***** Domain ***** Adaptation of Statistical Machine Translation systems. | ||
| 2021.naacl-main.147 ***** Domain ***** divergence plays a significant role in estimating the performance of a model in new domains. | ||
| W19-5009 ***** Domain ***** adaptation remains one of the most challenging aspects in the wide-spread use of Semantic Role Labeling (SRL) systems. | ||
| 2020.nlpmc-1.2 ***** Domain ***** Adaptation for Automatic Speech Recognition (ASR) error correction via machine translation is a useful technique for improving out-of-domain outputs of pre-trained ASR systems to obtain optimal results for specific in-domain tasks. | ||
| parallel | 266 | |
| 2020.iwslt-1.3 For simultaneous translation, we utilize a novel architecture that makes dynamic decisions, learned from ***** parallel ***** data, to determine when to continue feeding on input or generate output words. | ||
| I17-1030 While the recently introduced Newsela corpus has alleviated the first problem, simplifications still need to be learned directly from ***** parallel ***** text using black-box, end-to-end approaches rather than from explicit annotations. | ||
| 2008.amta-srw.1 Syntax-based approaches to statistical MT require syntax-aware methods for acquiring their underlying translation models from ***** parallel ***** data. | ||
| 2010.amta-papers.35 We embed this extended hybrid measure in a distributional paraphrasing technique, benefiting from both linguistic knowledge and independence from ***** parallel ***** texts | ||
| 2020.wmt-1.51 Finally, we make use of additional monolingual data by creating synthetic ***** parallel ***** data through back-translation. | ||
| preprocessing | 265 | |
| 2020.wnut-1.17 We describe these noise types and propose a ***** preprocessing ***** pipeline to denoise user's answers. | ||
| 2009.iwslt-evaluation.5 This year we worked on the Arabic-English and Turkish-English BTEC tasks with a special effort on linguistic ***** preprocessing ***** techniques involving morphological segmentation. | ||
| W19-4214 In two evaluations, we consistently outperform competitive unsupervised baselines and approach the performance of state-of-the-art supervised models trained on large amounts of data, providing evidence for the value of linguistic input during ***** preprocessing *****. | ||
| D19-1365 While a rule-based system is still a common ***** preprocessing ***** step for formality style transfer in the neural era, it could introduce noise if we use the rules in a naive way such as data ***** preprocessing *****. | ||
| 1998.amta-papers.15 EasyEnglish is used as a ***** preprocessing ***** step for machine-translating IBM manuals | ||
| tagging | 264 | |
| D17-1036 We introduce a novel neural easy-first decoder that learns to solve sequence ***** tagging ***** tasks in a flexible order. | ||
| 2020.emnlp-main.406 Yet, state-of-the-art models often rely on simple approaches to model the label space, e.g. bigram Conditional Random Fields (CRFs) in sequence ***** tagging *****. | ||
| 2021.dravidianlangtech-1.4 The proposed work tries to solve this by using bi-directional LSTMs along with language ***** tagging *****. | ||
| 2020.lrec-1.259 We carried on experiments using available datasets (e.g., from the Evalita shared tasks) on two sequence ***** tagging ***** tasks (i.e., named entities recognition and nominal entities recognition) and four classification tasks (i.e., lexical relations among words, semantic relations among sentences, sentiment analysis and text classification) | ||
| 2021.ranlp-1.42 Often, the LI and POS ***** tagging ***** tasks are interdependent in the code-mixing scenario. | ||
| annotating | 262 | |
| L10-1245 In this paper, we discuss the theoretical, sociolinguistic, methodological and technical objectives and issues of the French Creagest Project (2007-2012) in setting up, documenting and ***** annotating ***** a large corpus of adult and child French Sign Language (LSF) and of natural gestural language. | ||
| L12-1552 While ***** annotating ***** this richer tagset is more complicated than ***** annotating ***** the base tagset, it is much easier than ***** annotating ***** treebank data. | ||
| 2020.lrec-1.862 We describe a procedure for ***** annotating ***** low resource languages using Dragonfly that others can use, which we developed based on our experience ***** annotating ***** data in more than ten languages. | ||
| D18-1194 We present: (1) a form of decompositional semantic analysis designed to allow systems to target varying levels of structural complexity (shallow to deep analysis), (2) an evaluation metric to measure the similarity between system output and reference semantic analysis, (3) an end-to-end model with a novel ***** annotating ***** mechanism that supports intra-sentential coreference, and (4) an evaluation dataset on which our model outperforms strong baselines by at least 1.75 F1 score. | ||
| L12-1433 This data is, generally, collected by manual ***** annotating ***** SL video corpus | ||
| treebanks | 259 | |
| 2020.udw-1.8 Yet, the usages of the relations can be categorically different even for ***** treebanks ***** of the same language. | ||
| 2020.mwe-1.2 The resulting lexicon interoperates with dependency tree searching software so that instances can be quickly found within dependency ***** treebanks *****. | ||
| L16-1414 On the other hand, each valency-capable word in the ***** treebanks ***** is linked to a frame entry in the lexicon. | ||
| W16-4009 Historical ***** treebanks ***** tend to be manually annotated, which is not surprising, since state-of-the-art parsers are not accurate enough to ensure high-quality annotation for historical texts. | ||
| C16-1002 Various ***** treebanks ***** have been released for dependency parsing | ||
| automatic speech recognition | 259 | |
| 2020.coling-main.314 We introduce dual-decoder Transformer, a new model architecture that jointly performs ***** automatic speech recognition ***** (ASR) and multilingual speech translation (ST). | ||
| 2020.acl-main.215 Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via ***** automatic speech recognition *****. | ||
| 2021.winlp-1.1 We present a state-of-the-art ***** automatic speech recognition ***** (ASR) model for Fon, and a benchmark ASR model result for Igbo. | ||
| L14-1558 We also present an add-on for the GOS corpus, which enables its usage for ***** automatic speech recognition *****. | ||
| L06-1257 One of the most critical components in the process of building ***** automatic speech recognition ***** (ASR) capabilities for a new language is the lexicon, or pronouncing dictionary. | ||
| verb | 258 | |
| 1991.iwpt-1.9 The data, originally collected by Bach, Brown and Marslen-Wilson (1986), concern the comprehensibility of ***** verb ***** dependency constructions in Dutch and German: right-branching, center-embedded, and cross-serial dependencies of one to four levels deep. | ||
| W19-4733 Our model shows that Frisian ***** verb ***** cluster word orders are associated with different context features than Dutch ***** verb ***** orders, supporting the `learned borrowing' hypothesis. | ||
| W16-4120 Several multifactorial corpus studies of Dutch ***** verb ***** clusters have used other measures of processing complexity to show that this factor affects word order choice. | ||
| L12-1221 This paper outlines a proposal for encoding and describing ***** verb ***** phrase constructions in the knowledge base on the environment EcoLexicon, with the objective of helping translators in specialized text production. | ||
| L12-1088 We present a carefully designed dependency conversion of the German phrase-structure treebank TiGer that explicitly represents ***** verb ***** ellipses by introducing empty nodes into the tree. | ||
| discourse parsing | 258 | |
| W19-2703 Implicit discourse relation classification is one of the most challenging and important tasks in ***** discourse parsing *****, due to the lack of connectives as strong linguistic cues. | ||
| W19-2715 Segmentation is the first step in building practical discourse parsers, and is often neglected in ***** discourse parsing ***** studies. | ||
| 2021.acl-long.303 Identifying the atomic sentences within complex sentences is important for applications such as summarization, argument mining, discourse analysis, ***** discourse parsing *****, and question answering. | ||
| W19-0416 Implicit discourse relation classification is one of the most difficult steps in ***** discourse parsing *****. | ||
| P18-1210 We believe that these new results can both inform future progress in theoretical work on discourse coherence and lead to higher levels of performance in ***** discourse parsing ***** | ||
| convolutional network | 258 | |
| P19-1131 To tackle the joint type inference task, we propose a novel graph ***** convolutional network ***** (GCN) running on an entity-relation bipartite graph. | ||
| 2021.emnlp-main.663 Then, we incorporate both source and target graphs into the conventional Transformer architecture with graph ***** convolutional network *****s. | ||
| N19-1103 In this work we introduce ConvR, an adaptive ***** convolutional network ***** designed to maximize entity-relation interactions in a convolutional fashion. | ||
| 2020.acl-main.642 To simultaneously capture the relations between objects in an image and the syntactic dependency relations between words in a question, we propose a novel dual channel graph ***** convolutional network ***** (DC-GCN) for better combining visual and textual advantages. | ||
| 2021.emnlp-main.658 To this end, we propose a novel continuum model by extending the idea of neural ordinary differential equations (ODEs) to multi-relational graph ***** convolutional network *****s. | ||
| Treebank | 257 | |
| 2020.lrec-1.117 The treebank is obtained from the automated conversion of the Late Latin Charter ***** Treebank ***** 2 (LLCT2), originally in the Prague Dependency ***** Treebank ***** (PDT) style. | ||
| W17-2622 We evaluate our proposed architecture on the Penn ***** Treebank ***** language modeling task. | ||
| D18-1277 English part-of-speech taggers regularly make egregious errors related to noun-verb ambiguity, despite having achieved 97%+ accuracy on the WSJ Penn ***** Treebank ***** since 2002 | ||
| D19-1092 ***** Treebank ***** translation is a promising method for cross-lingual transfer of syntactic dependency knowledge. | ||
| P18-1252 ***** Treebank ***** conversion is a straightforward and effective way to exploit various heterogeneous treebanks for boosting parsing performance. | ||
| Identifying | 255 | |
| 2020.findings-emnlp.36 ***** Identifying ***** metaphors in text is very challenging and requires comprehending the underlying comparison. | ||
| 2020.acl-main.49 ***** Identifying ***** controversial posts on social media is a fundamental task for mining public sentiment, assessing the influence of events, and alleviating the polarized views | ||
| L14-1184 ***** Identifying ***** cognates is an interesting task with applications in numerous research areas, such as historical and comparative linguistics, language acquisition, cross-lingual information retrieval, readability and machine translation. | ||
| C16-1136 ***** Identifying ***** events of a specific type is a challenging task as events in texts are described in numerous and diverse ways. | ||
| 2020.lrec-1.768 ***** Identifying ***** irony in user-generated social media content has a wide range of applications; however to date Arabic content has received limited attention. | ||
| paper | 255 | |
| 2010.amta-papers.6 In this ***** paper *****, we present the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a standard phrase-based SMT system. | ||
| W17-1413 In the ***** paper ***** we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity recognition. | ||
| 2008.amta-papers.19 We also build a cascaded translation model that dynamically shifts translation units from phrase level to word and morpheme phrase levels. | ||
| S18-1073 This ***** paper ***** describes our approach to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. | ||
| I17-4035 In this ***** paper *****, we propose the use of an attention-based LSTM (AT-LSTM) model for these tasks. | ||
| Generating | 253 | |
| 2021.naacl-main.336 ***** Generating ***** metaphors is a challenging task as it requires a proper understanding of abstract concepts, making connections between unrelated concepts, and deviating from the literal meaning. | ||
| D17-1017 ***** Generating ***** captions for images is a task that has recently received considerable attention. | ||
| D19-1314 ***** Generating ***** text from graph-based data, such as Abstract Meaning Representation (AMR), is a challenging task due to the inherent difficulty in how to properly encode the structure of a graph with labeled edges. | ||
| P19-1208 ***** Generating ***** keyphrases that summarize the main points of a document is a fundamental task in natural language processing. | ||
| 2020.lantern-1.2 ***** Generating ***** images from textual descriptions has recently attracted a lot of interest. | ||
| sequence tagging | 253 | |
| Q18-1030 In this paper, we present a ***** sequence tagging ***** framework and apply it to word segmentation for a wide range of languages with different writing systems and typological characteristics. | ||
| 2021.ecnlp-1.10 Our model jointly learns the similarity between attributes of the two verticals along with the model parameters for the ***** sequence tagging ***** model. | ||
| 2020.lrec-1.139 We use a neural approach for ***** sequence tagging ***** and focus on the extraction of explicit discourse arguments. | ||
| 2020.emnlp-main.183 Our observation is that the three elements within a triplet are highly related to each other, and this motivates us to build a joint model to extract such triplets using a ***** sequence tagging ***** approach. | ||
| 2021.acl-long.254 It adopts ***** sequence tagging ***** to extract relevant cells from the table along with relevant spans from the text to infer their semantics, and then applies symbolic reasoning over them with a set of aggregation operators to arrive at the final answer. | ||
| dialogue state tracking | 251 | |
| W19-5910 To support this argument, the research presented in this paper is structured into three stages: (i) analyzing variable dependencies in dialogue data; (ii) applying an energy-based methodology to model ***** dialogue state tracking ***** as a structured prediction task; and (iii) evaluating the impact of inter-slot relationships on model performance. | ||
| 2021.emnlp-main.87 We test our approach on the cross-lingual ***** dialogue state tracking ***** task for the parallel MultiWoZ (English - Chinese, Chinese - English) and Multilingual WoZ (English - German, English - Italian) datasets. | ||
| W19-4109 To this end, we present an energy-based approach to ***** dialogue state tracking ***** as a structured classification task. | ||
| 2021.acl-long.287 We demonstrate the effectiveness of the proposed method in the zero-shot domain transfer learning for *****dialogue state tracking*****. | ||
| 2020.nlp4convai-1.10 *****Dialogue state tracking***** (DST) is at the heart of task-oriented dialogue systems. | ||
| paraphrase generation | 251 | |
| 2021.eacl-main.33 ParaSCI obtains satisfactory results on human evaluation and downstream tasks, especially long ***** paraphrase generation *****. | ||
| N18-1018 Most recent approaches use the sequence-to-sequence model for ***** paraphrase generation *****. | ||
| 2020.acl-main.28 We model ***** paraphrase generation ***** as an optimization problem and propose a sophisticated objective function, involving semantic similarity, expression diversity, and language fluency of paraphrases. | ||
| 2021.emnlp-main.199 The proposed paradigm offers merits over existing ***** paraphrase generation ***** methods: (1) using the context regularizer on meanings, the model is able to generate massive amounts of high-quality paraphrase pairs; (2) the combination of the huge amount of paraphrase candidates and further diversity-promoting filtering yields paraphrases with more lexical and syntactic diversity; and (3) using human-interpretable scoring functions to select paraphrase pairs from candidates, the proposed framework provides a channel for developers to intervene with the data generation process, leading to a more controllable model. | ||
| W18-3814 However, automatic processing techniques based on interlingual relations, such as machine translation or ***** paraphrase generation ***** exploiting translational equivalence, have not exploited these relations explicitly until now. | ||
| Empirical | 250 | |
| 2020.coling-main.221 ***** Empirical ***** studies illustrate the importance of proposed sentiment forecasting task, and justify the effectiveness of our NSF model over several strong baselines. | ||
| 2020.lrec-1.234 ***** Empirical ***** results suggest that the proposed methodology can be meaningfully applied to parsing into graph-structured target representations, uncovering hitherto unknown properties of the different systems that can inform future development and cross-fertilization across approaches. | ||
| N19-1271 ***** Empirical ***** study shows that our approach can be applied to many existing MRC models. | ||
| W19-3814 ***** Empirical ***** results demonstrate that, under explicit syntactic supervision and without the need to fine tune BERT, R-GCN's embeddings outperform the original BERT embeddings on the coreference task | ||
| 2003.mtsummit-tttt.8 ***** Empirical ***** methods in Natural Language Processing (NLP) and Machine Translation (MT) have become mainstream in the research field. | ||
| negation | 250 | |
| 2021.naacl-main.227 Within this paper we show that these models are not robust to linguistic phenomena, specifically ***** negation ***** and speculation. | ||
| 2020.blackboxnlp-1.13 We explore the imprint of two specific linguistic alternations, namely passivization and ***** negation *****, on the representations generated by neural models trained with two different objectives: masked language modeling and translation. | ||
| C18-1078 The availability of corpora annotated with ***** negation ***** information is essential to develop ***** negation ***** processing systems in any language. | ||
| 2020.acl-main.429 We find that after fine-tuning BERT and RoBERTa on a ***** negation ***** scope task, the average attention head improves its sensitivity to ***** negation ***** and its attention consistency across ***** negation ***** datasets compared to the pre-trained models | ||
| W16-5007 This paper discusses the need for a dictionary of affixal ***** negation *****s and regular antonyms to facilitate their automatic detection in text. | ||
| universal dependencies | 250 | |
| D17-1258 We therefore investigate to what extent constituents can be replaced with ***** universal dependencies *****, or left out completely, as well as how state-of-the-art segmenters fare in the absence of sentence boundaries. | ||
| N18-1088 We study the problem of analyzing tweets with ***** universal dependencies ***** (UD). | ||
| 2020.iwpt-1.23 In this paper, we present the submission of team CLASP to the IWPT 2020 Shared Task on parsing enhanced ***** universal dependencies *****. | ||
| 2020.udw-1.13 We report on an application of ***** universal dependencies ***** for the study of diachronic shifts in syntactic usage patterns. | ||
| E17-1034 Our results reveal that converting to ***** universal dependencies ***** is not necessarily trivial, moreover, using language-specific morphological features may have an impact on overall performance. | ||
| penn treebank | 250 | |
| C16-1131 We show that the neural LM perplexity can be reduced by 7.395 and 12.011 using the proposed domain adaptation mechanism on the *****Penn Treebank***** and News data, respectively. | ||
| 2021.conll-1.23 Experiments on Chinese *****Penn Treebank***** 5.1 and 7.0 show that our joint model consistently outperforms the pipeline approach on both settings of w/o and w/ BERT, and achieves new state-of-the-art performance. | ||
| L08-1441 ProPOSEL is a prototype prosody and PoS (part-of-speech) English lexicon for Language Engineering, derived from the following language resources: the computer-usable dictionary CUVPlus, the CELEX-2 database, the Carnegie-Mellon Pronouncing Dictionary, and the BNC, LOB and *****Penn Treebank***** PoS-tagged corpora. | ||
| N18-1089 In our experiments on the *****Penn Treebank***** WSJ corpus and the Universal Dependencies (UD) dataset (27 languages), we find that AT not only improves the overall tagging accuracy, but also 1) prevents over-fitting well in low resource languages and 2) boosts tagging accuracy for rare / unseen words. | ||
| P17-2018 Our parsers reduce error by 3.7–7.6% relative to those using existing transition systems on the *****Penn Treebank***** dependency parsing task and English Universal Dependencies. | ||
| visual question | 247 | |
| W19-1808 Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as ***** visual question ***** answering and multimodal machine translation. | ||
| W19-4806 Focusing on the FiLM ***** visual question ***** answering model, our experiments indicate that a form of approximate number system emerges whose performance declines with more difficult scenes as predicted by Weber's law. | ||
| 2021.acl-long.564 However, we uncover a striking contrast to this promise: across 5 models and 4 datasets on the task of ***** visual question ***** answering, a wide variety of active learning approaches fail to outperform random selection. | ||
| 2021.eacl-main.240 In fact, this is the case with most existing ***** visual question ***** answering (VQA) datasets where they assume only one ground-truth answer for each question. | ||
| 2021.emnlp-main.517 Knowledge-based ***** visual question ***** answering (VQA) requires answering questions with external knowledge in addition to the content of images. | ||
| language representation | 246 | |
| 2021.emnlp-main.30 A computationally expensive and memory intensive neural network lies behind the recent success of ***** language representation ***** learning. | ||
| 2020.wanlp-1.2 It also evaluates the recent ***** language representation ***** model BERT on the task of Arabic hate speech detection. | ||
| P19-1139 In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced ***** language representation ***** model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. | ||
| 2020.acl-main.76 The T-TA computes contextual ***** language representation *****s without repetition and displays the benefits of a deep bidirectional architecture, such as that of BERT. | ||
| 2020.emnlp-main.159 We first present algorithms to model phrase-object relevance by leveraging fine-grained visual representations and visually-aware ***** language representation *****s. | ||
| robustness | 245 | |
| D18-1153 In model training, LMs are learned with layer-wise dropouts for better ***** robustness *****. | ||
| 2021.acl-long.412 Additionally, CTFN still maintains ***** robustness ***** when considering missing modality. | ||
| 2021.acl-long.22 We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases ***** robustness ***** against copy noise in the training data and domain shift. | ||
| D17-1034 We leverage reinforcement learning to enable joint training on the proposed modules, and introduce various exploration techniques on sense selection for better ***** robustness *****. | ||
| 2021.eacl-main.208 To achieve such ***** robustness *****, prior research has considered multi-task objectives when training neural encoders | ||
| argumentation | 245 | |
| C16-1260 We also establish the first deep learning baselines for three ***** argumentation ***** mining tasks. | ||
| W19-4512 One corpus that has been annotated in parallel for ***** argumentation ***** structure and for discourse structure (RST, SDRT) are the `argumentative microtexts' (Peldszus and Stede, 2016a). | ||
| D19-1568 This paper presents a new dataset to initiate the study of this aspect of ***** argumentation *****: it consists of a diverse collection of arguments covering 741 controversial topics and comprising over 47,000 claims. | ||
| J17-1004 The goal of ***** argumentation ***** mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's ***** argumentation *****. | ||
| N19-1054 Empirical results show that using this approach improves the state of art performance across four benchmark ***** argumentation ***** data sets by an average of 4 absolute F1 points in claim detection | ||
| bilingual | 244 | |
| I17-1069 Since ***** bilingual ***** word embeddings have recently shown efficient models for learning ***** bilingual ***** distributed representation of words, we explore different word embedding models and show how a general-domain comparable corpus can enrich a specialized comparable corpus via neural networks | ||
| K18-1021 Most recent approaches to ***** bilingual ***** dictionary induction find a linear alignment between the word vector spaces of two languages. | ||
| 2020.cl-1.1 (iv) Do the representations learned by multilingual NMT models capture the same amount of linguistic information as their ***** bilingual ***** counterparts? | ||
| W16-4705 This paper proposes and evaluates a novel method to generate ***** bilingual ***** term candidates by using existing terminologies and delving into their systematicity. | ||
| 2010.amta-srw.1 English-Manipuri language pair is one of the rarely investigated with restricted ***** bilingual ***** resources | ||
| contextualized | 242 | |
| W19-2310 In this paper, we explore using ***** contextualized ***** word embeddings to compute more accurate relatedness scores, thus better evaluation metrics. | ||
| 2021.acl-long.145 Recent work on analyzing ***** contextualized ***** text representations has focused on hand-designed probe models to understand how and to what extent do these representations encode a particular linguistic phenomenon. | ||
| N19-1391 We also implement cross-lingual mapping of deep ***** contextualized ***** word embeddings using parallel sentences with word alignments. | ||
| D19-1006 In all layers of ELMo, BERT, and GPT-2, on average, less than 5% of the variance in a word's ***** contextualized ***** representations can be explained by a static embedding for that word, providing some justification for the success of ***** contextualized ***** representations. | ||
| W19-3805 Recently, ***** contextualized ***** word embeddings have enhanced previous word embedding techniques by computing word vector representations dependent on the sentence they appear in | ||
| topic | 241 | |
| 2020.acl-main.648 We further construct benchmark datasets for ***** topic ***** classification and script conversion. | ||
| 2020.acl-srw.41 Our results show that the prior tweet and ***** topic ***** features can improve performance on this task. | ||
| K19-1054 We propose BeamSeg, a joint model for segmentation and ***** topic ***** identification of documents from the same domain. | ||
| Q16-1021 Rule-based stemmers such as the Porter stemmer are frequently used to preprocess English corpora for ***** topic ***** modeling. | ||
| W17-5538 In addition, we illustrate the mechanism through which the frame and ***** topic ***** information enable the more accurate metaphor detection | ||
| automatic | 241 | |
| L08-1087 We argue that while ***** automatic ***** semantic role labeling systems (ASRL) have an important contribution to make, they cannot solve the problem for all cases. | ||
| L16-1314 This is achieved by limiting human effort to transcribing parts for which ***** automatic ***** transcription quality is insufficient. | ||
| R17-1056 We present here ***** automatic ***** methods for detecting potential secondary errors that would result from ***** automatic ***** inference mechanisms when they rely on an initial error manually detected. | ||
| 2020.lrec-1.743 We propose three challenges appropriate for this corpus that are related to processing units of signs in context: ***** automatic ***** alignment of text and video, semantic segmentation of sign language, and production of video-text embeddings for cross-modal retrieval. | ||
| 2020.lrec-1.430 Given the enormous amount of content created every day, ***** automatic ***** methods are required to detect and deal with this type of content | ||
| Compared | 240 | |
| W19-8673 ***** Compared ***** to the general VAE-RNN architectures, we show that our model can achieve much more stable training process and can generate text with significantly better quality. | ||
| W19-2304 ***** Compared ***** to the generations of a traditional left-to-right language model, BERT generates sentences that are more diverse but of slightly worse quality. | ||
| P18-2115 ***** Compared ***** with the source content, the annotated summary is short and well written. | ||
| D19-5624 ***** Compared ***** to a state of the art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. | ||
| 2020.emnlp-main.600 ***** Compared ***** with NGDG, we are able to achieve increases of 3% and 5% on TREC-6 and SST-2 | ||
| discriminative | 240 | |
| S18-1158 For example, the ***** discriminative ***** feature red characterizes the first word from the (apple, banana) pair, but not the second. | ||
| Q15-1042 We use integer linear programming to generate equation trees and score their likelihood by learning local and global ***** discriminative ***** models. | ||
| L14-1621 Finally, we discuss the impact of ***** discriminative ***** and non-***** discriminative ***** words extracted by both methods in terms of transcription accuracy. | ||
| D18-1031 Most of existing personalized microblog sentiment classification methods suffer from the insufficiency of ***** discriminative ***** tweets for personalization learning | ||
| 2013.iwslt-evaluation.24 Furthermore, we investigated different reordering models as well as an extended ***** discriminative ***** word lexicon. | ||
| interpretability | 238 | |
| P19-1284 The success of neural networks comes hand in hand with a desire for more ***** interpretability *****. | ||
| P18-2040 Our experimental results indicate that involving more named entities in topic descriptors positively influences the overall quality of topics, improving their ***** interpretability *****, specificity and diversity. | ||
| E17-1009 While these models yield state-of-the-art results on a range of tasks, their drawback is poor ***** interpretability *****. | ||
| 2021.naacl-main.413 When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances ***** interpretability *****, but also allows unknown, descriptive (i.e., text-based) risk factors to be identified. | ||
| C18-1157 This paper attempts to marry the ***** interpretability ***** of statistical machine learning approaches with the more robust models of joke structure and joke semantics capable of being learned by neural models | ||
| Sentence | 238 | |
| C16-1272 *****Sentence***** intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and semantic text similarity. | ||
| 2020.findings-emnlp.40 *****Sentence***** function is an important linguistic feature indicating the communicative purpose in uttering a sentence. | ||
| D17-1062 *****Sentence***** simplification aims to make sentences easier to read and understand. | ||
| 2021.acl-long.72 *****Sentence***** embeddings are an important component of many natural language processing (NLP) systems. | ||
| W17-5911 *****Sentence***** retrieval is an important NLP application for English as a Second Language (ESL) learners. | ||
| entailment | 237 | |
| 2021.naacl-main.223 Our empirical evaluation examines multiple NLP tasks, including sentence and document classification, question answering and textual ***** entailment *****. | ||
| L14-1314 By means of crowdsourcing techniques, each pair was annotated for two crucial semantic tasks: relatedness in meaning (with a 5-point rating scale as gold score) and ***** entailment ***** relation between the two elements (with three possible gold labels: ***** entailment *****, contradiction, and neutral). | ||
| 2021.emnlp-main.661 In this paper, we propose a query efficient attack strategy to generate plausible adversarial examples on text classification and ***** entailment ***** tasks. | ||
| 2021.ranlp-1.131 Siamese networks train the text hypothesis pairs with word embeddings and language agnostic embeddings, and the results are evaluated against classification metrics for binary classification into ***** entailment ***** and contradiction classes. | ||
| P18-1225 We consider the problem of learning textual ***** entailment ***** models with limited supervision (5K-10K training examples), and present two complementary approaches for it. | ||
| event | 237 | |
| W19-0407 We show that LTAG allows us to separate constructional from lexical meaning components and that frames enable elegant generalizations over ***** event ***** types and related constraints. | ||
| 2020.aespen-1.1 The ***** event ***** consists of regular research papers and a shared task, which is about ***** event ***** sentence coreference identification (ESCI), tracks. | ||
| L16-1228 The resulting data provides insights into ***** event ***** patterns resulting from task specific user behavior and thus constitutes a basis for machine learning approaches to learn automation rules. | ||
| C16-1265 Specifically, while morpho-syntactic and context features are considered sufficient for classifying ***** event *****-timex pairs, we believe that exploiting distributional semantic information about ***** event ***** words can benefit supervised classification of other types of pairs. | ||
| 2021.ranlp-1.40 A core task in information extraction is ***** event ***** detection that identifies ***** event ***** triggers in sentences that are typically classified into ***** event ***** types | ||
| causal | 236 | |
| N19-1179 The challenges we identified are two: 1) event ***** causal ***** relations are sparse among all possible event pairs in a document, in addition, 2) few ***** causal ***** relations are explicitly stated. | ||
| W19-5031 We experiment with both generic public relation extraction datasets and a new biomedical ***** causal ***** sentence detection dataset, a subset of which we make publicly available. | ||
| 2021.acl-long.361 By formulating OpenRE using a structural ***** causal ***** model, we identify that the above-mentioned problems stem from the spurious correlations from entities and context to the relation type. | ||
| E17-3023 It automatically extracts variables (“CO2”) involved in events of change/increase/decrease (“increasing CO2”), as well as co-occurrence and ***** causal ***** relations among these events (“increasing CO2 causes a decrease in pH in seawater”), resulting in a big knowledge graph. | ||
| 2021.cinlp-1.3 By adjusting the strength of the penalty for each type of feature, we build a predictive model that relies more on ***** causal ***** features and less on non-***** causal ***** features | ||
| token | 233 | |
| 2020.wnut-1.43 The key features of our method are its use of graph attention networks to encode syntactic dependencies and word positions in the sentence, and a loss function based on connectionist temporal classification (CTC) that can learn a label for each ***** token ***** without reference data for each ***** token *****. | ||
| 2020.acl-main.15 3) Source-target alignment constraint encourages dependency of a target ***** token ***** on source ***** token *****s and thus eases the training of NAR models. | ||
| 2020.lrec-1.108 It takes up challenges reported on in previous works, such as how to cover material properties of a name ***** token ***** and how to define lemmatization principles, and elaborates on possible solutions. | ||
| 2021.acl-short.112 To alleviate overfitting, we develop a multi-task learning approach, which regularizes the data-deficient dialog generation task with a masked ***** token ***** prediction task. | ||
| W19-8620 However, we still obtain only marginal gains under full linguistic context and posit that visual embeddings extracted from deep vision models (ResNet for Multi30k, ResNext for How2) do not lend themselves to increasing the discriminativeness between the vocabulary elements at ***** token ***** level prediction in NMT | ||
| Morphological | 232 | |
| W19-4213 This paper presents the submission by the Charles University - University of Malta team to the SIGMORPHON 2019 Shared Task on *****Morphological***** Analysis and Lemmatization in context. | ||
| 2021.mrl-1.23 *****Morphological***** tasks have gained decent popularity within the NLP community in the recent years, with large multi-lingual datasets providing morphological analysis of words, either in or out of context. | ||
| W19-4211 This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: *****Morphological***** Analysis and Lemmatization in Context. | ||
| N18-1005 *****Morphological***** segmentation for polysynthetic languages is challenging, because a word may consist of many individual morphemes and training data can be extremely scarce. | ||
| W18-5808 *****Morphological***** segmentation is beneficial for several natural language processing tasks dealing with large vocabularies. | ||
| Dependency | 231 | |
| W18-0709 By projecting gold coreference from Czech to English and vice versa on Prague Czech-English *****Dependency***** Treebank 2.0 Coref, we set an upper bound of a proposed projection approach for these two languages. | ||
| L14-1218 In this paper we present the results of an ongoing experiment of bootstrapping a Treebank for Catalan by using a *****Dependency***** Parser trained with Spanish sentences. | ||
| 2021.iwcs-1.12 *****Dependency***** parsing is a tool widely used in the field of Natural language processing and computational linguistics. | ||
| 2020.coling-main.341 *****Dependency***** trees have been shown to be effective in capturing long-range relations between target entities. | ||
| P19-1024 *****Dependency***** trees convey rich structural information that is proven useful for extracting relations among entities in text. | ||
| paraphrases | 231 | |
| 2021.emnlp-main.349 First, we strengthen the model's ability to rewrite by further pre-training BART on both an existing collection of generic ***** paraphrases *****, as well as on synthetic pairs created using a general-purpose lexical resource. | ||
| 2020.lrec-1.848 This graph is then traversed to extract sets of ***** paraphrases *****. | ||
| D19-5307 We demonstrate that our system obtains high quality ***** paraphrases *****, as evaluated by crowd workers. | ||
| 2021.emnlp-main.199 The proposed paradigm offers merits over existing paraphrase generation methods: (1) using the context regularizer on meanings, the model is able to generate massive amounts of high-quality paraphrase pairs; (2) the combination of the huge amount of paraphrase candidates and further diversity-promoting filtering yields ***** paraphrases ***** with more lexical and syntactic diversity; and (3) using human-interpretable scoring functions to select paraphrase pairs from candidates, the proposed framework provides a channel for developers to intervene with the data generation process, leading to a more controllable model | ||
| W19-8709 These findings confirm the ability of NMT to produce correct ***** paraphrases *****, which could also explain why BLEU is often considered as an inadequate metric to evaluate the performance of NMT systems. | ||
| Automatic | 231 | |
| L16-1359 This paper introduces JATE 2.0, a complete remake of the free Java ***** Automatic ***** Term Extraction Toolkit (Zhang et al., 2008) delivering new features including: (1) highly modular, adaptable and scalable ATE thanks to integration with Apache Solr, the open source free-text indexing and search platform; (2) an extended collection of state-of-the-art algorithms. | ||
| L08-1243 *****Automatic***** tagging in Spanish has historically faced many problems because of some specific grammatical constructions. | ||
| 2020.findings-emnlp.289 *****Automatic***** summarization research has traditionally focused on providing high quality general-purpose summaries of documents. | ||
| 2021.eacl-main.312 *****Automatic***** detection of the four MBTI personality dimensions from texts has recently attracted noticeable attention from the natural language processing and computational linguistic communities. | ||
| P19-1630 *****Automatic***** summarization is typically treated as a 1-to-1 mapping from document to summary. | ||
| hate speech detection | 231 | |
| 2021.woah-1.10 In ***** hate speech detection *****, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. | ||
| 2021.acl-long.556 In other words, getting more affective features from other affective resources will significantly affect the performance of ***** hate speech detection *****. | ||
| 2020.wanlp-1.2 It also evaluates the recent language representation model BERT on the task of Arabic ***** hate speech detection *****. | ||
| W18-1105 While relevant research has been done independently on code-mixed social media texts and ***** hate speech detection *****, our work is the first attempt in detecting hate speech in Hindi-English code-mixed social media text. | ||
| 2021.acl-short.114 This work is the first to shed light on the limits of this zero-shot, cross-lingual transfer learning framework for ***** hate speech detection *****. | ||
| grammars | 230 | |
| L08-1548 The architecture includes domain ontology, domain texts, language specific lexicons, regular ***** grammars ***** and disambiguation rules. | ||
| 2020.coling-main.203 Moreover, we observe that informal and formal sentences closely resemble each other, which is different from the translation task where two languages have different vocabularies and ***** grammars *****. | ||
| 1998.amta-papers.3 The MT engine of the JANUS speech-to-speech translation system is designed around four main principles: 1) an interlingua approach that allows the efficient addition of new languages, 2) the use of semantic ***** grammars ***** that yield low cost high quality translations for limited domains, 3) modular ***** grammars ***** that support easy expansion into new domains, and 4) efficient integration of multiple ***** grammars ***** using multi-domain parse lattices and domain re-scoring. | ||
| 1995.iwpt-1.28 No restrictions are made such as prescribing normal form, proscribing empty rules or cyclic ***** grammars *****. | ||
| 1997.iwpt-1.24 The main result is a technique for generation of efficient LR-like parsers for ambiguous ***** grammars ***** disambiguated by means of priorities. | ||
| abstractive | 229 | |
| 2021.newsum-1.8 This paper proposes self-supervised strategies for speaker-focused post-correction in ***** abstractive ***** dialogue summarization. | ||
| P17-1108 Recently impressive progress has been made to ***** abstractive ***** sentence summarization using neural models. | ||
| 2021.emnlp-demo.33 iFᴀᴄᴇᴛSᴜᴍ integrates interactive summarization together with faceted search, by providing a novel faceted navigation scheme that yields ***** abstractive ***** summaries for the user's selections. | ||
| 2021.ranlp-1.98 We present GeSERA, an open-source improved version of SERA for evaluating automatic extractive and ***** abstractive ***** summaries from the general domain. | ||
| N18-1138 Supervised training of ***** abstractive ***** language generation models results in learning conditional probabilities over language sequences based on the supervised training signal | ||
| Conversational | 228 | |
| 2021.acl-long.255 *****Conversational***** KBQA is about answering a sequence of questions related to a KB. | ||
| 2020.wanlp-1.6 *****Conversational***** models have witnessed a significant research interest in the last few years with the advancements in sequence generation models. | ||
| P18-3020 *****Conversational***** agents, having the goal of natural language generation, must rely on language models which can integrate emotion into their responses. | ||
| 2021.eacl-demos.38 *****Conversational***** Agent for Daily Living Assessment Coaching (CADLAC) is a multi-modal conversational agent system designed to impersonate individuals with various levels of ability in activities of daily living (ADLs: e.g., dressing, bathing, mobility, etc.) | ||
| 2021.eacl-main.177 *****Conversational***** systems enable numerous valuable applications, and question-answering is an important component underlying many of these. | ||
| sentence classification | 228 | |
| 2021.semeval-1.127 We propose two approaches: the first approach fine-tunes transformer models that are pre-trained on ***** sentence classification ***** samples. | ||
| S19-2183 We build on an existing deep learning approach for ***** sentence classification ***** based on a Convolutional Neural Network. | ||
| W19-4323 Lastly, we evaluate the different models on a downstream ***** sentence classification ***** task in which a CNN model is initialized with our embeddings and find promising results. | ||
| N18-2061 Our experimental results in the task of ***** sentence classification *****, on two benchmarking DE datasets (one generic, one domain-specific), show that these models obtain consistent state of the art results. | ||
| 2020.figlang-1.13 We extended latest pre-trained transformers like BERT, RoBERTa, spanBERT on different task objectives like single ***** sentence classification *****, sentence pair classification, etc. | ||
| readability | 227 | |
| W18-5307 We incorporate a sentence fusion approach, based on Integer Linear Programming, along with three novel approaches for sentence ordering, in an attempt to improve the human ***** readability ***** of ideal answers. | ||
| 2020.evalnlgeval-1.1 The evaluation of Natural Language Generation (NLG) systems has recently aroused much interest in the research community, since it should address several challenging aspects, such as ***** readability ***** of the generated texts, adequacy to the user within a particular context and moment and linguistic quality-related issues (e.g., correctness, coherence, understandability), among others. | ||
| W18-3703 Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic ***** readability ***** assessment. | ||
| 2020.winlp-1.23 No such corpus has yet been developed for Urdu and we fill this gap by developing one such corpus to help start ***** readability ***** and automatic sentence simplification research. | ||
| L08-1230 For a given text passage, the ***** readability ***** measurement method determines the grade level to which the passage is the most similar by using character-unigram models, which are constructed from the textbook corpus | ||
| Detecting | 227 | |
| N18-3027 ***** Detecting ***** the similarity between job advertisements is important for job recommendation systems as it allows, for example, the application of item-to-item based recommendations. | ||
| W18-1303 ***** Detecting ***** sarcasm in text is a particularly challenging problem in computational semantics, and its solution may vary across different types of text | ||
| W18-6239 *****Detecting***** stress from social media gives a non-intrusive and inexpensive alternative to traditional tools such as questionnaires or physiological sensors for monitoring mental state of individuals. | ||
| W16-4310 *****Detecting***** depression or personality traits, tutoring and student behaviour systems, or identifying cases of cyber-bullying are a few of the wide range of the applications, in which the automatic detection of emotion is a crucial element. | ||
| 2021.acl-long.297 *****Detecting***** rumors on social media is a very critical task with significant implications to the economy, public health, etc. | ||
| coherence | 227 | |
| L16-1649 We describe COHERE, our ***** coherence ***** toolkit which incorporates various complementary models for capturing and measuring different aspects of text ***** coherence *****. | ||
| 2020.acl-main.133 We address these issues by introducing a novel approach to dialogue ***** coherence ***** assessment. | ||
| 2020.acl-main.333 Our metrics consist of (1) GPT-2 based context ***** coherence ***** between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. | ||
| N19-1381 In this paper, we present interpretable metrics for evaluating topic ***** coherence ***** by making use of distributed sentence representations. | ||
| P19-1640 Together with the widely used ***** coherence ***** measure NPMI, we offer a more wholistic evaluation of topic quality | ||
| Subtask | 225 | |
| S17-2047 In the main ***** Subtask ***** C, our primary submission was ranked fourth, with a MAP of 13.48 and accuracy of 97.08. | ||
| S18-1084 Even without additional lexicons and word embeddings we achieved fourth place in ***** Subtask ***** A and seventh in ***** Subtask ***** B in terms of accuracy. | ||
| S17-2133 NileTMRG participated in three Arabic related subtasks which are: ***** Subtask ***** A (Message Polarity Classification), ***** Subtask ***** B (Topic-Based Message Polarity classification) and ***** Subtask ***** D (Tweet quantification). | ||
| S17-2174 The presented system was mainly focused on the use of part-of-speech tag sequences to filter candidate keyphrases for ***** Subtask ***** A. ***** Subtask *****s A and B were addressed as a sequence labeling problem using Conditional Random Fields (CRFs) and even though ***** Subtask ***** C was out of the scope of this approach, one rule was included to identify synonyms. | ||
| S19-2225 For in-domain evaluation (***** Subtask ***** A), we use the same technique to augment the training set | ||
| comprehension | 225 | |
| L14-1319 Students achieved higher ***** comprehension ***** scores when hold information was provided. | ||
| 2020.acl-main.701 First, we argue that existing approaches do not adequately define ***** comprehension *****; they are too unsystematic about what content is tested. | ||
| Q19-1014 Collected from English as a Foreign Language examinations designed by human experts to evaluate the ***** comprehension ***** level of Chinese learners of English, our data set contains 10,197 multiple-choice questions for 6,444 dialogues. | ||
| D19-1457 Our approach builds on a prior process ***** comprehension ***** framework for predicting actions' effects, to also identify subsequent steps that those effects enable. | ||
| D17-1168 Automatic story ***** comprehension ***** is a fundamental challenge in Natural Language Understanding, and can enable computers to learn about social norms, human behavior and commonsense | ||
| contrastive | 223 | |
| 2021.emnlp-main.204 In this paper, we introduce a novel approach based on ***** contrastive ***** learning that learns better representations by exploiting relation label information. | ||
| L10-1379 Differently from other ***** contrastive ***** methods proposed in the literature that focus on single terms to overcome the multi-word terms' sparsity problem, the proposed ***** contrastive ***** function is able to handle variation in low frequency events by directly operating on pre-selected multi-word terms. | ||
| 2021.emnlp-main.552 Then, we propose a supervised approach, which incorporates annotated pairs from natural language inference datasets into our ***** contrastive ***** learning framework, by using “entailment” pairs as positives and “contradiction” pairs as hard negatives | ||
| 2021.acl-short.29 Based on Vision-and-Language BERT, we train UMIC to discriminate negative captions via ***** contrastive ***** learning. | ||
| L08-1206 The availability of this resource, on the one hand, enables ***** contrastive ***** analysis of the linguistic phenomena surrounding events in both languages, and on the other hand, can be used to perform multilingual temporal analysis of texts. | ||
| words | 223 | |
| 2021.emnlp-main.510 In this study, we investigate lexicon usages across styles throughout two lenses: human perception and machine word importance, since ***** words ***** differ in the strength of the stylistic cues that they provide. | ||
| 2021.semeval-1.163 Detecting humor is a challenging task since ***** words ***** might share multiple valences and, depending on the context, the same ***** words ***** can be even used in offensive expressions. | ||
| 2021.eacl-main.49 In this paper, we relax this assumption by binding a word's sentiment to its collocation ***** words ***** instead of domain labels. | ||
| W17-2602 By multiplying the matrix of trained word2vec embeddings with a word's average context vector, out-of-vocabulary (OOV) embeddings and representations for ***** words ***** with multiple meanings can be created based on the ***** words *****' local contexts. | ||
| 2021.eacl-srw.26 The present paper investigates the impact of the anaphoric one ***** words ***** in English on the Neural Machine Translation (NMT) process using English-Hindi as source and target language pair | ||
| linguistically | 222 | |
| 2008.iwslt-evaluation.3 For all translation tasks except Arabic–English, we exploit ***** linguistically ***** motivated bilingual phrase pairs extracted from parallel treebanks. | ||
| 2020.intexsempar-1.5 Prior work in this area has largely focused on textual input that is ***** linguistically ***** correct and semantically unambiguous. | ||
| K19-2012 For the MRP 2019, which features five formally and ***** linguistically ***** different approaches to meaning representation (DM, PSD, EDS, UCCA and AMR), we propose a uniform, language and framework agnostic graph-tograph neural network architecture. | ||
| 2001.mtsummit-ebmt.7 We show that Example-Based Machine Translation, as long as it is ***** linguistically ***** principled, significantly overlaps with other ***** linguistically ***** principled approaches to Machine Translation. | ||
| L10-1413 In this paper we use statistical machine translation and morphology information from two different morphological analyzers to try to improve translation quality by ***** linguistically ***** motivated segmentation | ||
| text generation | 222 | |
| 2021.emnlp-main.351 Pretrained language models (PLM) have recently advanced graph-to-***** text generation *****, where the input graph is linearized into a sequence and fed into the PLM to obtain its representation. | ||
| 2020.findings-emnlp.115 Generating natural language under complex constraints is a principled formulation towards controllable ***** text generation *****. | ||
| 2021.acl-long.73 Upon the availability of English AMR dataset and English-to- X parallel datasets, in this paper we propose a novel cross-lingual pre-training approach via multi-task learning (MTL) for both zeroshot AMR parsing and AMR-to-***** text generation *****. | ||
| D18-1075 Different from conventional ***** text generation ***** tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs. | ||
| 2020.findings-emnlp.322 Despite significant progress in ***** text generation ***** models, a serious limitation is their tendency to produce text that is factually inconsistent with information in the input. | ||
| graph convolutional network | 222 | |
| P19-1131 To tackle the joint type inference task, we propose a novel ***** graph convolutional network ***** (GCN) running on an entity-relation bipartite graph. | ||
| 2021.emnlp-main.663 Then, we incorporate both source and target graphs into the conventional Transformer architecture with ***** graph convolutional network *****s. | ||
| 2020.acl-main.642 To simultaneously capture the relations between objects in an image and the syntactic dependency relations between words in a question, we propose a novel dual channel ***** graph convolutional network ***** (DC-GCN) for better combining visual and textual advantages. | ||
| 2021.emnlp-main.658 To this end, we propose a novel continuum model by extending the idea of neural ordinary differential equations (ODEs) to multi-relational ***** graph convolutional network *****s. | ||
| 2021.dravidianlangtech-1.8 In this paper, we propose the ***** graph convolutional network *****s (GCN) for sentiment analysis on code-mixed text. | ||
| ROUGE | 221 | |
| 2021.acl-long.119 Our method outperforms existing MTL methods across 4 datasets of medical question pairs, in ***** ROUGE ***** scores, RQE accuracy and human evaluation. | ||
| 2021.spnlp-1.1 In some cases, reinforcement learning has been added to train the models with an objective that is closer to their evaluation measures (e.g. ***** ROUGE *****). | ||
| D19-1307 However, summaries with high ***** ROUGE ***** scores often receive low human judgement. | ||
| L06-1300 We describe the evaluation of the system in the recent Multilingual Summarization Evaluation MSE 2005 using the pyramids and ***** ROUGE ***** methods. | ||
| 2020.lrec-1.822 Our results indicate that ***** ROUGE ***** can indeed be adapted to non-English data – both homogeneous and heterogeneous | ||
| source domain | 221 | |
| 2021.emnlp-main.442 Extensive experiments on four benchmarks show that PDALN can effectively adapt high-resource domains to low-resource target domains, even if they are diverse in terms and writing styles. | ||
| C18-1103 It is especially crucial for Natural Language Generation (NLG) in Spoken Dialogue Systems when there are sufficient annotated data in the ***** source domain *****, but there is a limited labeled data in the target domain. | ||
| D19-6210 While past works have focused on single ***** source domain ***** adaptation for bio-medical relation classification, we classify relations in an unlabeled target domain by transferring useful knowledge from one or more related ***** source domain *****s. | ||
| W19-5335 We also present a new method to ensure ***** source domain ***** adherence in back-translated data. | ||
| W19-5945 In this paper, we conduct a user study and show that the performance of a multi-dimensional system, which can be adapted from a ***** source domain *****, is equivalent to that of a one-dimensional baseline, which can only be trained from scratch. | ||
| ontologies | 220 | |
| 2021.nllp-1.16 Despite potential utility in populating glossaries and ***** ontologies ***** or as arguments in information extraction and document classification tasks, there has been limited work done for legal terminology extraction. | ||
| L06-1270 In this paper, we present a general method for aligning ***** ontologies *****, which was used to align a conceptual thesaurus, lexicalized in 20 languages with a partial version of it lexicalized in Romanian. | ||
| L08-1266 The target and the translated ***** ontologies ***** are then used as input for the mapping process. | ||
| L08-1265 OntoSelect allows searching as well as browsing of ***** ontologies ***** according to size (number of classes, properties), representation format (DAML, RDFS, OWL), connectedness (score over the number of included and referring ***** ontologies *****) and human languages used for class- and object property-labels. | ||
| L16-1141 We introduce PreMOn (predicate model for ***** ontologies *****), a linguistic resource for exposing predicate models (PropBank, NomBank, VerbNet, and FrameNet) and mappings between them (e.g, SemLink) as Linked Open Data | ||
| translation system | 219 | |
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur machine ***** translation system ***** which consists of verbal suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| 2021.americasnlp-1.27 Our neural machine ***** translation system ***** ranked first in Track two (development set not used for training) and third in Track one (training includes development data). | ||
| 2005.mtsummit-posters.1 This paper presents TTPlayer, a trace file analysis tool used to develop TransType, an innovative computer-aided ***** translation system *****. | ||
| 2020.iwslt-1.2 This paper describes the ON-TRAC Consortium ***** translation system *****s developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, offline speech translation and simultaneous speech translation. | ||
| 2011.iwslt-evaluation.21 Although such rules can be extracted from an aligned parallel corpus simply as original phrase pairs, their structure is hierarchical and thus can be used in a hierarchical ***** translation system *****. | ||
| encoders | 218 | |
| 2020.acl-main.479 In this work, we explore to what extent neural network sentence ***** encoders ***** can learn to predict the strength of scalar inferences. | ||
| N19-1253 Specifically, we compare ***** encoders ***** and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. | ||
| D19-1593 As a framework, RSA has several advantages over existing approaches to interpretation of language ***** encoders ***** based on probing or diagnostic classification: namely, it does not require large training samples, is not prone to overfitting, and it enables a more transparent comparison between the representational geometries of different models and modalities. | ||
| D17-1159 GCNs over syntactic dependency trees are used as sentence ***** encoders *****, producing latent feature representations of words in a sentence. | ||
| D18-1327 This is achieved by employing separate ***** encoders ***** for the sequential and parsed versions of the same source sentence; the resulting representations are then combined using a hierarchical attention mechanism | ||
| Bayesian | 218 | |
| 2021.latechclfl-1.4 We develop a ***** Bayesian ***** implementation of Cohen's kappa for multiple annotators that allows us to assess the influence of various contextual effects on the inter-annotator agreement, producing both more robust estimates of the agreement indices as well as insights into the annotation process that leads to the estimated indices | ||
| 2021.reinact-1.3 Starting from an existing account of semantic classification and learning from interaction formulated in a Probabilistic Type Theory with Records, encompassing ***** Bayesian ***** inference and learning with a frequentist flavour, we observe some problems with this account and provide an alternative account of classification learning that addresses the observed problems. | ||
| L08-1211 This presentation focuses on the semi-automatic extension of Arabic WordNet (AWN) using lexical and morphological rules and applying ***** Bayesian ***** inference. | ||
| C18-1010 We present a method for detecting annotation errors in manually and automatically annotated dependency parse trees, based on ensemble parsing in combination with ***** Bayesian ***** inference, guided by active learning. | ||
| 2020.nl4xai-1.7 In order to increase trust in the usage of ***** Bayesian ***** Networks and to cement their role as a model which can aid in critical decision making, the challenge of explainability must be faced. | ||
| spoken language | 218 | |
| 2021.eacl-main.159 We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on ***** spoken language ***** understanding tasks. | ||
| W18-3910 However, linguistic research suggests that ***** spoken language ***** often differs from written language. | ||
| 2020.findings-emnlp.244 Visually-grounded models of ***** spoken language ***** understanding extract semantic information directly from speech, without relying on transcriptions. | ||
| 2014.iwslt-papers.3 In the past, this task has been treated separately in ASR or MT contexts and we propose here a joint estimation of word confidence for a ***** spoken language ***** translation (SLT) task involving both ASR and MT. | ||
| N18-5020 The system architecture consists of several components including ***** spoken language ***** processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. | ||
| SVM | 216 | |
| W16-5101 Evaluation using a recently introduced cancer domain dataset involving the categorization of documents according to the well-established hallmarks of cancer shows that a basic CNN model can achieve a level of performance competitive with a Support Vector Machine (***** SVM *****) trained using complex manually engineered features optimized to the task. | ||
| U18-1013 Our final submission used a Support Vector Machine (***** SVM *****) and Universal Language Model with Fine-tuning (ULMFiT). | ||
| 2020.semeval-1.158 We use a common traditional machine learning, which is ***** SVM *****, by utilizing the combination of text and images features. | ||
| W16-4828 However, our submitted runs used ***** SVM ***** with a linear kernel. | ||
| W17-2006 We present an approach where an ***** SVM ***** classifier learns to classify head movements based on measurements of velocity, acceleration, and the third derivative of position with respect to time, jerk. | ||
| normalization | 215 | |
| D18-1132 In this paper, by considering the uniqueness of expression tree, we propose an equation ***** normalization ***** method to normalize the duplicated equations. | ||
| 2021.emnlp-main.523 Training stability is achieved with layer ***** normalization ***** with either a specialized initialization or an additional gating function. | ||
| Q18-1025 This paper presents the first model for time ***** normalization ***** trained on the SCATE corpus. | ||
| 2020.emnlp-main.418 We evaluate our method on five NLP tasks (text ***** normalization *****, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction) and report competitive results across the board. | ||
| 2021.acl-long.219 Moreover, we introduce attention mechanisms to take advantage of the text surface form of each candidate concept for better ***** normalization ***** performance | ||
| distillation | 214 | |
| D19-1078 Specifically, we first pretrain in-domain and out-of-domain NMT models using their own training corpora respectively, and then iteratively perform bidirectional translation knowledge transfer (from in-domain to out-of-domain and then vice versa) based on knowledge ***** distillation ***** until the in-domain NMT model convergences. | ||
| 2021.iwslt-1.24 Further investigation that combined knowledge ***** distillation ***** and fine-tuning revealed that the combination consistently improved two language pairs: English-Italian and Spanish-English. | ||
| 2021.acl-short.40 We design several techniques: start position randomization, knowledge ***** distillation *****, and history discount to improve pre-training performance. | ||
| 2020.findings-emnlp.250 In this paper, we propose Distilled Embedding, an (input/output) embedding compression method based on low-rank matrix decomposition and knowledge ***** distillation *****. | ||
| 2020.acl-main.202 Some recent works use knowledge ***** distillation ***** to compress these huge models into shallow ones | ||
| Convolutional Neural | 214 | |
| W17-3013 Four ***** Convolutional Neural ***** Network models were trained on resp. | ||
| S19-2162 Our second variant was a ***** Convolutional Neural ***** Network that did not perform as well. | ||
| N18-2122 ***** Convolutional Neural ***** Networks (CNNs) can learn about semantics through images. | ||
| S19-2225 A simple ***** Convolutional Neural ***** Network (CNN) classifier with contextual word representations from a pre-trained language model was used for sentence classification | ||
| R19-1048 The study explores application of a simple ***** Convolutional Neural ***** Network for the problem of authorship attribution of tweets written in Polish. | ||
| wordnet | 214 | |
| 2016.gwc-1.59 Currently, the Open Multilingual Wordnet has made many ***** wordnet *****s accessible as a single linked ***** wordnet *****, but as it used the Princeton Wordnet of English (PWN) as a pivot, it loses concepts that are not part of PWN. | ||
| 2020.rail-1.8 Creating a new ***** wordnet ***** is by no means a trivial task and when the target language is under-resourced as is the case for the languages currently included in the multilingual African Wordnet (AfWN), developers need to rely heavily on human expertise. | ||
| 2019.gwc-1.21 We present our approach to constructing the ***** wordnet ***** which uses multilingual Coptic dictionaries and ***** wordnet *****s for five different languages. | ||
| 2016.gwc-1.39 We have used these existing versions of the ***** wordnet ***** to perform an automatic evaluation. | ||
| 2020.globalex-1.8 To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with the information from a ***** wordnet *****. | ||
| online | 214 | |
| 2020.nlpcss-1.14 This limits their use for understanding the dynamics, patterns and prevalence of ***** online ***** abuse. | ||
| C16-1258 When processing arguments in ***** online ***** user interactive discourse, it is often necessary to determine their bases of support. | ||
| 2021.eacl-srw.20 However, there is a large number of new ***** online ***** recipes generated daily with a large number of users reviews, with recommendations to improve the recipe flavor and ideas to modify them. | ||
| W19-6130 We introduce the first manually annotated non-English corpus of ***** online ***** registers featuring the full range of linguistic variation found ***** online *****. | ||
| D18-1403 We present a neural framework for opinion summarization from ***** online ***** product reviews which is knowledge-lean and only requires light supervision (e.g., in the form of product domain labels and user-provided ratings). | ||
| Parallel | 213 | |
| C16-1240 ***** Parallel ***** sentence representations are important for bilingual and cross-lingual tasks in natural language processing. | ||
| L14-1209 ***** Parallel ***** corpora are crucial for statistical machine translation (SMT). | ||
| R19-1130 ***** Parallel ***** corpora are crucial resources for NLP applications, most notably for machine translation. | ||
| L10-1019 ***** Parallel ***** corpora are indispensable resources for a variety of multilingual natural language processing tasks. | ||
| L04-1160 ***** Parallel ***** corpora are considered an important resource for the development of linguistic tools. | ||
| extractive | 213 | |
| 2021.acl-long.232 Our proposed model specifically produces ***** extractive ***** summaries for each item and user. | ||
| W17-1003 The textual similarity is a crucial aspect for many ***** extractive ***** text summarization methods. | ||
| W18-5310 This paper presents a system for ideal answer generation (using ontology-based retrieval and a neural learning-to-rank approach, combined with ***** extractive ***** and abstractive summarization techniques) which achieved the highest ROUGE score of 0.659 on the BioASQ 5b batch 2 test. | ||
| 2021.emnlp-main.494 We conduct a comprehensive evaluation on a variety of ***** extractive ***** question answering datasets ranging from single-hop to multi-hop and from text-only to table-only to hybrid that require various reasoning capabilities and show that ReasonBert achieves remarkable improvement over an array of strong baselines. | ||
| 2020.inlg-1.30 Conventional snippets are ***** extractive ***** in nature, which recently gave rise to copyright claims from news publishers as well as a new copyright legislation being passed in the European Union, limiting the fair use of web page contents for snippets | ||
| compositional | 212 | |
| W17-2601 As word vectors belonging to different syntactic categories have incompatible syntactic distributions, no trivial ***** compositional ***** operation can be applied to combine them into a new ***** compositional ***** vector. | ||
| Q15-1019 We evaluate our approach on a ***** compositional ***** question answering task where it outperforms several competitive baselines. | ||
| W19-4814 The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has shown to be an effective method for encouraging more ***** compositional ***** solutions. | ||
| P18-1201 We design a transferable architecture of structural and ***** compositional ***** neural networks to jointly represent and map event mentions and types into a shared semantic space. | ||
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-***** compositional *****ity and lexico-syntactic fixedness. | ||
| annotated corpus | 212 | |
| W18-5613 We analyze inter-annotator agreement based on the developed guidelines and present results from experiments aimed at evaluating the validity and applicability of the ***** annotated corpus ***** using machine learning techniques. | ||
| 2020.lrec-1.652 In the last section, we provide some distributional characteristics of the ***** annotated corpus ***** (POS distribution, multiword expressions). | ||
| W18-2309 Based on the ***** annotated corpus *****, we investigate 1) treatment decision-making process in medical conversations, and 2) effects of physician-caregiver communication behaviors on antibiotic over-prescribing. | ||
| 2021.calcs-1.16 We make the ***** annotated corpus ***** freely available for the researcher to aid abusive content detection in Bengali social media data. | ||
| 2020.mwe-1.5 This paper describes a manually ***** annotated corpus ***** of verbal multi-word expressions in Polish | ||
| fake news detection | 211 | |
| 2020.coling-main.165 However, the challenging problem of ***** fake news detection ***** has not benefited from the improvement of fact verification models, which is closely related to ***** fake news detection *****. | ||
| 2020.lrec-1.309 We show that at the present state of machine translation quality for the English-Urdu language pair, the fully automated data augmentation through machine translation did not provide improvement for ***** fake news detection ***** in Urdu. | ||
| 2019.icon-1.27 We evaluate our techniques on the two recently released datasets, namely Fake News AMT and Celebrity for ***** fake news detection *****. | ||
| P19-2050 This work is still in progress as we plan to extend the dataset in the future and use it for our approach towards automated ***** fake news detection *****. | ||
| 2021.acl-srw.32 In this work, we propose a new technique based on cross-lingual evidence (CE) that can be used for ***** fake news detection ***** and improve existing approaches. | ||
| NLU | 210 | |
| 2020.findings-emnlp.39 Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (***** NLU *****). | ||
| D19-1458 In this work, we introduce a diagnostic benchmark suite, named CLUTRR, to clarify some key issues related to the robustness and systematicity of ***** NLU ***** systems. | ||
| R17-1023 A key technology in such chat-bots is robust natural language understanding (***** NLU *****) which can significantly influence and impact the efficacy of the conversation and ultimately the user-experience. | ||
| 2020.acl-main.163 Natural language understanding (***** NLU *****) and natural language generation (NLG) are two fundamental and related tasks in building task-oriented dialogue systems with opposite objectives: ***** NLU ***** tackles the transformation from natural language to formal representations, whereas NLG does the reverse. | ||
| 2020.acl-demos.15 jiant implements over 50 ***** NLU ***** tasks, including all GLUE and SuperGLUE benchmark tasks | ||
| recurrent neural networks | 210 | |
| Q14-1017 DT-RNNs outperform other recursive and ***** recurrent neural networks *****, kernelized CCA and a bag-of-words baseline on the tasks of finding an image that fits a sentence description and vice versa. | ||
| N19-1112 To investigate the transferability of contextual word representations, we quantify differences in the transferability of individual layers within contextualizers, especially between ***** recurrent neural networks ***** (RNNs) and transformers. | ||
| C16-1124 We present a model of visually-grounded language learning based on stacked gated ***** recurrent neural networks ***** which learns to predict visual features given an image description in the form of a sequence of phonemes. | ||
| K18-2012 We introduce tree-stack LSTM to model state of a transition based parser with ***** recurrent neural networks *****. | ||
| W19-4813 Recently, several methods have been proposed to explain the predictions of ***** recurrent neural networks ***** (RNNs), in particular of LSTMs. | ||
| machine comprehension | 210 | |
| D19-6009 We show the power of leveraging state-of-the-art pre-trained language models such as BERT(Bidirectional Encoder Representations from Transformers) and XLNet over other Commonsense Knowledge Base Resources such as ConceptNet and NELL for modeling ***** machine comprehension *****. | ||
| 2020.coling-industry.21 While neural approaches have achieved significant improvement in ***** machine comprehension ***** tasks, models often work as a black-box, resulting in lower interpretability, which requires special attention in domains such as healthcare or education. | ||
| N19-1403 In this paper, we propose a method to leverage the natural language relations between the answer choices, such as entailment and contradiction, to improve the performance of ***** machine comprehension *****. | ||
| W18-2601 To answer the question in ***** machine comprehension ***** (MC) task, the models need to establish the interaction between the question and the context. | ||
| C16-1167 Several institutes have released the Cloze-style reading comprehension data, and these have greatly accelerated the research of ***** machine comprehension *****. | ||
| Extracting | 209 | |
| 2021.ranlp-1.184 ***** Extracting ***** the most important part of legislation documents has great business value because the texts are usually very long and hard to understand. | ||
| D18-1243 ***** Extracting ***** relations is critical for knowledge base completion and construction in which distant supervised methods are widely used to extract relational facts automatically with the existing knowledge bases. | ||
| 2021.emnlp-main.429 ***** Extracting ***** relations across large text spans has been relatively underexplored in NLP, but it is particularly important for high-value domains such as biomedicine, where obtaining high recall of the latest findings is crucial for practical applications. | ||
| P19-3006 ***** Extracting ***** events in the form of who is involved in what at when and where from text is one of the core information extraction tasks that has many applications such as web search and question answering. | ||
| S17-2174 This paper describes the system used by the team LIPN in SemEval 2017 Task 10: ***** Extracting ***** Keyphrases and Relations from Scientific Publications. | ||
| query | 209 | |
| N18-1186 The major idea is to represent a conversation session into memories upon which attention-based memory reading mechanism can be performed multiple times, so that (1) user's ***** query ***** is properly extended by contextual clues and (2) optimal responses are step-by-step generated. | ||
| N19-1229 Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic “soft” matches between ***** query ***** and post tokens. | ||
| 2020.emnlp-main.29 Our method distills a small test suite of databases that achieves high code coverage for the gold ***** query ***** from a large number of randomly generated databases. | ||
| D18-1293 The incremental computation is crucial when a new ***** query ***** is built from a previous ***** query *****. | ||
| 2021.emnlp-main.661 We also demonstrate that our attack achieves a higher success rate when compared to prior attacks in a limited ***** query ***** setting | ||
| parameter | 208 | |
| P17-1059 Our proposed model, Affect-LM enables us to customize the degree of emotional content in generated sentences through an additional design ***** parameter *****. | ||
| C18-1252 Our main contributions are extending J-K-fold CV from performance estimation to ***** parameter ***** tuning and investigating how to choose J and K. | ||
| L12-1557 By tweaking a ***** parameter ***** in the algorithm, resulting patterns can be diversifiable with a specific degree one can control. | ||
| N19-1191 The Semi-parametric nature of our approach also opens the door for non-parametric domain adaptation, demonstrating strong inference-time adaptation performance on new domains without the need for any ***** parameter ***** updates. | ||
| S18-1023 This approach allows different model architectures and ***** parameter ***** settings for each affect category instead of building one single multi-label classifier | ||
| Multimodal | 207 | |
| N18-1199 ***** Multimodal ***** machine learning algorithms aim to learn visual-textual correspondences. | ||
| 2021.emnlp-main.673 ***** Multimodal ***** machine translation (MMT) systems have been shown to outperform their text-only neural machine translation (NMT) counterparts when visual context is available. | ||
| 2021.acl-long.203 ***** Multimodal ***** fusion has been proved to improve emotion recognition performance in previous works. | ||
| 2021.naacl-main.418 ***** Multimodal ***** research has picked up significantly in the space of question answering, with the task being extended to visual question answering, charts question answering as well as multimodal input question answering. | ||
| 2021.maiworkshop-1.1 The ***** Multimodal ***** Transformer showed to be a competitive model for multimodal tasks involving textual, visual and audio signals. | ||
| recurrent | 207 | |
| C18-1139 Recent advances in language modeling using ***** recurrent ***** neural networks have made it viable to model language as distributions over characters. | ||
| P19-1056 Specifically, DOER involves a dual ***** recurrent ***** neural network to extract the respective representation of each task, and a cross-shared unit to consider the relationship between them. | ||
| R19-1069 Our models for predicting text functions are based on ***** recurrent ***** neural networks and traditional feature-based machine learning approaches. | ||
| N18-2042 In this paper, we model the flow of emotions over a book using ***** recurrent ***** neural networks and quantify its usefulness in predicting success in books. | ||
| 2020.coling-main.425 While previous methods have relied on traditional machine learning or vanilla ***** recurrent ***** neural networks, we rigorously investigate the use of transformers for clickbait strength prediction | ||
| morphologically | 206 | |
| 2021.eacl-main.158 The present article discusses how to improve translation quality when using limited training data to translate towards ***** morphologically ***** rich languages. | ||
| L16-1087 Since we did not employ any language dependent features, we believe that our method can be easily adapted to microblog texts in other ***** morphologically ***** rich languages. | ||
| L06-1442 A tokeniser is required to isolate categories, like a verb, from raw text before they can be correctly ***** morphologically ***** analysed. | ||
| 2021.conll-1.45 Data collection is challenging for Indian languages, because they are syntactically and ***** morphologically ***** diverse, as well as different from resource-rich languages like English. | ||
| 2020.winlp-1.10 Both languages are members of the Omotic family, spoken in southwestern Ethiopia, and, like other Omotic languages, both are ***** morphologically ***** complex | ||
| sparsity | 206 | |
| P18-1217 Topic models with ***** sparsity ***** enhancement have been proven to be effective at learning discriminative and coherent latent topics of short texts, which is critical to many scientific and engineering applications. | ||
| D18-1351 Many classification models work poorly on short texts due to data ***** sparsity *****. | ||
| 2020.lrec-1.610 In fact, Arabic language is characterized by its agglutination and morphological richness contributing to great ***** sparsity ***** that could affect embedding quality. | ||
| E17-2002 The goal of URIEL and lang2vec is to enable multilingual NLP, especially on less-resourced languages and make possible types of experiments (especially but not exclusively related to NLP tasks) that are otherwise difficult or impossible due to the ***** sparsity ***** and incommensurability of the data sources. | ||
| W19-4427 Considerable effort has been made to address the data ***** sparsity ***** problem in neural grammatical error correction | ||
| shared | 205 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 ***** shared ***** task on multilingual named entity recognition. | ||
| 2020.fnp-1.1 FNS summarisation ***** shared ***** task is the first to target financial annual reports. | ||
| 2018.iwslt-1.8 We propose a method to transfer knowledge across neural machine translation (NMT) models by means of a ***** shared ***** dynamic vocabulary. | ||
| 2020.wnut-1.39 This paper presents our teamwork on WNUT 2020 ***** shared ***** task-1: wet lab entity extract, that we conducted studies in several models, including a BiLSTM CRF model and a Bert case model which can be used to complete wet lab entity extraction. | ||
| 2020.wmt-1.15 This paper describes Tilde's submission to the WMT2020 ***** shared ***** task on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. | ||
| attention head | 205 | |
| 2020.acl-main.429 We apply this methodology to test BERT and RoBERTa on a hypothesis that some ***** attention head *****s will consistently attend from a word in negation scope to the negation cue. | ||
| 2021.eacl-main.264 We show that full trees can be decoded above baseline accuracy from single ***** attention head *****s, and that individual relations are often tracked by the same heads across languages. | ||
| 2020.emnlp-main.259 Large Transformer-based models were shown to be reducible to a smaller number of self-***** attention head *****s and layers. | ||
| 2021.acl-long.538 Moreover, in-depth analysis on the generated summaries and ***** attention head *****s verifies that interactions are learned well using MCLAS, which benefits the CLS task under limited parallel resources. | ||
| 2021.ranlp-1.52 Multiple parallel attention mechanisms that use multiple ***** attention head *****s facilitate greater performance of the Transformer model for various applications e.g., Neural Machine Translation (NMT), text classification. | ||
| schema | 203 | |
| 2020.emnlp-main.564 This suggests ***** schema ***** linking is the crux for the current text-to-SQL task. | ||
| 2021.reinact-1.4 We annotate a corpus of analogical episodes with the ***** schema ***** and develop statistical sequence models from the corpus which predict tutor content related decisions, in terms of the selection of the analogical component (AC) and tutor conversational management act (TCMA) to deploy at the current utterance, given the student's behaviour. | ||
| P19-1448 In this paper, we present an encoder-decoder semantic parser, where the structure of the DB ***** schema ***** is encoded with a graph neural network, and this representation is later used at both encoding and decoding time. | ||
| 2020.conll-1.45 For memorization, we identify ***** schema ***** conformity (facts systematically supported by other facts) and frequency as key factors for its success. | ||
| L10-1483 Thanks to the inclusion of semantico-syntactic tags into the ***** schema *****, we can annotate a corpus not only with syntactic dependency structures, but also with valency patterns as they are usually found in separate treebanks such as PropBank and NomBank | ||
| knowledge base | 202 | |
| N19-2018 A capable, automatic Question Answering (QA) system can provide more complete and accurate answers using a comprehensive ***** knowledge base ***** (KB). | ||
| 2021.emnlp-main.292 We avoid crucial assumptions of previous work that do not transfer well to real-world settings, including exploiting knowledge of the fixed number of retrieval steps required to answer each question or using structured metadata like ***** knowledge base *****s or web links that have limited availability. | ||
| 2005.mtsummit-swtmt.1 The bottleneck has been the engineering of sufficiently comprehensive bodies of relevant knowledge. The Semantic Web offers opportunities for the gradual evolution of a global heterogeneous ***** knowledge base *****. | ||
| 2020.lt4hala-1.6 Lastly, the paper envisages the advantages of an inclusion of LatInfLexi into the LiLa ***** knowledge base *****, both for the presented resource and for the ***** knowledge base ***** itself. | ||
| 2018.jeptalnrecital-court.37 Entity linking systems typically rely on encyclopedic ***** knowledge base *****s such as DBpedia or Freebase. | ||
| tagger | 201 | |
| L06-1345 Finally, we employ an automatic ***** tagger ***** developed for standard Norwegian, the Oslo-Bergen Tagger, together with a facility for manual tag correction. | ||
| L14-1449 We extend the work of Jawaid and Bojar (2012) who use three different ***** tagger *****s and then apply a voting scheme to disambiguate among the different choices suggested by each ***** tagger *****. | ||
| L06-1103 The ***** tagger ***** is evaluated using a manually disambiguated test corpus and it currently achieves 95% accuracy on unrestricted text. | ||
| 2020.lrec-1.239 Our experimental results demonstrate that the sequence ***** tagger ***** with the optimal setting can detect the entities with a macro-averaged F1 score of 0.826, while the rule-based relation extractor can achieve high performance with a macro-averaged F1 score of 0.887. | ||
| 2020.lrec-1.487 The training of new ***** tagger ***** models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of a gender | ||
| training data | 201 | |
| 2021.emnlp-main.702 Despite achieving good performance on some public benchmarks, we observe that existing text-to-SQL models do not generalize when facing domain knowledge that does not frequently appear in the ***** training data *****, which may render the worse prediction performance for unseen domains. | ||
| 2021.sigdial-1.9 For this reason, we fully annotated only the test data and left the annotation of the ***** training data ***** incomplete. | ||
| N18-3019 Extensive experimentation over a dataset of 10 domains drawn from data relevant to our commercial personal digital assistant shows that our BoE models outperform the baseline models with a statistically significant average margin of 5.06% in absolute F1-score when training with 2000 instances per domain, and achieve an even higher improvement of 12.16% when only 25% of the ***** training data ***** is used. | ||
| W16-4502 Experiments were conducted by comparing perplexity and BLEU scores on common test cases using the same ***** training data ***** set. | ||
| 2020.emnlp-main.75 We also demonstrate the efficacy of the proposed approach in the zero-shot setup for language pairs without bitext ***** training data *****. | ||
| NLI | 200 | |
| 2021.blackboxnlp-1.31 We present three Natural Language Inference (***** NLI *****) challenge sets that can evaluate ***** NLI ***** models on their understanding of temporal expressions. | ||
| 2021.bionlp-1.5 To bridge this gap, we explore whether supplementing textual domain knowledge in the medical ***** NLI ***** task: a) by further language model pretraining on the medical domain corpora, b) by means of lexical match algorithms such as the BM25 algorithm, c) by supplementing lexical retrieval with dependency relations, or d) by using a trained retriever module, can push this performance closer to that of humans. | ||
| D19-1631 Specifically, we experiment with fusing embeddings obtained from knowledge graph with the state-of-the-art approaches for ***** NLI ***** task (ESIM model). | ||
| W17-5045 This paper presents an ensemble system combining the output of multiple SVM classifiers to native language identification (***** NLI *****). | ||
| 2020.nli-1.2 We propose and evaluate two reversal-based methods on an ***** NLI ***** task of recognising a type of a simple logical expression from its description in plain-text form | ||
| Bilingual | 200 | |
| D18-1062 ***** Bilingual ***** lexicon extraction has been studied for decades and most previous methods have relied on parallel corpora or bilingual dictionaries. | ||
| W19-4316 ***** Bilingual ***** word embeddings, which represent lexicons of different languages in a shared embedding space, are essential for supporting semantic and knowledge transfers in a variety of cross-lingual NLP tasks. | ||
| L14-1120 *****Bilingual***** dictionaries define word equivalents from one language to another, thus acting as an important bridge between languages. | ||
| L16-1353 *****Bilingual***** lexica are the basis for many cross-lingual natural language processing tasks. | ||
| L14-1080 *****Bilingual***** dictionaries are the key component of the cross-lingual similarity estimation methods. | ||
| language resources | 200 | |
| L16-1533 This paper describes the named entity ***** language resources ***** developed as part of a development project for the South African languages. | ||
| L10-1187 To address this issue a broad alliance of LRT providers (CLARIN, the Linguist List, DOBES, DELAMAN, DFKI, ELRA) have initiated the Virtual Language Observatory portal to provide a low-barrier, easy-to-follow entry point to ***** language resources ***** and tools; it can be accessed via http://www.clarin.eu/vlo | ||
| L06-1163 Tagging as the most crucial annotation of ***** language resources ***** can still be challenging when the corpus size is big and when the corpus data is not homogeneous. | ||
| L16-1707 In addition to the portal, we describe long-term goals and prospects with a special focus on ongoing efforts regarding an extension towards integrating ***** language resources ***** and Linguistic Linked Open Data. | ||
| L14-1154 Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the ***** language resources ***** that exist are a small fraction of those required to meet the goals of Human Language Technologies (HLT) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community. | ||
| systems | 200 | |
| L10-1362 Question answering (QA) ***** systems ***** aim at retrieving precise information from a large collection of documents. | ||
| L06-1015 That is, the retrieved documents from both ***** systems ***** are shown to the judges without any information about the search techniques. | ||
| D19-1566 Experimental results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various existing state-of-the-art ***** systems *****. | ||
| 2021.acl-long.96 We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring ***** systems *****. | ||
| D19-1651 Recently, an automated evaluator with a benchmark dataset (OIE2016) was released; it scores Open IE *****systems***** automatically by matching system predictions with predictions in the benchmark dataset. | ||
| transliteration | 199 | |
| L14-1016 Our experiments focus on Katakana-English lexicon construction, however it would be possible to apply the proposed methods to ***** transliteration ***** extraction for a variety of language pairs. | ||
| L14-1290 We have analysed every text segment present in this corpus and discovered other conventions of writing at the level of ***** transliteration *****, academic norms and editorial interventions. | ||
| D18-1046 We evaluate ***** transliteration ***** generation performance itself, as well the improvement it brings to cross-lingual candidate generation for entity linking, a typical downstream task. | ||
| W18-2409 Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name ***** transliteration *****, including 13 different languages and two different Japanese scripts. | ||
| L12-1499 The Indonesian ***** transliteration ***** generated was used as a means to support the learners where their speech were then recorded | ||
| masked language model | 199 | |
| 2020.emnlp-main.497 In this paper, we show that careful masking strategies can bridge the knowledge gap of ***** masked language model *****s (MLMs) about the domains more effectively by allocating self-supervision where it is needed. | ||
| 2020.blackboxnlp-1.13 We explore the imprint of two specific linguistic alternations, namely passivization and negation, on the representations generated by neural models trained with two different objectives: ***** masked language model *****ing and translation. | ||
| 2020.emnlp-main.498 We present BAE, a black box attack for generating adversarial examples using contextual perturbations from a BERT ***** masked language model *****. | ||
| 2021.semeval-1.16 In our experiments, we used a neural system based on the XLM-R, a pre-trained transformer-based ***** masked language model *****, as a baseline. | ||
| 2020.emnlp-main.699 To tackle cases when no parallel source–target pairs are available, we train ***** masked language model *****s (MLMs) for both the source and the target domain. | ||
| 2018 | 198 | |
| 2020.gebnlp-1.1 We mitigate bias by fine-tuning BERT on the GAP corpus (Webster et al., ***** 2018 *****), after applying Counterfactual Data Substitution (CDS) (Maudslay et al., 2019). | ||
| 2020.lrec-1.326 To that end, Chen & Schwartz (***** 2018 *****) implemented a finite-state morphological analyzer as a critical enabling technology for use in Yupik language education and technology. | ||
| W18-0521 Shared Task ***** 2018 *****. | ||
| D19-1615 When combined with our two-stage fine-tuning pipeline, our method achieves improved common sense reasoning and state-of-the-art perplexity on the WritingPrompts (Fan et al., ***** 2018 *****) story generation dataset | ||
| W18-5810 Sequence-to-sequence neural networks have been shown to perform well at a number of other morphological tasks (Cotterell et al., 2016), and produce results that highly correlate with human behavior (Kirov, 2017; Kirov & Cotterell, *****2018*****) but do not include any explicit variables in their architecture. | ||
| reordering | 198 | |
| L06-1033 This process implicitly leads to word sense disambiguation and to language specific ***** reordering ***** of words. | ||
| 2012.iwslt-papers.16 For the given source sentence, we assign each source token a label which contains the ***** reordering ***** information for that token. | ||
| 2020.acl-main.270 However, the sandwich ***** reordering ***** pattern does not guarantee performance gains across every task, as we demonstrate on machine translation models. | ||
| 2007.iwslt-1.3 Inspired by previous chunk-level ***** reordering ***** approaches to statistical machine translation, this paper presents two methods to improve the ***** reordering ***** at the chunk level | ||
| 2010.amta-papers.25 However, this basic method for combining phrases is not sufficient for phrase ***** reordering *****. | ||
| Supervised | 197 | |
| 2020.wmt-1.80 As we were defining the task, we also obtained a small amount of parallel data (about 60000 parallel sentences), allowing us to offer a Very Low Resource ***** Supervised ***** MT task as well | ||
| 2016.gwc-1.8 *****Supervised***** methods for Word Sense Disambiguation (WSD) benefit from high-quality sense-annotated resources, which are lacking for many languages less common than English. | ||
| 2021.gwc-1.17 *****Supervised***** approaches usually achieve the best performance in the Word Sense Disambiguation problem. | ||
| 2020.wnut-1.34 *****Supervised***** models trained to predict properties from representations have been achieving high accuracy on a variety of tasks. For instance, the BERT family seems to work exceptionally well on the downstream task from NER tagging to the range of other linguistic tasks. | ||
| 2021.case-1.24 *****Supervised***** models can achieve very high accuracy for fine-grained text classification. | ||
| document | 197 | |
| N18-1141 We conduct experiments on both ***** document ***** based and knowledge based question answering tasks. | ||
| 2020.lrec-1.181 We also experiment using linear-chain Conditional Random Fields to leverage the sequential nature of the lawsuits, which we find to lead to improvements on ***** document ***** type classification. | ||
| L14-1274 Classic XML inline annotation often fails for both ***** document ***** classes because of overlapping markup. | ||
| 2020.emnlp-main.191 In this work, we propose “Discern”, a discourse-aware entailment reasoning network to strengthen the connection and enhance the understanding of both ***** document ***** and dialog. | ||
| N19-1348 We apply our approach on two ***** document ***** collections: Wikipedia and Sports articles, yielding 60 million fusion examples annotated with discourse information required to reconstruct the fused text | ||
| word translation | 197 | |
| 2006.amta-papers.9 Lexical mappings (***** word translation *****s) between languages are an invaluable resource for multilingual processing. | ||
| R19-1140 We applied our model for the Turkish-Finnish language pair on the bilingual ***** word translation ***** task. | ||
| 2020.conll-1.19 Further analysis of misleading translations revealed that the most frequent error types are ambiguity, mistranslation, noun phrase error, word-by-***** word translation *****, untranslated word, subject-verb agreement, and spelling error in the source text. | ||
| Q19-1007 We illustrate the effectiveness of joint learning for multiple languages in an indirect ***** word translation ***** setting. | ||
| C16-1300 In this way, we take advantage of its capability to handle multiple alternative ***** word translation *****s in a natural form of regularization. | ||
| recurrent neural network | 196 | |
| Q16-1036 The first model uses an end-to-end ***** recurrent neural network *****. | ||
| Q14-1017 DT-RNNs outperform other recursive and ***** recurrent neural network *****s, kernelized CCA and a bag-of-words baseline on the tasks of finding an image that fits a sentence description and vice versa. | ||
| K19-1062 The model consists of 1) a ***** recurrent neural network ***** (RNN) to learn scoring functions for pair-wise relations, and 2) a structured support vector machine (SSVM) to make joint predictions. | ||
| N19-1112 To investigate the transferability of contextual word representations, we quantify differences in the transferability of individual layers within contextualizers, especially between ***** recurrent neural network *****s (RNNs) and transformers. | ||
| S17-2141 Since two submissions were allowed, two different machine learning methods were developed to solve this task, a support vector machine approach and a ***** recurrent neural network ***** approach. | ||
| authorship attribution | 195 | |
| W17-4914 Using established models for ***** authorship attribution *****, we empirically assess the stylistic qualities of neurally generated text. | ||
| 2021.eval4nlp-1.9 Most articles use either classification accuracy or ***** authorship attribution *****, which does not clearly measure the quality of the representation space, if it really captures what it has been built for. | ||
| 2020.alta-1.4 Distance measures (e.g. Burrows's Delta, Cosine distance) are a standard tool in ***** authorship attribution ***** studies. | ||
| E17-2106 We present a model to perform ***** authorship attribution ***** of tweets using Convolutional Neural Networks (CNNs) over character n-grams. | ||
| C18-1029 We apply the conclusions from our analysis to an extension of an existing approach to ***** authorship attribution ***** and outperform the prior state-of-the-art on two out of the four datasets used | ||
| captioning | 194 | |
| C18-1295 These extracted VGPs have the potential to improve language and image multimodal tasks such as visual question answering and image ***** captioning *****. | ||
| 2020.emnlp-main.707 Recent work has also successfully adapted such models towards the generative task of image ***** captioning *****. | ||
| P18-1195 Our experiments on two different tasks, image ***** captioning ***** and machine translation, show that token-level and sequence-level loss smoothing are complementary, and significantly improve results. | ||
| E17-1019 In this paper, we provide an in-depth evaluation of the existing image ***** captioning ***** metrics through a series of carefully designed experiments. | ||
| 2020.acl-main.233 Generating multi-sentence descriptions for videos is one of the most challenging ***** captioning ***** tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph | ||
| Adversarial | 194 | |
| N19-5001 *****Adversarial***** learning is a game-theoretic learning paradigm, which has achieved huge successes in the field of Computer Vision recently. | ||
| 2021.emnlp-main.527 *****Adversarial***** regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks. | ||
| 2020.emnlp-main.495 *****Adversarial***** attacks against natural language processing systems, which perform seemingly innocuous modifications to inputs, can induce arbitrary mistakes to the target models. | ||
| 2020.emnlp-main.256 *****Adversarial***** attacks reveal important vulnerabilities and flaws of trained models. | ||
| 2020.aacl-main.79 *****Adversarial***** attacks are label-preserving modifications to inputs of machine learning classifiers designed to fool machines but not humans. | ||
| graph neural network | 194 | |
| 2021.emnlp-main.11 Recent studies have leveraged ***** graph neural network *****s to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| 2020.coling-main.260 Integrating the proposed method with two ***** graph neural network *****-based semantic parsers together with BERT representations demonstrates substantial gains in parsing accuracy on the challenging Spider dataset. | ||
| 2020.acl-main.280 To distinguish confusing charges, we propose a novel ***** graph neural network *****, GDL, to automatically learn subtle differences between confusing law articles, and also design a novel attention mechanism that fully exploits the learned differences to attentively extract effective discriminative features from fact descriptions. | ||
| 2021.emnlp-main.278 We then proposed a novel graph-aware definition generation model Graphex that integrates transformer with ***** graph neural network *****. | ||
| C18-1107 The proposed structured deep reinforcement learning is based on ***** graph neural network *****s (GNN), which consists of some sub-networks, each one for a node on a directed graph. | ||
| manually | 192 | |
| L12-1537 We also empirically show that the proposed method can correctly identify more than 80% of the functional / content usages only with less than 38,000 training instances of ***** manually ***** identified canonical expressions. | ||
| W18-2707 Our experimental results show that the translation quality is improved by increasing the number of synthetic source sentences for each given target sentence, and quality close to that using a ***** manually ***** created parallel corpus was achieved. | ||
| 2020.emnlp-main.346 We also show that our prompts elicit more accurate factual knowledge from MLMs than the ***** manually ***** created prompts on the LAMA benchmark, and that MLMs can be used as relation extractors more effectively than supervised relation extraction models. | ||
| P18-1252 First, we ***** manually ***** construct a bi-tree aligned dataset containing over ten thousand sentences. | ||
| 2021.acl-long.366 We evaluate the newly created arguments ***** manually ***** and automatically, based on several dimensions important in argumentative contexts, including argumentativeness and plausibility | ||
| annotate | 192 | |
| 2021.naacl-main.254 Inspired by the literature in human-human negotiations, we ***** annotate ***** persuasion strategies and perform correlation analysis to understand how the dialogue behaviors are associated with the negotiation performance. | ||
| 2021.wassa-1.18 We formulate an annotation task, in which we join the tasks of hateful/offensive speech detection and stance detection, and ***** annotate ***** 3000 Tweets from the campaign period, if they express a particular stance towards a candidate. | ||
| L16-1184 We manually ***** annotate ***** a collection of opposing polarity phrases and their constituent single words with real-valued sentiment intensity scores using a method known as Best―Worst Scaling. | ||
| 2021.bea-1.1 In particular, manual annotation, empirical evaluation and error analysis indicate two non-obvious facts: 1) L2-Chinese, L1-Japanese data are more difficult to analyze and thus ***** annotate ***** than L2-Chinese, L1-English data; 2) computational models trained on L2-Chinese, L1-Japanese data perform better than models trained on L2-Chinese, L1-English data | ||
| L14-1337 In the present contribution, we propose to look at the so-called interoperability from (at least) three angles, namely (i) as a relation (and possible interaction or cooperation) of different annotation schemes for different layers or phenomena of a single language, (ii) the possibility to *****annotate***** different languages by a single (modified or not) annotation scheme, and (iii) the relation between different annotation schemes for a single language, or for a single phenomenon or layer of the same language. | ||
| binary | 192 | |
| W18-6483 Our supervised approach discerns between good and bad translations by training classic ***** binary ***** classification models over an artificially produced ***** binary ***** classification dataset derived from a high-quality translation set, and a minimalistic set of 6 semantic distance features that rely only on easy-to-gather resources. | ||
| W16-4112 Often, relevant corpora consist only of easy-to-read texts with no rank information or empirical readability scores, making only ***** binary ***** approaches, such as classification, applicable. | ||
| 2020.findings-emnlp.233 Semantic hashing is a powerful paradigm for representing texts as compact ***** binary ***** hash codes. | ||
| 2021.starsem-1.24 We find that for a broad ***** binary ***** distinction into `easy' vs. `difficult' general-language compound frequency is sufficient, but for a more fine-grained four-class distinction it is crucial to include contrastive termhood features and compound and constituent features. | ||
| W18-3006 The proposed models achieve better performances than state-of-the-art non-quantum models on ***** binary ***** sentence classification tasks | ||
| parallel corpora | 192 | |
| L16-1098 Lexical resources do help uplift performance when ***** parallel corpora ***** are scanty. | ||
| I17-1049 In this paper, we address this problem by procuring additional training data from ***** parallel corpora *****: When humans translate a text, they sometimes add connectives (a process known as explicitation). | ||
| L16-1483 English is common in all ***** parallel corpora *****, with translations in five languages, namely, Basque, Bulgarian, Czech, Portuguese and Spanish. | ||
| W16-4508 We defend that bilingual lexicons automatically extracted from ***** parallel corpora *****, whose entries have been meanwhile validated by linguists and classified as correct or incorrect, should constitute a specific ***** parallel corpora *****. | ||
| J17-2003 On word pairs extracted from ***** parallel corpora ***** with fewer than 2% transliteration pairs, our system achieves up to 86.7% F-measure with 77.9% precision and 97.8% recall | ||
| noun | 191 | |
| 2020.emnlp-main.529 We also find that a combination of ***** noun ***** and verb types of keywords is the most effective for content selection. | ||
| P18-2011 In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper ***** noun *****s, pro***** noun *****s or ***** noun ***** phrases with common ***** noun ***** headword. | ||
| L16-1732 Building a knowledge graph for representing common-sense knowledge in which concepts discerned from ***** noun ***** phrases are cast as vertices and lexicalized relations are cast as edges leads to learning the embeddings of common-sense knowledge accounting for semantic compositionality as well as implied knowledge. | ||
| W18-0707 To accomplish this, the system uses a series of lexico-syntactic patterns in order to extract shell ***** noun ***** candidates and their content in parallel. | ||
| 2020.lrec-1.631 The parser covers core constructions of Wolof, including ***** noun ***** classes, cleft, copula, causative and applicative sentences | ||
| perplexity | 190 | |
| I17-1045 We show that an “attentive” RNN-LM (with 11M parameters) achieves a better ***** perplexity ***** than larger RNN-LMs (with 66M parameters) and achieves performance comparable to an ensemble of 10 similar sized RNN-LMs. | ||
| 2020.cogalex-1.11 In this paper, we revisit this hypothesis by using OpenAI's GPT-2 to calculate predictability of words as language model ***** perplexity *****. | ||
| 2020.findings-emnlp.15 We explored 5 metrics to gauge both naturalness and faithfulness automatically, and we chose to use BLEU plus METEOR for faithfulness and relative ***** perplexity ***** using a separately trained language model (GPT) for naturalness. | ||
| 2020.nuse-1.14 By augmenting GPT 2.0 with information retrieval we achieve a zero shot 15% relative reduction in ***** perplexity ***** on Gigaword corpus without any re-training. | ||
| D18-1413 Our model effectively encodes and generates scripts, outperforming a recent language modeling-based method on several standard tasks, and allowing the autoencoder model to achieve substantially lower ***** perplexity ***** scores compared to the previous language modeling-based method | ||
| translations | 190 | |
| L08-1579 In this paper we discuss a solution to derive a bilingual dictionary by transitivity using existing ones and to check the generated ***** translations ***** in a parallel corpus. | ||
| 2020.findings-emnlp.244 With low-resource languages in mind, we also show that ***** translations ***** can be effectively used in place of transcriptions but more data is needed to obtain similar results. | ||
| 2019.iwslt-1.31 In particular, if ***** translations ***** have to fit some given layout, quality should not only be measured in terms of adequacy and fluency, but also length. | ||
| 2020.emnlp-main.480 Cross-lingual document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or ***** translations ***** of each other. | ||
| L12-1481 The European Commission's (EC) Directorate General for Translation, together with the EC's Joint Research Centre, is making available a large translation memory (TM; i.e. sentences and their professionally produced ***** translations *****) covering twenty-two official European Union (EU) languages and their 231 language pairs | ||
| detection | 190 | |
| 2021.sigdial-1.12 We also conduct an analysis of the novelty of the generated data and provide generated examples for intent ***** detection *****, slot tagging, and non-goal oriented conversations. | ||
| 2021.eacl-main.257 Available datasets suffer from several shortcomings: a) they contain few languages b) they contain small amounts of labeled examples per language c) they are based on the simple intent and slot ***** detection ***** paradigm for non-compositional queries. | ||
| 2021.blackboxnlp-1.18 Results suggest that all models encode some information supporting anomaly ***** detection *****, but ***** detection ***** performance varies between anomalies, and only representations from more recent transformer models show signs of generalized knowledge of anomalies. | ||
| L10-1328 The database design was conceived with extended support of research and development activities devoted to ***** detection ***** of typical and atypical events, emergency and crisis situations, which assist for achieving situational awareness and more reliable interpretation of the context in which humans behave. | ||
| D19-6610 In this work, we explore the effectiveness of automatically extracted warrants for evidence ***** detection ***** | ||
| transcriptions | 189 | |
| L14-1443 Large part of the data, both audio and ***** transcriptions *****, was collected using crowdsourcing, the rest are ***** transcriptions ***** by hired transcribers. | ||
| U18-1007 In spite of the recent success of Dialogue Act (DA) classification, the majority of prior works focus on text-based classification with oracle ***** transcriptions *****, i.e. human ***** transcriptions *****, instead of Automatic Speech Recognition (ASR)'s ***** transcriptions *****. | ||
| W17-5010 We present crowdsourced collection of error annotations for ***** transcriptions ***** of spoken learner English. | ||
| L10-1037 While progress in the search for the right annotation model and format is undeniable, these results only sparsely become manifest in actual solutions (i.e. software tools) that could be used by researchers wishing to annotate their resources right away, even less so for resources of spoken language ***** transcriptions *****. | ||
| L16-1114 We describe our ongoing work towards the fully automatic transcription of all ILSE interviews and the steps we implemented in preparing the ***** transcriptions ***** to meet the interviews' challenges | ||
| phonetic | 189 | |
| P18-4003 Romanization enables the application of string-similarity metrics to texts from different scripts without the need and complexity of an intermediate ***** phonetic ***** representation. | ||
| L14-1236 The first one (ORTOFON) continues the tradition of the CNC's ORAL series of spoken corpora by focusing on collecting recordings of unscripted informal spoken interactions (“prototypically spoken texts”), but also provides new features, most notably an annotation scheme with multiple tiers per speaker, including orthographic and ***** phonetic ***** transcripts and allowing for a more precise treatment of overlapping speech. | ||
| 2020.acl-main.696 Previous work has attempted to predict schwa deletion in a rule-based fashion using prosodic or ***** phonetic ***** analysis. | ||
| L16-1212 The IFCASL corpus is a French-German bilingual ***** phonetic ***** learner corpus designed, recorded and annotated in a project on individualized feedback in computer-assisted spoken language learning | ||
| 2020.findings-emnlp.106 Unsupervised speech representation learning has shown remarkable success at finding representations that correlate with ***** phonetic ***** structures and improve downstream speech recognition performance. | ||
| machine reading comprehension | 189 | |
| 2020.coling-main.235 The novel framework shows an interesting perspective on ***** machine reading comprehension ***** and cognitive science. | ||
| 2020.emnlp-main.549 Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging ***** machine reading comprehension ***** task, since it requires both natural language understanding and arithmetic computation. | ||
| 2020.coling-main.219 Question answering over dialogue, a specialized ***** machine reading comprehension ***** task, aims to comprehend a dialogue and to answer specific questions. | ||
| 2020.coling-main.248 Neural models have achieved great success on the task of ***** machine reading comprehension ***** (MRC), which are typically trained on hard labels. | ||
| 2020.findings-emnlp.226 Answer validation in ***** machine reading comprehension ***** (MRC) consists of verifying an extracted answer against an input context and question pair. | ||
| language identification | 187 | |
| W19-0506 At the same time, the studies reveal empirical evidence why contextual abstractness represents a valuable indicator for automatic non-literal ***** language identification *****. | ||
| 2020.coling-main.579 Large text corpora are increasingly important for a wide variety of Natural Language Processing (NLP) tasks, and automatic ***** language identification ***** (LangID) is a core technology needed to collect such datasets in a multilingual context. | ||
| L14-1745 General purpose ***** language identification ***** tools do not take language varieties into account and our work aims to fill this gap. | ||
| N19-1201 In this paper, we extend the ***** language identification ***** task to the subword-level, such that it includes splitting mixed words while tagging each part with a language ID. | ||
| W18-6116 An accurate ***** language identification ***** tool is an absolute necessity for building complex NLP systems to be used on code-mixed data. | ||
| variational autoencoder | 186 | |
| 2020.findings-emnlp.233 Our model is built upon ***** variational autoencoder ***** and represents each hash bit as a Bernoulli variable, which allows the model to be end-to-end trainable. | ||
| P19-1590 We pretrain a unigram document model as a ***** variational autoencoder ***** on in-domain, unlabeled data and use its internal states as features in a downstream classifier. | ||
| D18-1423 Towards filling the gap, in this paper, we propose a conditional ***** variational autoencoder ***** with adversarial training for classical Chinese poem generation, where the autoencoder part generates poems with novel terms and a discriminator is applied to adversarially learn their thematic consistency with their titles. | ||
| D18-1432 We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response generator using a conditional ***** variational autoencoder ***** model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity. | ||
| 2020.acl-main.461 We capture this intuition by defining a hierarchical ***** variational autoencoder ***** model | ||
| keyphrase extraction | 186 | |
| 2021.newsum-1.11 Using ***** keyphrase extraction ***** and semantic role labeling (SRL), we find that SRL captures relevant information without overwhelming the model architecture. | ||
| 2020.aacl-demo.2 ii) We build powerful ***** keyphrase extraction ***** models that achieve state-of-the-art results on two public benchmarks. | ||
| 2021.emnlp-main.14 Embedding based methods are widely used for unsupervised ***** keyphrase extraction ***** (UKE) tasks. | ||
| N18-1151 Existing ***** keyphrase extraction ***** methods suffer from data sparsity problem when they are conducted on short and informal texts, especially microblog messages. | ||
| R17-1012 This paper evaluates different techniques for building a supervised, multilanguage ***** keyphrase extraction ***** pipeline for languages which lack a gold standard. | ||
| dictionary | 185 | |
| L12-1380 In an online survey, we present a sense of a target word from one ***** dictionary ***** with senses from the other ***** dictionary *****, asking for judgments of relatedness. | ||
| 2021.emnlp-main.441 Specifically, our method (1) selects salient predicates and object heads, (2) disambiguates predicate senses using only a verb sense ***** dictionary *****, and (3) obtains event types by jointly embedding and clustering predicate sense, object head pairs in a latent spherical space. | ||
| 2003.mtsummit-semit.6 This paper describes the methodology and implementation adopted for ***** dictionary ***** building and morphological analysis. | ||
| 2020.acl-main.675 InstaMap is a non-parametric model that learns a non-linear projection by iteratively: (1) finding a globally optimal rotation of the source embedding space relying on the Kabsch algorithm, and then (2) moving each point along an instance-specific translation vector estimated from the translation vectors of the point's nearest neighbours in the training ***** dictionary *****. | ||
| L12-1662 There are two general ways to construct a phonetization process: rule based systems (with rules based on inference approaches or proposed by expert linguists) and ***** dictionary ***** based solutions which consist in storing a maximum of phonological knowledge in a lexicon | ||
| extracted | 183 | |
| 2021.naacl-main.417 Existing works in multimodal affective computing tasks, such as emotion recognition and personality recognition, generally adopt a two-phase pipeline by first extracting feature representations for each single modality with hand crafted algorithms, and then performing end-to-end learning with ***** extracted ***** features. | ||
| P19-1523 A key step in open IE is confidence modeling, ranking the extractions based on their estimated quality to adjust precision and recall of ***** extracted ***** assertions. | ||
| C16-1201 On the other hand, events ***** extracted ***** from raw texts do not contain background knowledge on the entities and relations that are mentioned in them. | ||
| P19-1611 The re-ranking leverages different features that are directly ***** extracted ***** from the QA pipeline, i.e., a combination of retrieval and comprehension features. | ||
| 2020.ldl-1.11 To finish, we study the results of giving additional information in training time, such as, cohyponym links and instances ***** extracted ***** through patterns | ||
| skip-gram | 183 | |
| W19-4329 We suggest using indefinite inner product in ***** skip-gram ***** negative sampling algorithm. | ||
| P19-1044 We show that, trained on a diachronic corpus, the ***** skip-gram ***** with negative sampling architecture with temporal referencing outperforms alignment models on a synthetic task as well as a manual testset. | ||
| P17-1007 This work provides a theoretical justification for the presence of additive compositionality in word vectors learned using the ***** Skip-Gram ***** model. | ||
| D18-1174 We present disambiguated ***** skip-gram *****: a neural-probabilistic model for learning multi-sense distributed representations of words. | ||
| I17-1024 To make our model more scalable and efficient, we use an online joint learning framework extended from the ***** Skip-gram ***** model. | ||
| nodes | 182 | |
| 2020.coling-main.144 The random restart probabilities are assigned based on the relevance of the graph ***** nodes ***** to the focus of the task. | ||
| D19-1314 The model learns parallel top-down and bottom-up representations of ***** nodes ***** capturing contrasting views of the graph. | ||
| L08-1245 There is also a clear division between word tokens and empty ***** nodes *****, and the token attributes are stored together with the word, instead of being spread out individually in the file. | ||
| 2021.emnlp-main.838 We propose the “softmax tree”, consisting of a binary tree having sparse hyperplanes at the decision ***** nodes ***** (which make hard, not soft, decisions) and small softmax classifiers at the leaves. | ||
| 2020.emnlp-main.310 By introducing the graph structure, the relationships between documents are established through their shared words and thus the topical representation of a document is enriched by aggregating information from its neighboring ***** nodes ***** using graph convolution | ||
| paraphrasing | 182 | |
| W18-6454 We notice that some words are always copied during ***** paraphrasing *****, which we call copy knowledge. | ||
| P18-1113 Our model extends to interpret the identified metaphors, ***** paraphrasing ***** them into their literal counterparts, so that they can be better translated by machines. | ||
| 2021.acl-long.385 Considering that only relying on the same position substitution cannot handle the variable-length correction cases, various operations such as substitution, deletion, insertion, and local ***** paraphrasing ***** are required jointly. | ||
| D19-1588 A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO), but that there is still room for improvement in modeling document-level context, conversations, and mention ***** paraphrasing *****. | ||
| 2020.emnlp-main.602 To address this challenge, we adopt an unsupervised approach using auxiliary supervision with related tasks such as ***** paraphrasing ***** and self-supervision based on a reconstruction loss, building on pretrained language models | ||
| universal dependency | 182 | |
| K17-3020 In this work, we design a system based on UDPipe1 for ***** universal dependency ***** parsing, where multilingual transition-based models are trained for different treebanks. | ||
| 2021.wanlp-1.27 Previous work on CEAE has shown the cross-lingual benefits of ***** universal dependency ***** trees in capturing shared syntactic structures of sentences across languages. | ||
| K18-2011 We present the Uppsala system for the CoNLL 2018 Shared Task on ***** universal dependency ***** parsing. | ||
| K17-3023 In this paper we describe the system by METU team for ***** universal dependency ***** parsing of multilingual text. | ||
| R17-1007 Recently, a new universal scheme was designed as a part of ***** universal dependency ***** project. | ||
| KB | 181 | |
| 2021.naacl-main.65 Experiments on English datasets where models are trained on the CoNLL dataset, and tested on the TAC-KBP 2010 dataset show that our models are 12% (absolute) more accurate than baseline models that simply flatten entities from the target ***** KB *****. | ||
| 2021.acl-long.139 We conduct a large-scale, systematic investigation of aligning ***** KB ***** and text embeddings for joint reasoning. | ||
| N19-1323 For general purpose ***** KB *****s, this is often done through Relation Extraction (RE), the task of predicting ***** KB ***** relations expressed in text mentioning entities known to the ***** KB *****. | ||
| 2020.lrec-1.94 A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract semantic representation for each input sentence and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (***** KB *****). | ||
| N19-1029 Although only dealing with simple questions, i.e., questions that can be answered through a single knowledge base (***** KB *****) fact, this task is neither simple nor close to being solved | ||
| argumentative | 180 | |
| 2021.argmining-1.3 Finally, we make first steps to address the problem of reference-less evaluation of ***** argumentative ***** conclusion generations. | ||
| W18-5215 We describe two systems that use retrieval-based and generative models to make ***** argumentative ***** responses to the users. | ||
| 2020.argmining-1.12 Then, we employed that approach to distinguish various patterns of style in selected sets of ***** argumentative ***** articles and presidential debates. | ||
| 2021.argmining-1.4 Our results indicate that even simple expansions provide a strong baseline, reaching a precision@10 of 0.49 for images being (1) on-topic, (2) ***** argumentative *****, and (3) on-stance. | ||
| C18-1176 Finally, we adapt our system to solve a recent argument mining task of identifying ***** argumentative ***** sentences in Web texts retrieved from heterogeneous sources, and obtain F1 scores comparable to the supervised baseline. | ||
| parallel corpus | 179 | |
| L12-1393 Our experimental evaluation showed that this approach is promising for applying SMT, even when a source-side ***** parallel corpus ***** is lacking. | ||
| Q18-1017 Using this approach, we achieve considerable improvements in terms of BLEU score on relatively large ***** parallel corpus ***** (WMT14 English to German) and a low-resource (WIT German to English) setup. | ||
| N19-1199 We propose a method to learn a correspondence between independently engineered lexicosyntactic features in two languages, using a large ***** parallel corpus ***** of out-of-domain movie dialogue data. | ||
| 2020.lrec-1.415 This article describes the process of gathering and constructing a bilingual ***** parallel corpus ***** of Islamic Hadith, which is the set of narratives reporting different aspects of the prophet Muhammad's life. | ||
| W18-6485 This paper describes the participation of SYSTRAN in the shared task on ***** parallel corpus ***** filtering at the Third Conference on Machine Translation (WMT 2018). | ||
| open information extraction | 179 | |
| 2020.findings-emnlp.99 In this paper, we propose Multi^2OIE, which performs ***** open information extraction ***** (open IE) by combining BERT with multi-head attention. | ||
| L16-1732 Capturing common-sense and domain-specific knowledge can be achieved by taking advantage of recent advances in ***** open information extraction ***** (IE) techniques and, more importantly, of knowledge embeddings, which are multi-dimensional representations of concepts and relations. | ||
| D18-1129 SalIE is unsupervised and knowledge agnostic, based on ***** open information extraction ***** to detect facts in natural language text, PageRank to determine their relevance, and clustering to promote diversity. | ||
| 2020.acl-demos.8 It is supported by novel data-driven methods for distantly supervised named entity recognition and ***** open information extraction *****. | ||
| D19-1067 We propose a novel supervised ***** open information extraction ***** (Open IE) framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance. | ||
| modality | 178 | |
| 2020.emnlp-main.291 In this scenario, we need to consider both the dependence among different labels (label dependence) and the dependence between each predicting label and different modalities (***** modality ***** dependence). | ||
| W16-5005 In this talk, I will go beyond English, and I will discuss how negation and ***** modality ***** are expressed in other languages. | ||
| 2019.iwslt-1.6 Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual ***** modality ***** better than the additive deliberation, as shown by the incongruence analysis. | ||
| P19-1656 At the heart of our model is the directional pairwise crossmodal attention, which attends to interactions between multimodal sequences across distinct time steps and latently adapt streams from one ***** modality ***** to another | ||
| 2021.acl-long.201 Specifically, with a two-stream multi-modal Transformer encoder, LayoutLMv2 uses not only the existing masked visual-language modeling task but also the new text-image alignment and text-image matching tasks, which make it better capture the cross-***** modality ***** interaction in the pre-training stage. | ||
| context | 178 | |
| 2020.emnlp-main.279 By this means, the auxiliary tasks that relate to ***** context ***** understanding can guide the learning of the generation model to achieve a better local optimum. | ||
| 2021.reinact-1.7 We argue that integrating pragmatic reasoning into the inference of ***** context *****-agnostic generation models could reconcile traits of traditional and neural REG, as this offers a separation between ***** context *****-independent, literal information and pragmatic adaptation to ***** context *****. | ||
| 2020.coling-main.128 We find that both ***** context ***** types can improve performance, although the improvements are dependent on ***** context ***** size and position. | ||
| D19-1546 Experiments using current baseline code generation models show that both ***** context ***** and distant supervision aid in generation, and that the dataset is challenging for current systems. | ||
| P17-2050 BONIE's novelty lies in task-specific customizations, such as inferring implicit relations, which are clear due to ***** context ***** such as units (for e.g., `square kilometers' suggests area, even if the word `area' is missing in the sentence) | ||
| WSD | 177 | |
| 2016.gwc-1.2 The third part describes the algorithm used for ***** WSD *****. | ||
| P19-1568 Current supervised ***** WSD ***** methods treat senses as discrete labels and also resort to predicting the Most-Frequent-Sense (MFS) for words unseen during training. | ||
| W18-6304 The experimental results suggest that NMT models learn to encode contextual information necessary for ***** WSD ***** in the encoder hidden states. | ||
| 2020.emnlp-main.332 However, existing ***** WSD ***** systems rarely consider multilingual information, and no effective method has been proposed for improving ***** WSD ***** by generating translations | ||
| W19-3604 In this paper, we presented a ***** WSD ***** system that uses LDA topics for semantic expansion of document words. | ||
| reinforcement | 177 | |
| P19-1194 Moreover, to tackle the problem of lacking parallel data, we propose a cycle ***** reinforcement ***** learning algorithm to guide the model training. | ||
| K17-1039 This suggests that our approach can be regarded as a viable alternative to using ***** reinforcement ***** learning or more computationally expensive imitation learning. | ||
| S17-1008 While most existing research proposes offline supervision or hand-crafted reward functions for online ***** reinforcement *****, we devise a novel interactive learning mechanism based on hamming-diverse beam search for response generation and one-character user-feedback at each step. | ||
| C18-1107 The proposed structured deep ***** reinforcement ***** learning is based on graph neural networks (GNN), which consists of some sub-networks, each one for a node on a directed graph. | ||
| C18-1183 As for noisy annotation, we design an instance selector based on ***** reinforcement ***** learning to distinguish positive sentences from auto-generated annotations | ||
| data | 177 | |
| 2021.eacl-main.98 Latent variable models for text, when trained successfully, accurately model the ***** data ***** distribution and capture global semantic and syntactic features of sentences. | ||
| W19-3506 This research discusses multi-label text classification for abusive language and hate speech detection including detecting the target, category, and level of hate speech in Indonesian Twitter using machine learning approach with Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifier and Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the ***** data ***** transformation method. | ||
| 2021.acl-long.548 When collecting annotations and labeled data from humans, a standard practice is to use inter-rater reliability (IRR) as a measure of ***** data ***** goodness (Hallgren, 2012). | ||
| 2020.iwslt-1.3 For the offline task, we create both cascaded and end-to-end speech translation systems, paying attention to careful ***** data ***** selection and weighting. | ||
| P19-1123 Noise and domain are important aspects of ***** data ***** quality for neural machine translation. | ||
| transcription | 176 | |
| 2021.eacl-main.96 We then propose a novice ***** transcription ***** correction task and demonstrate how ASR systems and novice transcribers can work together to improve EL documentation. | ||
| 2020.iwslt-1.30 Automatic speech recognition (ASR) systems are primarily evaluated on ***** transcription ***** accuracy. | ||
| 2021.wassa-1.26 We further find that the inclusion of audio features partially mitigates ***** transcription ***** errors, but that a naive usage of a multi-task setup does not | ||
| 2020.lrec-1.655 The analysis of the structure of speech nearly always rests on the alignment of the speech recording with a phonetic ***** transcription *****. | ||
| 2020.acl-main.215 Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy ***** transcription ***** into text via automatic speech recognition. | ||
| CoNLL | 175 | |
| K17-3007 We describe our submission to the ***** CoNLL ***** 2017 shared task, which exploits the shared common knowledge of a language across different domains via a domain adaptation technique. | ||
| K18-2014 We describe the SEx BiST parser (Semantically EXtended Bi-LSTM parser) developed at Lattice for the ***** CoNLL ***** 2018 Shared Task (Multilingual Parsing from Raw Text to Universal Dependencies). | ||
| 2021.crac-1.4 On the best-performing setup, it achieves a ***** CoNLL ***** score of 32% when using automatically detected mentions and 55% when using gold mentions. | ||
| L14-1657 The Stanford Coreference Resolution System (StCR) is a multi-pass, rule-based system that scored best in the ***** CoNLL ***** 2011 shared task on general discourse coreference resolution | ||
| 2020.codi-1.16 A substantial overlap of coreferent mentions in the ***** CoNLL ***** dataset magnifies the recent progress on coreference resolution. | ||
| supervised | 175 | |
| 2020.coling-main.45 In this work, we propose a probabilistic autoencoding framework to deal with this ***** supervised ***** classification task. | ||
| 2020.acl-main.607 Experiments show our model achieves significant and consistent improvement over the ***** supervised ***** baseline. | ||
| Q13-1008 In this paper, we propose a novel ***** supervised ***** approach that can incorporate rich sentence features into Bayesian topic models in a principled way, thus taking advantages of both topic model and feature based ***** supervised ***** learning methods. | ||
| 2021.semeval-1.6 The task could be addressed as ***** supervised ***** sequence labeling, using training data with gold toxic spans provided by the organisers. | ||
| W19-4222 Our AG-based approaches outperform other un***** supervised ***** approaches and show promise when compared to ***** supervised ***** methods, outperforming them on two of the four languages | ||
| tasks | 175 | |
| 2020.winlp-1.22 This study also indicates that ***** tasks ***** that require descriptions of images draw more neologism, loanword and error production. | ||
| 2021.acl-long.378 Since it does not require access to all ***** tasks ***** during training, it is attractive in on-device deployment settings where ***** tasks ***** arrive in stream or even from different providers. | ||
| 2020.findings-emnlp.356 In this paper, we conduct extensive experiments on 3 ***** tasks ***** over 18 datasets and 8 languages to study the accuracy of sequence labeling with various embedding concatenations and make three observations: (1) concatenating more embedding variants leads to better accuracy in rich-resource and cross-domain settings and some conditions of low-resource settings; (2) concatenating contextual sub-word embeddings with contextual character embeddings hurts the accuracy in extremely low-resource settings; (3) based on the conclusion of (1), concatenating additional similar contextual embeddings cannot lead to further improvements. | ||
| 1999.mtsummit-1.85 Most importantly, the objectives the technology's use is expected to accomplish must be known, the objectives must be expressed as ***** tasks ***** that accomplish the objectives, and then successful outcomes defined for the ***** tasks *****. | ||
| D19-1050 Through further task ablations and representational analyses, we find that ***** tasks ***** which produce syntax-light representations yield significant improvements in brain decoding performance | ||
| 2017 | 174 | |
| D18-1477 We also obtain an average of 0.7 BLEU improvement over the Transformer model (Vaswani et al., ***** 2017 *****) on translation by incorporating SRU into the architecture. | ||
| N18-2013 In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, ***** 2017 *****) for sentence simplification. | ||
| W17-5046 In this paper, we discuss the results of the IUCL system in the NLI Shared Task ***** 2017 *****. | ||
| N19-1421 To capture common sense beyond associations, we extract from ConceptNet (Speer et al., ***** 2017 *****) multiple target concepts that have the same semantic relation to a single source concept. | ||
| 2020.coling-main.524 Our model outperforms the best performing systems by 1 BLEU point on the WMT 2016, ***** 2017 *****, and 2018 English–German APE shared tasks (PBSMT and NMT) | ||
| word segmentation | 174 | |
| L10-1262 The first approach uses complex tags that describe full words and does not require any ***** word segmentation *****. | ||
| 2020.findings-emnlp.364 We compare the two baselines with key configurations and find that: automatic Vietnamese ***** word segmentation ***** improves the parsing results of both baselines; the normalized pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema linking; latent syntactic features extracted from a neural dependency parser for Vietnamese also improve the results; and the monolingual language model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) helps produce higher performances than the recent best multilingual language model XLM-R (Conneau et al., 2020). | ||
| 2020.coling-main.187 Chinese ***** word segmentation ***** (CWS) and part-of-speech (POS) tagging are two fundamental tasks for Chinese language processing. | ||
| Q18-1030 In this paper, we present a sequence tagging framework and apply it to ***** word segmentation ***** for a wide range of languages with different writing systems and typological characteristics. | ||
| 2019.iwslt-1.10 However, selecting the optimal sub***** word segmentation ***** involves a trade-off between expressiveness and flexibility, and is language and dataset-dependent. | ||
| word sense | 174 | |
| W17-1915 This paper compares two approaches to ***** word sense ***** disambiguation using word embeddings trained on unambiguous synonyms. | ||
| 2020.coling-main.254 Existing work finds that syntactic, semantic and ***** word sense ***** knowledge are encoded in BERT. | ||
| L10-1448 Two types of semantic change are amelioration and pejoration; in these processes a ***** word sense ***** changes to become more positive or negative, respectively. | ||
| 2019.gwc-1.20 We show that the t-SNE co-ordinates can be used to reveal interesting semantic relations between ***** word sense *****s, and propose a new method that uses the simple x,y coordinates to compute semantic similarity. | ||
| 2012.amta-papers.29 Most attempts at integrating ***** word sense ***** disambiguation with statistical machine translation have focused on supervised disambiguation approaches. | ||
| encode | 173 | |
| N19-1240 Mentions of entities are nodes of this graph while edges ***** encode ***** relations between different mentions (e.g., within- and cross-document co-reference). | ||
| K18-1038 We present results that show, under exhaustive and precise conditions, that one kind of word embeddings and the similarity spaces they define do not ***** encode ***** the properties of intervention similarity in long-distance dependencies, and that therefore they fail to represent this core linguistic notion. | ||
| W19-4213 The resulting operations not only ***** encode ***** the actions to be performed but the relative position in the word token and how characters need to be transformed. | ||
| 2021.naacl-main.362 Secondly, since the transformer-based language models cannot ***** encode ***** the flow of events by themselves, we propose a Time-Stamped Language Model (TSLM) to ***** encode ***** event information in LMs architecture by introducing the timestamp encoding. | ||
| N19-1064 First, we conduct several intrinsic analyses and find that (1) training data for ELMo contains significantly more male than female entities, (2) the trained ELMo embeddings systematically ***** encode ***** gender information and (3) ELMo unequally ***** encode *****s gender information about male and female entities | ||
| regularization | 173 | |
| 2021.naacl-main.3 A new ***** regularization ***** mechanism is introduced to enforce the consistency between the golden and predicted type dependency graphs to improve representation learning. | ||
| N19-1386 Our method includes ***** regularization ***** terms to enforce cycle consistency and input reconstruction, and puts the target encoders as an adversary against the corresponding discriminator. | ||
| D19-1295 Experiments show that our knowledge ***** regularization ***** approach outperforms all previous systems on the benchmark dataset PDTB for discourse parsing. | ||
| S19-1020 Noise is inherent in real world datasets and modeling noise is critical during training as it is effective in ***** regularization *****. | ||
| N19-3002 We find this ***** regularization ***** method to be effective in reducing gender bias up to an optimal weight assigned to the loss term, beyond which the model becomes unstable as the perplexity increases | ||
| Entity | 173 | |
| W19-1912 ***** Entity ***** linking (or Normalization) is an essential task in text mining that maps the entity mentions in the medical text to standard entities in a given Knowledge Base (KB). | ||
| 2020.emnlp-main.523 ***** Entity ***** representations are useful in natural language tasks involving entities. | ||
| D19-6219 ***** Entity ***** recognition is a critical first step to a number of clinical NLP applications, such as entity linking and relation extraction. | ||
| Q15-1011 ***** Entity ***** disambiguation with Wikipedia relies on structured information from redirect pages, article text, inter-article links, and categories. | ||
| 2021.law-1.18 Previous work on ***** Entity ***** Linking has focused on resources targeting non-nested proper named entity mentions, often in data from Wikipedia, i.e. | ||
| heterogeneous | 171 | |
| 2021.adaptnlp-1.19 Recent complementary strands of research have shown that leveraging information on the data source through encoding their properties into embeddings can lead to performance increase when training a single model on ***** heterogeneous ***** data sources. | ||
| W19-3808 Standard debiasing methods require ***** heterogeneous ***** lists of target words to identify the “bias subspace”. | ||
| 2020.ccl-1.87 The data is multi-source and ***** heterogeneous *****, which raises a great challenge for processing it. | ||
| 2021.naacl-main.387 Our results demonstrate that positive knowledge transfer via context-specific shared representations of a flexible cross-stitched parameter sharing model helps establish the inherent benefit of jointly modeling tasks related to sexual abuse disclosures with emotion classification from the text in homogeneous and ***** heterogeneous ***** settings. | ||
| 2021.emnlp-main.300 In contrast to existing tasks on general domain, the finance domain includes complex numerical reasoning and understanding of ***** heterogeneous ***** representations | ||
| annotated corpora | 171 | |
| L16-1283 Furthermore, we present some statistics on the ***** annotated corpora *****, from which we can conclude that the detection of contrasting evaluations might be a good indicator for recognizing irony. | ||
| L12-1551 The usefulness of ***** annotated corpora ***** is greatly increased if there is an associated tool that can allow various kinds of operations to be performed in a simple way. | ||
| L16-1676 In this paper we present newly developed inflectional lexicons and manually ***** annotated corpora ***** of Croatian and Serbian. | ||
| 2021.naacl-demos.1 Although we specify PhoNLP for Vietnamese, our PhoNLP training and evaluation command scripts in fact can directly work for other languages that have a pre-trained BERT-based language model and gold ***** annotated corpora ***** available for the three tasks of POS tagging, NER and dependency parsing. | ||
| 2021.emnlp-main.520 With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than currently existing ***** annotated corpora ***** | ||
| synsets | 170 | |
| 2021.gwc-1.25 In the evaluation, we first converted the ***** synsets ***** in the Arabic WordNet into translation pairs (i.e., losing word-sense memberships). | ||
| L10-1090 We describe a new method for sentiment load annotation of the ***** synsets ***** of a wordnet, along the principles of Osgood's Semantic Differential theory and extending the Kamp and Marx calculus, by taking into account not only the WordNet structure but also the SUMO/MILO (Niles & Pease, 2001) and DOMAINS (Bentivogli et al., 2004) knowledge sources. | ||
| L08-1269 Of course, we must also add some ***** synsets ***** which do not exist in the Princeton WordNet, and must modify ***** synsets ***** in the Princeton WordNet, in order to make the hierarchical structure of Princeton ***** synsets ***** represent thesaurus-like information found in the Japanese language, however, we will address these tasks in a future study. | ||
| L16-1455 The totality of PolNet 2.0 ***** synsets ***** is being revised in order to split the PolNet 2.0 ***** synsets ***** that contain different register words into register-uniform sub-***** synsets *****. | ||
| L10-1639 WQuery is a query language that makes use of data types based on ***** synsets *****, word senses and various semantic relations which occur in wordnet-like lexical databases | ||
| Emotion | 170 | |
| 2021.wassa-1.22 ***** Emotion ***** detection is an important task that can be applied to social media data to discover new knowledge. | ||
| W19-2501 ***** Emotion ***** is represented following the popular Valence-Arousal-Dominance (VAD) annotation scheme. | ||
| 2020.coling-main.392 *****Emotion***** recognition in textual conversations (ERTC) plays an important role in a wide range of applications, such as opinion mining, recommender systems, and so on. | ||
| W18-3304 *****Emotion***** recognition has become a popular topic of interest, especially in the field of human computer interaction. | ||
| L10-1221 *****Emotion***** processing has always been a great challenge. | ||
| domain adaptation | 170 | |
| 2019.iwslt-1.26 We study here a related setting, multi-***** domain adaptation *****, where the number of domains is potentially large and adapting separately to each domain would waste training resources. | ||
| 2021.adaptnlp-1.5 More broadly, our method can be used for textual ***** domain adaptation ***** where the latent classes are unknown but overlap with known classes from other domains. | ||
| D17-1156 We also investigate the amounts of in-domain training data needed for ***** domain adaptation ***** in NMT, and find a logarithmic relationship between the amount of training data and gain in BLEU score. | ||
| 2020.coling-main.603 Motivated by the latest advances, in this survey we review neural unsupervised ***** domain adaptation ***** techniques which do not require labeled target domain data. | ||
| E17-5002 The technical differences between NMT and the previously dominant phrase-based statistical approach require that practitioners learn new best practices for building MT systems, ranging from different hardware requirements, new techniques for handling rare words and monolingual data, to new opportunities in continued learning and ***** domain adaptation *****. This tutorial is aimed at researchers and users of machine translation interested in working with NMT. | ||
| computational linguistics | 170 | |
| C18-1272 Such models can provide fertile ground for (cognitive) ***** computational linguistics ***** studies. | ||
| W16-4812 This is the first preliminary study for a dialect that has not been widely studied in ***** computational linguistics *****, evidencing the possible existence of distinct subdialects. | ||
| N18-5004 We present CL Scholar, the ACL Anthology knowledge graph miner to facilitate high-quality search and exploration of current research progress in the ***** computational linguistics ***** community. | ||
| 2021.trustnlp-1.6 We discuss future work that would benefit immensely from a ***** computational linguistics ***** perspective. | ||
| W19-0604 We argue that using fuzzy sets for modeling meaning of words and other natural language constructs, along with situations described with natural language is interesting both from purely linguistic perspective, and also as a knowledge representation for problems of ***** computational linguistics ***** and natural language processing. | ||
| prediction | 169 | |
| 2021.semeval-1.82 RS_GV predicts the complexity well of biomedical terms but it has problems with the complexity ***** prediction ***** of very complex and very simple target words. | ||
| 2020.nlpcss-1.9 While this task has been closely associated with emotion ***** prediction *****, we argue and show that identifying worry needs to be addressed as a separate task given the unique challenges associated with it. | ||
| 2021.repl4nlp-1.30 First, we propose to guide the pretrained LM's attention mechanism to focus on relevant context by using attention probabilities as additional features for evidence ***** prediction *****. | ||
| N19-1321 We develop efficient procedures to tackle the computation difficulties involved in training and ***** prediction ***** | ||
| 2020.wanlp-1.16 Our system is developed for the Fairseq framework, which allows for a fast and easy use for any other sequence ***** prediction ***** problem. | ||
| arabic dialect identification | 168 | |
| W16-4819 We primarily focused on the *****Arabic dialect identification***** task and obtained an F1 score of 0.4834, ranking 6th out of 18 participants. | ||
| 2020.wanlp-1.30 In this paper, we present our work for the NADI Shared Task (Abdul-Mageed and Habash, 2020): Nuanced *****Arabic Dialect Identification***** for Subtask-1: country-level dialect identification. | ||
| W16-4801 The challenge offered two subtasks: subtask 1 focused on the identification of very similar languages and language varieties in newswire texts, whereas subtask 2 dealt with *****Arabic dialect identification***** in speech transcripts. | ||
| 2021.acl-long.177 QASR is suitable for training and evaluating speech recognition systems, acoustics- and/or linguistics- based *****Arabic dialect identification*****, punctuation restoration, speaker identification, speaker linking, and potentially other NLP modules for spoken data. | ||
| 2020.wanlp-1.27 In this paper, we investigate the *****Arabic dialect identification***** task, from two perspectives: country-level dialect identification from 21 Arab countries, and province-level dialect identification from 100 provinces. | ||
| quality estimation | 167 | |
| P18-1020 We describe a novel method for efficiently eliciting scalar annotations for dataset construction and system ***** quality estimation ***** by human judgments. | ||
| 2020.acl-main.558 Recent advances in pre-trained multilingual language models lead to state-of-the-art results on the task of ***** quality estimation ***** (QE) for machine translation. | ||
| 2021.emnlp-main.267 To reduce the negative impact of noises, we propose a self-supervised method for both sentence- and word-level QE, which performs ***** quality estimation ***** by recovering the masked target words. | ||
| 2021.repl4nlp-1.23 To enhance performance, we further propose an extension to a state-of-the-art Bayesian meta-learning approach which utilizes a matrix-valued kernel for Bayesian meta-learning of ***** quality estimation *****. | ||
| 2020.wmt-1.113 This paper describes the system submitted by Papago team for the *****quality estimation***** task at WMT 2020. | ||
| GEC | 166 | |
| P19-2020 In this paper, we propose a method for neural grammar error correction (***** GEC *****) that can control the degree of correction. | ||
| N18-2046 Our analysis shows that the created systems are closer to reaching human-level performance than any other ***** GEC ***** system reported so far. | ||
| 2021.acl-short.89 cLang-8 greatly simplifies typical ***** GEC ***** training pipelines composed of multiple fine-tuning stages – we demonstrate that performing a single fine-tuning step on cLang-8 with the off-the-shelf language models yields further accuracy improvements over an already top-performing gT5 model for English. | ||
| 2021.acl-long.462 Not only does our approach allow a single model to achieve the state-of-the-art results in English ***** GEC ***** benchmarks: 66.4 F0.5 in the CoNLL-14 and 72.9 F0.5 in the BEA-19 test set with an almost 10x online inference speedup over the Transformer-big model, but also it is easily adapted to other languages. | ||
| 2021.naacl-main.429 However, existing models neglect the possible ***** GEC ***** evidence from different hypotheses | ||
| iii | 166 | |
| D17-1118 Our analysis shows that the effects reported in recent literature must be substantially revised: (i) the proposed negative correlation between meaning change and word frequency is shown to be largely an artefact of the models of word representation used; (ii) the proposed negative correlation between meaning change and prototypicality is shown to be much weaker than what has been claimed in prior art; and (***** iii *****) the proposed positive correlation between meaning change and polysemy is largely an artefact of word frequency. | ||
| 2020.lrec-1.389 The links can be used to (i) evaluate existing wordnets, (ii) add data to these wordnets and (***** iii *****) create new open wordnets for Khmer, Korean, Lao, Mongolian, Russian, Tagalog, Urdua nd Vietnamese | ||
| W18-0540 We developed solutions following three approaches: (i) a feature engineering method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (***** iii *****) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large text corpus to produce a contextualized word vector. | ||
| Q18-1037 In such games, a successful player must (i) infer the partner's private information from the partner's messages, (ii) generate messages that are most likely to help with the goal, and (***** iii *****) reason pragmatically about the partner's strategy. | ||
| L12-1476 In this paper, we describe the online repository that we have created as a one-stop resource for obtaining NLG task materials, both from Generation Challenges tasks and from other sources, where the set of materials provided for each task consists of (i) task definition, (ii) input and output data, (***** iii *****) evaluation software, (iv) documentation, and (v) publications reporting previous results | ||
| biomedical | 166 | |
| 2020.acl-main.520 Through extensive experiments on three ***** biomedical ***** data sets, we show that our model can effectively recognize discontinuous mentions without sacrificing the accuracy on continuous mentions. | ||
| 2020.louhi-1.10 This model outperforms two strong baselines in two ***** biomedical ***** event extraction corpora in a Knowledge Base Population setting, and also achieves competitive performance in BioNLP challenge evaluation settings. | ||
| W19-5049 We show that leveraging information from parallel tasks across domains along with medical knowledge integration allows our model to learn better ***** biomedical ***** feature representations. | ||
| 2020.bionlp-1.4 Inferring the nature of the relationships between ***** biomedical ***** entities from text is an important problem due to the difficulty of maintaining human-curated knowledge bases in rapidly evolving fields. | ||
| 2018.gwc-1.50 The NCIt Derived WordNet (ncitWN) is based on the National Cancer Institute Thesaurus (NCIt), a controlled ***** biomedical ***** terminology that includes formal class restrictions and English definitions developed by groups of clinicians and terminologists | ||
| bidirectional | 164 | |
| 2021.iwslt-1.22 In addition, we proposed two novel pre-train approaches, i.e. de-noising training and ***** bidirectional ***** training to fully exploit the data. | ||
| 2021.acl-long.379 In this paper, we propose a recursive Transformer model based on differentiable CKY style binary trees to emulate this composition process, and we extend the ***** bidirectional ***** language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. | ||
| W16-5104 We propose an approach for named entity recognition in medical data, using a character-based deep ***** bidirectional ***** recurrent neural network. | ||
| W18-4927 Our system is language-independent and uses the ***** bidirectional ***** Long Short-Term Memory model with a Conditional Random Field layer on top (***** bidirectional ***** LSTM-CRF). | ||
| S19-2116 In the second method, we develop a deep neural network consisting of ***** bidirectional ***** recurrent layers with Gated Recurrent Unit (GRU) cells and fully connected layers | ||
| text style transfer | 164 | |
| 2021.emnlp-main.730 In this paper, we explore Non-AutoRegressive (NAR) decoding for unsupervised ***** text style transfer *****. | ||
| 2021.ranlp-1.64 Through extensive experiments on two popular ***** text style transfer ***** tasks, we show that our proposed method significantly outperforms twelve state-of-the-art methods. | ||
| D19-1499 In this paper, we first propose a semi-supervised ***** text style transfer ***** model that combines the small-scale parallel data with the large-scale nonparallel data. | ||
| 2021.emnlp-main.729 In this paper, we propose a collaborative learning framework for unsupervised ***** text style transfer ***** using a pair of bidirectional decoders, one decoding from left to right while the other decoding from right to left. | ||
| 2021.emnlp-main.684 We compare our models with five representative *****text style transfer***** models on three datasets across different domains. | ||
| propagation | 163 | |
| 2021.internlp-1.7 However, this has also enabled the widespread ***** propagation ***** of fake news, text that is published with an intent to spread misinformation and sway beliefs. | ||
| 2020.coling-main.24 For EASA, compared to pipeline and multi-task approaches, joint aspect extraction and sentiment analysis provides a one-step solution to predict both aspect terms and their sentiment polarities through a single decoding process, which avoids mismatches between the results of aspect terms and sentiment polarities, as well as error ***** propagation *****. | ||
| 2021.wanlp-1.8 We experimented with SOTA models of versatile approaches that either exploit content, user profiles features, temporal features and ***** propagation ***** structure of the conversational threads for tweet verification. | ||
| 2020.emnlp-main.32 The hypothesis is validated by three theoretical perspectives: semantic scaling, ***** propagation ***** dynamics and matrix perturbation. | ||
| L14-1653 Although many variants of the ***** propagation ***** method are developed for English, little is known about how they perform with WordNets of other languages | ||
| heuristics | 162 | |
| 2000.iwpt-1.26 In this paper, we report on significant progress, i.e., (1) developing guidelines for the grammar partition through a set of ***** heuristics *****, (2) devising a new mix-strategy composition algorithms for any rule-based grammar partition in a lattice framework, and (3) initial but encouraging parsing results for Chinese and English queries from an Air Travel Information System (ATIS) corpus. | ||
| P19-1387 In contrast to existing propositions which primarily employ features like page reputation, editor activity or rule based ***** heuristics *****, we utilize the textual content of the edits which, we believe contains superior signatures of their quality. | ||
| L12-1434 Certain Swedish noun-phrase types are paraphrased using basic ***** heuristics *****. | ||
| 2020.emnlp-main.649 Further, we discover that our metrics can serve the additional purpose of being inexpensive ***** heuristics ***** for detecting generically low quality examples. | ||
| W19-2804 This task has been largely neglected by the EDL community because it is challenging to outperform simple edit distance or other ***** heuristics ***** based baselines | ||
| analysis | 162 | |
| D19-1566 Experimental results suggest the efficacy of the proposed model for both sentiment and emotion ***** analysis ***** over various existing state-of-the-art systems. | ||
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error ***** analysis ***** by revealing behaviors easily missed by human experts. | ||
| Q19-1026 We also describe ***** analysis ***** of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment ***** analysis *****, text classification, and Word Sense Disambiguation. | ||
| L16-1360 Previous studies, based on the ***** analysis ***** of poetic texts, have shown that synaesthetic transfers tend to go from the lower toward the higher senses (e.g., sweet music vs. musical sweetness). | ||
| aspect term | 162 | |
| 2020.findings-emnlp.72 To address the issue, we present a novel view of ABSA as an opinion triplet extraction task, and propose a multi-task learning framework to jointly extract ***** aspect term *****s and opinion terms, and simultaneously parse sentiment dependencies between them with a biaffine scorer. | ||
| 2020.findings-emnlp.6 The polarities sequence is designed to depend on the generated ***** aspect term *****s labels. | ||
| C18-1066 Both industry and academia have realized the importance of the relationship between ***** aspect term ***** and sentence, and made attempts to model the relationship by designing a series of attention models. | ||
| 2021.emnlp-main.321 Aspect-based sentiment analysis (ABSA) predicts the sentiment polarity towards a particular ***** aspect term ***** in a sentence, which is an important task in real-world applications. | ||
| P19-1048 Aspect-based sentiment analysis produces a list of ***** aspect term *****s and their corresponding sentiments for a natural language sentence. | ||
| metaphor detection | 161 | |
| W17-1903 Our contributions to this topic are as follows: i) we compare supervised techniques to learn and extend abstractness ratings for huge vocabularies ii) we learn and investigate norms for larger units by propagating abstractness to verb-noun pairs which lead to better ***** metaphor detection ***** iii) we overcome the limitation of learning a single rating per word and show that multi-sense abstractness ratings are potentially useful for ***** metaphor detection *****. | ||
| D18-1060 These models establish a new state-of-the-art on existing verb ***** metaphor detection ***** benchmarks, and show strong performance on jointly predicting the metaphoricity of all words in a running text. | ||
| 2020.figlang-1.27 In this paper we present a novel resource-inexpensive architecture for ***** metaphor detection ***** based on a residual bidirectional long short-term memory and conditional random fields. | ||
| 2020.figlang-1.4 We focus a novel reading comprehension paradigm for solving the token-level ***** metaphor detection ***** task which provides an innovative type of solution for this task. | ||
| W17-2201 Our method focuses on ***** metaphor detection ***** in a poetry corpus. | ||
| multi-hop reasoning | 161 | |
| 2020.coling-main.143 Document-level relation extraction (RE) poses new challenges over its sentence-level counterpart since it requires an adequate comprehension of the whole document and the *****multi-hop reasoning***** ability across multiple sentences to reach the final result. | ||
| D19-1194 Also, we propose a preliminary model that selects an output from two networks at each time step: a sequence-to-sequence model (Seq2Seq) and a *****multi-hop reasoning***** model, in order to support dynamic knowledge graphs. | ||
| N19-1032 *****Multi-hop reasoning***** question answering requires deep comprehension of relationships between various documents and queries. | ||
| N19-1405 Learning *****multi-hop reasoning***** has been a key challenge for reading comprehension models, leading to the design of datasets that explicitly focus on it. | ||
| D18-1454 We also show that our background knowledge enhancements are generalizable and improve performance on QAngaroo-WikiHop, another *****multi-hop reasoning***** dataset. | ||
| open-domain question | 161 | |
| P19-1436 Existing *****open-domain question***** answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query, which is computationally prohibitive. | ||
| N19-1030 *****Open-domain question***** answering remains a challenging task as it requires models that are capable of understanding questions and answers, collecting useful information, and reasoning over evidence. | ||
| L08-1533 We use a web search engine as a back end to allow for *****open-domain Question***** Answering. | ||
| 2021.eacl-main.261 Answer Sentence Selection (AS2) is an efficient approach for the design of *****open-domain Question***** Answering (QA) systems. | ||
| 2021.nllp-1.20 We use a similar approach to the two-step *****open-domain question***** answering approach by using a Reducer to extract relevant text segments and a Producer to generate both extractive answers and non-extractive classifications. | ||
| granularity | 160 | |
| 2020.findings-emnlp.425 It omits information carried by larger text ***** granularity *****, and thus the encoders cannot easily adapt to certain combinations of characters. | ||
| D19-1184 This paper introduces a novel training procedure which explicitly learns multiple representations of language at several levels of ***** granularity *****. | ||
| 2014.lilt-9.4 We propose a decomposition approach over TE pairs, where single linguistic phenomena are isolated in what we have called atomic inference pairs, and we show that at this ***** granularity ***** level the actual correlation between the linguistic and the logical dimensions of semantic inferences emerges and can be empirically observed. | ||
| K19-1024 We develop annotation guidelines for the task of applying these codes to debate motions at two levels of ***** granularity ***** and produce a dataset of manually labelled examples. | ||
| W19-2510 We also find that including information about the ***** granularity ***** of text spans is a crucial ingredient when employing hidden layers, in contrast to simple logistic regression | ||
| nouns | 160 | |
| 2020.coling-main.451 Recently, domain-general recurrent neural networks, without explicit linguistic inductive biases, have been shown to successfully reproduce a range of human language behaviours, such as accurately predicting number agreement between ***** nouns ***** and verbs. | ||
| L10-1474 The experiment is about 20 French polysemous words (16 ***** nouns ***** and 4 verbs) and we make use of the French-English part of the sentence-aligned EuroParl Corpus for training and testing. | ||
| 2020.cogalex-1.13 We present a corpus of about 3000 ***** nouns ***** ending in schwa, annotated for various phonological and morpho-syntactic features, and critically, the dominant linking strategy. | ||
| L16-1722 We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e. adjectives, ***** nouns ***** and verbs). | ||
| 2016.gwc-1.36 In fact, since ***** nouns ***** are open class words, producing an exhaustive definite list of noun-CL associations is not possible, since it would quickly get out of date | ||
| implicit discourse relation | 159 | |
| P19-1065 We firstly propose a method to automatically extract the ***** implicit discourse relation ***** argument pairs and labels from a dataset of dialogic turns, resulting in a novel corpus of discourse relation pairs; the first of its kind to attempt to identify the discourse relations connecting the dialogic turns in open-domain discourse. | ||
| E17-1027 Inferring ***** implicit discourse relation *****s in natural language text is the most difficult subtask in discourse parsing. | ||
| 2021.unimplicit-1.1 In the current study, we perform ***** implicit discourse relation ***** classification without relying on any labeled implicit relation. | ||
| P19-1058 In the literature, most of the previous studies on English ***** implicit discourse relation ***** recognition only use sentence-level representations, which cannot provide enough semantic information in Chinese due to its unique paratactic characteristics. | ||
| 2020.lrec-1.145 (2) In expectation of knowledge transfer from explicit discourse relations to ***** implicit discourse relation *****s, we add a task named explicit connective prediction at the additional pre-training step. | ||
| annotator | 158 | |
| 2021.humeval-1.11 Recent studies emphasize the need of document context in human evaluation of machine translations, but little research has been done on the impact of user interfaces on ***** annotator ***** productivity and the reliability of assessments. | ||
| 2021.ranlp-1.177 This paper investigates the effectiveness of automatic ***** annotator ***** assignment for text annotation in expert domains. | ||
| L10-1512 Furthermore, we show that our setup can also prove useful for evaluating when an inexperienced ***** annotator ***** is ready to start participating in the production of the treebank. | ||
| 2020.lrec-1.834 The experimental results show that NER performance over the corpus is around 77% in terms of micro-F1, which is comparable to human ***** annotator ***** agreement rates. | ||
| 2020.lrec-1.168 To validate the manual labels, we trained SVM (Support Vector Machine) and BERT (Bidirectional Encoder Representations from Transformers) with half of the corpus (labeled by one ***** annotator *****) to predict the skill and intent labels of the other half (labeled by the other ***** annotator *****) | ||
| morphosyntactic | 158 | |
| W19-4226 The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and ***** morphosyntactic ***** description in 66 languages. | ||
| L16-1403 It provides access to the basic as well as advanced natural language processing tools and resources, including tools for corpus creation and management, text preprocessing and annotation, ontology building, named entity recognition, ***** morphosyntactic ***** and semantic analysis, sentiment analysis, etc. | ||
| L16-1647 Algorithms were trained to classify texts with respect to their publication date taking into account lexical variation represented as word n-grams, and ***** morphosyntactic ***** variation represented by part-of-speech (POS) distribution. | ||
| L10-1564 We study how ***** morphosyntactic ***** as well as function tag information percolation in the form of grammar transforms (Johnson, 1998, Kulick et al., 2006) affects the performance of a parser and helps dependency assignment. | ||
| L04-1154 In this paper, we present the methodology developed by the SLI (Computational Linguistics Group of the University of Vigo) for the building and processing of the CLUVI Corpus, showing the TMX-based XML specification designed to encode both *****morphosyntactic***** features and translation alignments in parallel corpora, and the solutions adopted for making the CLUVI parallel corpora freely available over the WWW (http://sli.uvigo.es/CLUVI/). | ||
| graph embedding | 158 | |
| 2020.semeval-1.76 We also explore the utility of external resources that aim to supplement the world knowledge inherent in such language models, including commonsense knowledge ***** graph embedding ***** models, word concreteness ratings, and text-to-image generation models. | ||
| D17-1060 Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge ***** graph embedding ***** methods on Freebase and Never-Ending Language Learning datasets. | ||
| 2020.knlp-1.3 First, we use retrofitted target concept vectors instead of ***** graph embedding ***** based vectors. | ||
| W19-4313 We argue that the entity ranking protocol, which is currently used to evaluate knowledge ***** graph embedding ***** models, is not suitable to answer this question since only a subset of the model predictions are evaluated. | ||
| D18-1358 Most existing researches are focusing on knowledge ***** graph embedding ***** (KGE) models. | ||
| transfer learning | 158 | |
| 2020.calcs-1.6 In a second step, two systems using cross-lingual embeddings were researched, being (1) a supervised classifier and (2) a ***** transfer learning ***** approach trained on English sentiment data and evaluated on code-mixed data. | ||
| K17-3010 To allow ***** transfer learning ***** for low-resource treebanks and surprise languages, we train several multilingual models for related languages, grouped by their genus and language families. | ||
| W19-3812 In this work, contribution of ***** transfer learning ***** technique to pronoun resolution systems is investigated and the gender bias contained in classification models is evaluated. | ||
| 2019.iwslt-1.14 We first trained an end-to-end ASR system and used the weights of its encoder to initialize the decoder of our ST model (***** transfer learning *****). | ||
| 2020.lrec-1.94 A stream of this network also utilizes ***** transfer learning ***** by pre-training a bidirectional transformer to extract semantic representation for each input sentence and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (KB). | ||
| native language identification | 158 | |
| 2020.acl-main.206 Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and ***** native language identification ***** (L1). | ||
| C18-1293 In this paper, we describe experiments designed to explore and evaluate the impact of punctuation marks on the task of ***** native language identification *****. | ||
| W18-1605 In this paper, we approach the task of ***** native language identification ***** in a realistic cross-corpus scenario where a model is trained with available data and has to predict the native language from data of a different corpus. | ||
| W17-5025 We present the RUG-SU team's submission at the *****Native Language Identification***** Shared Task 2017. | ||
| W17-5046 For our system, we explore a variety of phonetic algorithms to generate features for *****Native Language Identification*****. | ||
| news | 158 | |
| 2020.wmt-1.15 This paper describes Tilde's submission to the WMT2020 shared task on *****news***** translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. | ||
| W19-8907 The Social Web Observatory is an entity-driven, sentiment-aware, event summarization web platform, combining various methods and tools to overview trends across social media and *****news***** sources in Greek. | ||
| 2020.winlp-1.32 In this study, we apply NLP methods to learn about the framing of the 2020 Democratic Presidential candidates in *****news***** media. | ||
| W19-4712 Contemporary debates on filter bubbles and polarization in public and social media raise the question to what extent *****news***** media of the past exhibited biases. | ||
| 2021.naacl-main.344 We propose MultiOpEd, an open-domain news editorial corpus that supports various tasks pertaining to the argumentation structure in *****news***** editorials, focusing on automatic perspective discovery. | ||
| Hence | 157 | |
| 2021.nlpmc-1.6 ***** Hence *****, we propose to leverage weak supervision approaches, namely incomplete supervision, inaccurate supervision, and a hybrid supervision approach and evaluate both generic and domain-specific, ELMo, and BERT embeddings using sequence tagging models. | ||
| 2021.dravidianlangtech-1.16 ***** Hence *****, the content posted by a troll needs to be identified and dealt with before causing any more damage. | ||
| 2021.argmining-1.3 ***** Hence *****, the features that determine argument similarity remain elusive. | ||
| C16-1021 ***** Hence ***** the generated summaries suffer from a lack of readability. | ||
| 2020.lrec-1.739 ***** Hence *****, it is not of easy access to the Deaf community | ||
| 2020 | 157 | |
| 2020.wmt-1.98 We build on recent work studying how to improve BLEU by using diverse automatically paraphrased references (Bawden et al., ***** 2020 *****), extending experiments to the multilingual setting for the WMT***** 2020 ***** metrics shared task and for three base metrics. | ||
| 2021.emnlp-main.694 Previous implementations of this technique (Cohen et al, ***** 2020 *****) have focused on single-entity questions using a relation following operation. | ||
| 2020.emnlp-main.382 Our best model significantly outperforms mBERT, XLM-RoBERTa, and AraBERT (Antoun et al., ***** 2020 *****) in both the supervised and zero-shot transfer settings. | ||
| 2021.mrqa-1.16 One such model, REALM, (Guu et al., ***** 2020 *****) is an end-to-end dense retrieval system that uses MLM based pretraining for improved downstream QA performance. | ||
| 2020.latechclfl-1.14 While supervised models can predict literary quality ratings from textual factors quite successfully, as shown in the Riddle of Literary Quality project (Koolen et al., ***** 2020 *****), this does not prove that social factors are not important, nor can we assume that readers make judgments on literary quality in the same way and based on the same information as machine learning models | ||
| tweets | 157 | |
| S18-1095 This paper describes the Irony detection system that participates in SemEval-2018 Task 3: Irony detection in English ***** tweets *****. | ||
| P19-3026 While the last several years have witnessed a substantial growth in interests and efforts in the area of computational fact-checking, ClaimPortal is a novel infrastructure in that fact-checkers have largely skipped factual claims in ***** tweets *****. | ||
| 2020.smm4h-1.16 We specifically describe the systems designed to solve task 2: Automatic classification of multilingual ***** tweets ***** that report adverse effects, and task 3: Automatic extraction and normalization of adverse effects in English ***** tweets *****. | ||
| W17-5227 The CNN-LSTM model has two combined parts: CNN extracts local n-gram features within ***** tweets ***** and LSTM composes the features to capture long-distance dependency across ***** tweets *****. | ||
| 2020.wanlp-1.2 This paper, therefore, aims to investigate several neural network models based on Convolutional Neural Network (CNN) and Recurrent Neural Networks (RNN) to detect hate speech in Arabic ***** tweets ***** | ||
| taxonomy | 156 | |
| D17-1123 A ***** taxonomy ***** is a semantic hierarchy, consisting of concepts linked by is-a relations. | ||
| 2020.lrec-1.125 Handcrafted mappings have been proposed between markers and discourse relations on a limited set of markers and a limited set of categories, but there exists hundreds of discourse markers expressing a wide variety of relations, and there is no consensus on the ***** taxonomy ***** of relations between competing discourse theories (which are largely built in a top-down fashion). | ||
| N18-1030 It is hard to isolate the impact of these factors on the quality of the resulting ***** taxonomy ***** because organization methods are rarely compared directly. | ||
| 2020.lrec-1.327 The goals of this research have been to propose i) a spelling error ***** taxonomy ***** for ZC, formalised as an ontology and ii) an adaptive spell checking approach using Character-Based Statistical Machine Translation to correct spelling errors in ZC. | ||
| W18-4905 Based on an existing medical ***** taxonomy *****, we develop an annotation scheme and label a sample of MWEs from a Dutch corpus with semantic and grammatical features | ||
| multiword expressions | 156 | |
| W17-1716 All ***** multiword expressions ***** are a great challenge for natural language processing, but the verbal ones are particularly interesting for tasks such as parsing, as the verb is the central element in the syntactic organization of a sentence. | ||
| W17-1727 As ***** multiword expressions ***** (MWEs) exhibit a range of idiosyncrasies, their automatic detection warrants the use of many different features. | ||
| L12-1517 Light verb constructions (LVCs), such as take a walk and make a decision, are a common subclass of ***** multiword expressions ***** (MWEs), whose distinct syntactic and semantic properties call for a special treatment within a computational system. | ||
| P19-1316 The compositionality degree of ***** multiword expressions ***** indicates to what extent the meaning of a phrase can be derived from the meaning of its constituents and their grammatical relations. | ||
| U18-1009 In this paper, we perform a comparative evaluation of off-the-shelf embedding models over the task of compositionality prediction of ***** multiword expressions *****(“MWEs”). | ||
| training | 156 | |
| P17-1078 Results show that such pre***** training ***** significantly improves the model, leading to accuracies competitive to the best methods on six benchmarks. | ||
| 2021.naacl-main.269 We integrate our approach into a self-***** training ***** framework for boosting performance. | ||
| 2020.aacl-main.29 To resolve the cold start problem in ***** training *****, we propose a method using a pseudo data generator which generates pseudo texts and KB triples for learning an initial model. | ||
| 2020.sltu-1.7 Overall, we show that the proposed multilingual graphemic hybrid ASR with various data augmentation can not only recognize any within ***** training ***** set languages, but also provide large ASR performance improvements. | ||
| W18-3002 Recent work in machine translation has demonstrated that self-attention mechanisms can be used in place of recurrent neural networks to increase *****training***** speed without sacrificing model accuracy. | ||
| MRC | 155 | |
| 2021.emnlp-main.214 In this paper, we take a new perspective to address the data sparsity issue faced by implicit EAE, by bridging the task with machine reading comprehension (***** MRC *****). | ||
| D18-1453 These results suggest that one might overestimate recent advances in ***** MRC *****. | ||
| 2021.acl-short.120 The experimental results show that ***** MRC ***** models do not perform well on the challenge test set. | ||
| N19-1271 Empirical study shows that our approach can be applied to many existing ***** MRC ***** models. | ||
| P19-1226 Recently, pre-trained language models (LMs), especially BERT, have achieved remarkable success, presenting new state-of-the-art results in ***** MRC ***** | ||
| Temporal | 155 | |
| L12-1015 *****Temporal***** expressions are words or phrases that describe a point, duration or recurrence in time. | ||
| N18-1061 *****Temporal***** orientation refers to an individual's tendency to connect to the psychological concepts of past, present or future, and it affects personality, motivation, emotion, decision making and stress coping processes. | ||
| L06-1394 *****Temporal***** annotation is a complex task characterized by low markup speed and low inter-annotator agreement scores. | ||
| W19-0401 *****Temporal***** notions based on a finite set A of properties are represented in strings, on which projections are defined that vary the granularity A. | ||
| L06-1156 *****Temporal***** relations between events and times are often difficult to discover, time-consuming and expensive. | ||
| Ontology | 155 | |
| L08-1558 We outline a representation for encoding light linguistic features of Compound Nominal term mentions of Concepts within an ***** Ontology ***** as well as a lightweight semantic annotator which compiles the above linguistic information into efficient Dictionary formats to drive large scale identification and semantic annotation of the aforementioned concepts. | ||
| L10-1633 *****Ontology***** population from text is becoming increasingly important for NLP applications. | ||
| 2021.emnlp-main.842 *****Ontology***** Alignment is an important research problem applied to various fields such as data integration, data transfer, data preparation, etc. | ||
| L14-1041 *****Ontology***** mediators often demand extensive configuration, or even the adaptation of the input ontologies for remedying unsupported modeling patterns. | ||
| L14-1658 *****Ontology***** alignment is a key process for enabling interoperability between ontology-based systems in the Linked Open Data age. | ||
| multiword | 155 | |
| L06-1374 We present a detailed description of the task, including the main criteria for difficult cases in the edition of the senses and the tagging of the corpus, with special mention to ***** multiword ***** entries. | ||
| L16-1364 Despite the availability of language resource catalogs and the inventory of ***** multiword ***** datasets on the SIGLEX-MWE website, ***** multiword ***** resources are scattered and difficult to find. | ||
| C16-1046 Much previous research on ***** multiword ***** expressions (MWEs) has focused on the token- and type-level tasks of MWE identification and extraction, respectively. | ||
| Q14-1016 We present a novel representation, evaluation measure, and supervised models for the task of identifying the ***** multiword ***** expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation. | ||
| 2020.mwe-1.14 We present edition 1.2 of the PARSEME shared task on identification of verbal ***** multiword ***** expressions (VMWEs) | ||
| anaphora | 155 | |
| 2020.acl-main.132 Most previous studies on bridging ***** anaphora ***** resolution (Poesio et al., 2004; | ||
| 2020.crac-1.7 Lexical semantics and world knowledge are crucial for interpreting bridging ***** anaphora *****. | ||
| 2021.codi-sharedtask.7 We compare our team's systems to others submitted for the CODI-CRAC 2021 Shared-Task on ***** anaphora ***** resolution in dialogue. | ||
| 2020.nlptea-1.16 However, the major challenge with such texts is the difficulty in aligning the expressed opinions to the concerned political leaders as this entails a non-trivial task of named-entity recognition and ***** anaphora ***** resolution | ||
| 2020.coling-main.435 One critical issue of zero ***** anaphora ***** resolution (ZAR) is the scarcity of labeled data. | ||
| orthographic | 154 | |
| W18-6224 We find that our approach can outperform multiple baselines, and offers an elegant and effective solution to the problem of ***** orthographic ***** variance in tweets. | ||
| 2021.eacl-srw.3 However, there is accumulating evidence that ***** orthographic ***** information could also have an impact on auditory word recognition. | ||
| 2020.lrec-1.503 This paper describes the design, collection, ***** orthographic ***** transcription, and phonetic annotation of SpiCE, a new corpus of conversational Cantonese-English bilingual speech recorded in Vancouver, Canada. | ||
| 2020.lrec-1.375 This paper introduces Cifu, a lexical database for Hong Kong Cantonese (HKC) that offers phonological and ***** orthographic ***** information, frequency measures, and lexical neighborhood information for lexical items in HKC. | ||
| P18-2062 Recent embedding-based methods in bilingual lexicon induction show good results, but do not take advantage of ***** orthographic ***** features, such as edit distance, which can be helpful for pairs of related languages | ||
| phonological | 154 | |
| C16-1328 In particular, we show that ***** phonological ***** features outperform character-based models. | ||
| D17-1112 Cognitive biases toward ***** phonological ***** and syntactic predictability in speech are rooted in the limitations of human memory (Baddeley et al., 1998); compressed representations are easier to acquire and retain in memory. | ||
| 2014.lilt-11.2 I discuss the need for grammar (a.k.a. abstraction), the contents of individual grammars (a potentially infinite number of constructions, paradigmatic mappings and predictive relationships between ***** phonological ***** units), the computational characteristics of constructions (complex non-crossover interactions among partially redundant features), resolution of competition among constructions (probability matching), and the need for multimodel inference in modeling internal grammars underlying the linguistic performance of a community. | ||
| W18-5819 Probabilistic approaches have proven themselves well in learning ***** phonological ***** structure. | ||
| 2004.jeptalnrecital-poster.14 In this paper we apply such structures to ***** phonological ***** data and demonstrate how such representations can have practical and beneficial applications in computational lexicography | ||
| unstructured | 154 | |
| D19-5804 As one of the first studies on exploiting ***** unstructured ***** external knowledge for subject-area QA, we hope our methods, observations, and discussion of the exposed limitations may shed light on further developments in the area. | ||
| S18-1112 Reliably detecting relevant relations between entities in ***** unstructured ***** text is a valuable resource for knowledge extraction, which is why it has awaken significant interest in the field of Natural Language Processing. | ||
| L08-1263 The entire process in based on the use of minimal language-dependent tools, no external linguistic resources, and merely free, ***** unstructured ***** text. | ||
| 2021.dialdoc-1.16 This work focuses on responding to these beyond-API-coverage user turns by incorporating external, ***** unstructured ***** knowledge sources | ||
| P17-1103 Automatically evaluating the quality of dialogue responses for *****unstructured***** domains is a challenging problem. | ||
| distributional semantics | 154 | |
| L14-1534 To extract pairs of synonyms of multi-word terms, we propose in this paper an unsupervised semi-compositional method that makes use of ***** distributional semantics ***** and exploit the compositional property shared by most MWT. | ||
| S17-2008 Our submission to SemEval was an update of previous work that builds high-quality, multilingual word embeddings from a combination of ConceptNet and ***** distributional semantics *****. | ||
| Q13-1029 There have been several efforts to extend ***** distributional semantics ***** beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). | ||
| C18-2003 This tool uniquely combines state-of-the-art ***** distributional semantics ***** with a nuanced model of human emotions, two information streams we deem beneficial for a data-driven interpretation of texts in the humanities. | ||
| D18-1023 We construct a multilingual common semantic space based on ***** distributional semantics *****, where words from multiple languages are projected into a shared space via which all available resources and knowledge can be shared across multiple languages. | ||
| speech translation | 154 | |
| 2020.coling-main.314 We introduce dual-decoder Transformer, a new model architecture that jointly performs automatic speech recognition (ASR) and multilingual ***** speech translation ***** (ST). | ||
| 2020.iwslt-1.2 This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, offline ***** speech translation ***** and simultaneous ***** speech translation *****. | ||
| L10-1417 We describe a multilingual Open Source CALL game, CALL-SLT, which reuses *****speech translation***** technology developed using the Regulus platform to create an automatic conversation partner that allows intermediate-level language students to improve their fluency. | ||
| 2019.iwslt-1.7 This paper describes our end-to-end speech translation system for the *****speech translation***** task of lectures and TED talks from English to German for IWSLT Evaluation 2019. | ||
| 1999.mtsummit-1.16 Speech communication includes many important issues on natural language processing and they are related with desirable advanced *****speech translation***** systems. | ||
| Reddit | 153 | |
| W18-1112 We address this problem by introducing a large-scale dataset derived from ***** Reddit *****, a source so far overlooked for personality prediction. | ||
| P19-1255 Automatic evaluation on a large-scale dataset collected from ***** Reddit ***** shows that our model yields significantly higher BLEU, ROUGE, and METEOR scores than the state-of-the-art and non-trivial comparisons. | ||
| W19-3009 We collected and analyzed a large corpus of ***** Reddit ***** posts from users claiming to have received a formal diagnosis of SZ and identified several linguistic features that differentiated these users from a control (CTL) group. | ||
| 2020.lrec-1.774 ***** Reddit ***** is a popular online platform combining social news aggregation, discussion and micro-blogging. | ||
| W18-6211 We present a preliminary study on bipolar disorder prediction from user-generated text on ***** Reddit *****, which relies on users' self-reported labels | ||
| machine translation evaluation | 153 | |
| 2021.humeval-1.5 Our classification-based approach focuses on such errors using several error type labels, for practical ***** machine translation evaluation ***** in an age of neural machine translation. | ||
| 2005.mtsummit-invited.7 Since 1994, China's HTRDP ***** machine translation evaluation ***** has been conducted for five times. | ||
| W19-5357 The well known Meteor metric improves ***** machine translation evaluation ***** by introducing paraphrase knowledge. | ||
| W19-8704 We propose a metric for ***** machine translation evaluation ***** based on frame semantics which does not require the use of reference translations or human corrections, but is aimed at comparing original and translated output directly. | ||
| 2012.amta-papers.26 We propose straightforward implementations of translation memory (TM) functionality for research purposes, using *****machine translation evaluation***** metrics as similarity functions. | ||
| computation | 152 | |
| 1995.iwpt-1.7 This check is based upon the composition of simple relations and does not require any ***** computation ***** of symbol stacks. | ||
| 2021.emnlp-main.327 2) Densely sampled candidate moments cause redundant ***** computation ***** and degrade the performance of ranking process. | ||
| W17-2631 We demonstrate two approaches to reducing unnecessary ***** computation ***** in cases where a fast but weak baseline classifier and a stronger, slower model are both available. | ||
| D18-1160 We show that the invertibility condition allows for efficient exact inference and marginal likelihood ***** computation ***** in our model so long as the prior is well-behaved. | ||
| W17-3208 During the mini-batched training process, it is necessary to pad shorter sentences in a mini-batch to be equal in length to the longest sentence therein for efficient ***** computation ***** | ||
| UD | 152 | |
| 2020.iwpt-1.23 These results show that much of the information needed to construct E***** UD ***** graphs from ***** UD ***** trees is present in the ***** UD ***** trees. | ||
| W18-4918 Our results indicate that pure SD to ***** UD ***** conversion is highly accurate across multiple genres, resulting in around 1.5% errors, but can be improved further to fewer than 0.5% errors given access to annotations beyond the pure syntax tree, such as entity types and coreference resolution, which are necessary for correct generation of several ***** UD ***** relations. | ||
| 2020.acl-main.239 We show improved results on the CoNLL02 NER and ***** UD ***** 1.2 POS datasets and demonstrate the power of the method for transfer learning with low-resources achieving 0.6 F1 score in Dutch using only one sample from it. | ||
| 2021.law-1.12 We raised and discussed these issues within the community on the official ***** UD ***** portal | ||
| 2021.law-1.5 In this paper we investigate the possibility of extracting predicate-argument relations from *****UD***** trees (and enhanced UD graphs). | ||
| Word | 152 | |
| S17-2040 We use FarsNet, the Persian ***** Word ***** Net, besides deep learning techniques to extract the similarity of words. | ||
| D19-5902 We also explored the ratings with the semantic labels used in the `***** Word ***** List by Semantic Principles'. | ||
| 2020.semeval-1.5 This paper describes the system submitted by our team (BabelEnconding) to SemEval-2020 Task 3: Predicting the Graded Effect of Context in *****Word***** Similarity. | ||
| 2020.acl-main.484 *****Word***** embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models. | ||
| P19-1352 *****Word***** embedding is central to neural machine translation (NMT), which has attracted intensive research interest in recent years. | ||
| spoken language understanding | 152 | |
| 2021.eacl-main.159 We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on ***** spoken language understanding ***** tasks. | ||
| 2020.findings-emnlp.244 Visually-grounded models of ***** spoken language understanding ***** extract semantic information directly from speech, without relying on transcriptions. | ||
| L12-1438 The PORTMEDIA project is intended to develop new corpora for the evaluation of ***** spoken language understanding ***** systems. | ||
| L14-1613 In this paper we address the problem of creating multilingual aligned corpora and its evaluation in the context of a ***** spoken language understanding ***** (SLU) porting task. | ||
| 2020.emnlp-main.587 Despite the promising results of current cross-lingual models for *****spoken language understanding***** systems, they still suffer from imperfect cross-lingual representation alignments between the source and target languages, which makes the performance sub-optimal. | ||
| processing | 152 | |
| W17-1411 We investigate whether word embeddings offer any advantage over corpus- and pre***** processing *****-free string kernels, and how these compare to bag-of-words baselines. | ||
| L14-1333 We discuss our specifications, pre-***** processing ***** and evaluation | ||
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur machine translation system which consists of verbal suffix ***** processing *****, case suffix ***** processing *****, phonetic change ***** processing *****, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| 2020.winlp-1.17 In the following, we present a system for assisted typing in LS whose accuracy and speed is largely due to the deployment of real time natural-language ***** processing ***** enabling efficient prediction and context-sensitive grammar support. | ||
| L08-1412 The high level of heterogeneity between linguistic annotations usually complicates the interoperability of *****processing***** modules within an NLP pipeline. | ||
| Transfer | 151 | |
| C18-1185 In this paper, we propose to use a sequence to sequence model for Named Entity Recognition (NER) and we explore the effectiveness of such model in a progressive NER setting – a ***** Transfer ***** Learning (TL) setting. | ||
| 2021.acl-long.154 *****Transfer***** learning has yielded state-of-the-art (SoTA) results in many supervised NLP tasks. | ||
| 2021.acl-short.108 *****Transfer***** learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. | ||
| 2021.eacl-demos.22 *****Transfer***** learning, particularly approaches that combine multi-task learning with pre-trained contextualized embeddings and fine-tuning, have advanced the field of Natural Language Processing tremendously in recent years. | ||
| 2021.acl-long.381 *****Transfer***** learning from pretrained language models recently became the dominant approach for solving many NLP tasks. | ||
| modalities | 150 | |
| P17-2031 Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or ***** modalities *****. | ||
| 2020.coling-main.176 We present a Transformer model that integrates text and image ***** modalities ***** and attends to textual features from visual features in generating a caption. | ||
| 2021.mmtlrl-1.7 Multimodal Neural Machine Translation (MNMT) is an interesting task in natural language processing (NLP) where we use visual ***** modalities ***** along with a source sentence to aid the source to target translation process. | ||
| 2019.icon-1.1 We further attempt to establish that using such a word representation as input makes the model robust to unseen words, particularly arising due to tokenization and spelling errors, which is a common problem in systems where a typing interface is one of the input ***** modalities *****. | ||
| W19-1608 A comparison of our vector representations to human semantic judgments indicates that different bias (functional or geometric) is captured in different data collection tasks which suggests that the contribution of the two meaning ***** modalities ***** is dynamic, related to the context of the task | ||
| subword | 150 | |
| 2020.findings-emnlp.414 We analyze differences between BPE and unigram LM tokenization, finding that the latter method recovers ***** subword ***** units that align more closely with morphology and avoids problems stemming from BPE's greedy construction procedure. | ||
| 2021.ranlp-1.120 We also show that expensive parameter optimization can be replaced by a simple n-gram coverage model that consistently improves the accuracy of fastText models on the word analogy tasks by up to 3% compared to the default ***** subword ***** sizes, and that it is within 1% accuracy of the optimal ***** subword ***** sizes. | ||
| D18-2012 It provides open-source C++ and Python implementations for ***** subword ***** units. | ||
| 2021.nodalida-main.18 Given the “gold” nature of the resource, it is possible to use it for empirical studies as well as to develop linguistically-aware algorithms for morpheme segmentation and labeling (cf. statistical ***** subword ***** approach). | ||
| D19-5506 Contemporary machine translation systems achieve greater coverage by applying *****subword***** models such as BPE and character-level CNNs, but these methods are highly sensitive to orthographical variations such as spelling mistakes. | ||
| aspect | 150 | |
| C18-1066 Both industry and academia have realized the importance of the relationship between ***** aspect ***** term and sentence, and made attempts to model the relationship by designing a series of attention models. | ||
| 2020.coling-main.72 Most of the ***** aspect ***** based sentiment analysis research aims at identifying the sentiment polarities toward some explicit ***** aspect ***** terms while ignores implicit ***** aspect *****s in text. | ||
| 2021.naacl-main.34 In this work, we present the Arg-CTRL - a language model for argument generation that can be controlled to generate sentence-level arguments for a given topic, stance, and ***** aspect *****. | ||
| 2020.coling-main.545 In this paper, we extend similarity with ***** aspect ***** information by performing a pairwise document classification task | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the ***** aspect *****ual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for ***** aspect *****ual composition), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. | ||
| generalize | 149 | |
| 2020.inlg-1.44 (3) We carefully select and ***** generalize ***** human review sentences into templates, and apply these templates to transform the review scores and evidence into natural language comments. | ||
| 2021.dash-1.8 We conclude with a discussion of future work to determine if and how the results ***** generalize ***** to other classification tasks. | ||
| W17-2601 In this article, we ***** generalize ***** the method described by Erk and Padó (2009) by proposing a dependency-based framework that contextualizes not only lemmas but also selectional preferences. | ||
| 2020.emnlp-main.361 The layers are specific to one language (as opposed to bilingual adapters) allowing to compose them and ***** generalize ***** to unseen language-pairs. | ||
| P19-1031 Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing—but to what degree do they ***** generalize ***** to other domains | ||
| predicate | 149 | |
| 2020.lrec-1.11 In languages like Arabic, Chinese, Italian, Japanese, Korean, Portuguese, Spanish, and many others, ***** predicate ***** arguments in certain syntactic positions are omitted rather than realized as overt pronouns, and are thus called zero- or null-pronouns. | ||
| L14-1011 This research focuses on expanding PropBank, a corpus annotated with ***** predicate ***** argument structures, with new ***** predicate ***** types; namely, noun, adjective and complex ***** predicate *****s, such as Light Verb Constructions. | ||
| N19-1309 Open Information Extraction (OpenIE), the problem of harvesting triples from natural language text whose ***** predicate ***** relations are not aligned to any pre-defined ontology, has been a popular subject of research for the last decade. | ||
| L08-1597 In AnCora-Verb lexicons, the mapping between syntactic functions, arguments and thematic roles of each verbal ***** predicate ***** is established taking into account the verbal semantic class and the diatheses alternations in which the ***** predicate ***** can participate | ||
| 2020.findings-emnlp.440 We study the potential synergy between two different NLP tasks, both confronting predicate lexical variability: identifying *****predicate***** paraphrases, and event coreference resolution. | ||
| keyphrase | 149 | |
| 2021.emnlp-main.146 Keyword or ***** keyphrase ***** extraction is to identify words or phrases presenting the main topics of a document. | ||
| 2021.acl-long.111 The experimental results on seven ***** keyphrase ***** generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin. | ||
| C16-1277 Keyphrase annotation is either carried out by extracting the most important phrases from a document, ***** keyphrase ***** extraction, or by assigning entries from a controlled domain-specific vocabulary, ***** keyphrase ***** assignment. | ||
| 2021.eacl-main.136 In this paper, we present KPRank, an unsupervised graph-based algorithm for ***** keyphrase ***** extraction that exploits both positional information and contextual word embeddings into a biased PageRank. | ||
| N18-2105 We further introduce a novel mechanism to incorporate ***** keyphrase ***** selection preferences into the model | ||
| deep neural | 149 | |
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on ***** deep neural ***** networks, which makes decisions about form and content in one go without explicit feature extraction. | ||
| 2020.emnlp-main.255 To demystify the “black box” property of ***** deep neural ***** networks for natural language processing (NLP), several methods have been proposed to interpret their predictions by measuring the change in prediction probability after erasing each token of an input. | ||
| W19-6143 Named Entity Recognition (NER) has greatly advanced by the introduction of ***** deep neural ***** architectures. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by ***** deep neural ***** networks (DNN) that can utilize the contextual information to encode the input into latent variables, and a decoder which is a generative model able to reconstruct the input. | ||
| N18-1044 In this paper, we propose a ***** deep neural ***** network diachronic distributional model. | ||
| source | 148 | |
| 2020.aacl-srw.13 However, they rely on an overlap between ***** source ***** and target vocabularies. | ||
| 2020.acl-main.143 AT uses the bilingual dictionary to establish anchoring points for closing the gap between ***** source ***** language and target language. | ||
| 2021.acl-long.207 Previous work has explored the similarity between ***** source ***** and target sentences as an approximate measure of strength for different ***** source ***** models. | ||
| 1999.mtsummit-1.61 The method is ***** source ***** language independent, and can be used for systems translating from any language into English. | ||
| W17-2506 The obtained parallel corpora are especially suitable for speech-to-speech translation applications when a prosody transfer between ***** source ***** and target languages is desired | ||
| multi-head attention | 148 | |
| 2021.eacl-main.264 Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel *****multi-head attention***** mechanism. | ||
| 2021.ranlp-1.52 In *****multi-head attention***** mechanism, different heads attend to different parts of the input. | ||
| C18-1154 We propose vector-based *****multi-head attention***** that includes the widely used max pooling, mean pooling, and scalar self-attention as special cases. | ||
| 2020.findings-emnlp.178 We introduce DropHead, a structured dropout method specifically designed for regularizing the *****multi-head attention***** mechanism which is a key component of transformer. | ||
| 2020.coling-main.342 Recent research on the *****multi-head attention***** mechanism, especially that in pre-trained models such as BERT, has shown us heuristics and clues in analyzing various aspects of the mechanism. | ||
| Sign | 147 | |
| 2020.signlang-1.29 In this paper we describe STS-korpus, a web corpus tool for Swedish ***** Sign ***** Language (STS) which we have built during the past year, and which is now publicly available on the internet. | ||
| L12-1432 In this paper, we describe DEGELS1, a comparable corpus of French ***** Sign ***** Language and co-speech gestures that has been created to serve as a testbed corpus for the DEGELS workshops. | ||
| 2020.lrec-1.737 *****Sign***** languages are complex languages. | ||
| 2010.iwslt-papers.17 *****Sign***** languages represent an interesting niche for statistical machine translation that is typically hampered by the scarceness of suitable data, and most papers in this area apply only a few, well-known techniques and do not adapt them to small-sized corpora. | ||
| L12-1061 *****Sign***** language is used by many people who were born deaf or who became deaf early in life as their first and/or preferred language. | ||
| captions | 147 | |
| C18-1221 We constructed a set of 19,954 examples of 4,365 ambiguous acronyms from image ***** captions ***** in scientific papers along with their contextually correct definition from different domains. | ||
| P19-1641 Motivated by video dense captioning, we propose a model to generate procedure ***** captions ***** from narrated instructional videos which are a sequence of step-wise clips with description. | ||
| P18-1241 Our extensive experiments show that our algorithm can successfully craft visually-similar adversarial examples with randomly targeted ***** captions ***** or keywords, and the adversarial examples can be made highly transferable to other image captioning systems. | ||
| C16-1313 In this paper, we propose a new approach to obtain the relationship between concepts by exploiting the syntactic dependencies between words in the image ***** captions *****. | ||
| 2021.conll-1.14 Further, we evaluate the quality of these guided ***** captions ***** when trained on Conceptual Captions which contain 3.3M image-level ***** captions ***** compared to Visual Genome which contain 3.6M object-level ***** captions ***** | ||
| emotion | 147 | |
| 2020.coling-main.393 Lately, quite a few datasets have been made available for dialogue ***** emotion ***** and sentiment classification, but these datasets are imbalanced in representing different ***** emotion *****s and consist of only a single ***** emotion *****. | ||
| L10-1061 Furthermore, investigations concerning the ability of the annotators to map certain expressions onto the developed ***** emotion ***** models are lacking proof. | ||
| N18-1052 In the framework, the objective loss function is designed elaborately so that both ***** emotion ***** prediction and rankings of only relevant ***** emotion *****s can be achieved. | ||
| W18-6214 However, because WSCs can break the syntax of the major text, it poses more challenges in Natural Language Processing (NLP) tasks like ***** emotion ***** classification. | ||
| 2021.wassa-1.8 We evaluate our corpus on benchmark datasets for both ***** emotion ***** and sentiment classification, obtaining competitive results | ||
| Knowledge | 146 | |
| P17-2051 We propose jointly modelling *****Knowledge***** Bases and aligned text with Feature-Rich Networks. | ||
| S19-1016 *****Knowledge***** graphs, which provide numerous facts in a machine-friendly format, are incomplete. | ||
| D19-1522 *****Knowledge***** graphs are structured representations of real world facts. | ||
| W19-1910 *****Knowledge***** discovery from text in natural language is a task usually aided by the manual construction of annotated corpora. | ||
| P17-1088 *****Knowledge***** bases are important resources for a variety of natural language processing tasks but suffer from incompleteness. | ||
| deep neural networks | 146 | |
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on ***** deep neural networks *****, which makes decisions about form and content in one go without explicit feature extraction. | ||
| 2020.emnlp-main.255 To demystify the “black box” property of ***** deep neural networks ***** for natural language processing (NLP), several methods have been proposed to interpret their predictions by measuring the change in prediction probability after erasing each token of an input. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by ***** deep neural networks ***** (DNN) that can utilize the contextual information to encode the input into latent variables, and a decoder which is a generative model able to reconstruct the input. | ||
| 2020.vardial-1.23 From simple models for regression, such as Support Vector Regression, to ***** deep neural networks *****, such as Long Short-Term Memory networks and character-level convolutional neural networks, and, finally, to ensemble models based on meta-learners, such as XGBoost, our interest is focused on approaching the problem from a few different perspectives, in an attempt to minimize the prediction error. | ||
| 2020.acl-main.540 Adversarial attacks are carried out to reveal the vulnerability of ***** deep neural networks *****. | ||
| recurrent neural | 146 | |
| Q16-1036 The first model uses an end-to-end ***** recurrent neural ***** network. | ||
| Q14-1017 DT-RNNs outperform other recursive and ***** recurrent neural ***** networks, kernelized CCA and a bag-of-words baseline on the tasks of finding an image that fits a sentence description and vice versa. | ||
| K19-1062 The model consists of 1) a ***** recurrent neural ***** network (RNN) to learn scoring functions for pair-wise relations, and 2) a structured support vector machine (SSVM) to make joint predictions. | ||
| N19-1112 To investigate the transferability of contextual word representations, we quantify differences in the transferability of individual layers within contextualizers, especially between ***** recurrent neural ***** networks (RNNs) and transformers. | ||
| S17-2141 Since two submissions were allowed, two different machine learning methods were developed to solve this task, a support vector machine approach and a ***** recurrent neural ***** network approach. | ||
| attention | 146 | |
| I17-4035 In this paper, we propose the use of an ***** attention *****-based LSTM (AT-LSTM) model for these tasks. | ||
| W19-4324 The most recent successes are predominantly due to the use of different variations of ***** attention ***** mechanisms, but their cognitive plausibility is questionable. | ||
| W18-5427 However, standard ***** attention ***** models are of limited interpretability for tasks that involve a series of inference steps. | ||
| W19-5409 More specifically, one of the proposed approaches employs the translation knowledge between the two languages from two different translation directions; while the other one employs extra monolingual knowledge from both source and target sides, obtained by pre-training deep self-***** attention ***** networks. | ||
| N18-1137 Experimental results demonstrate that models trained with content-specific objectives improve upon a vanilla encoder-decoder which solely relies on soft ***** attention *****. | ||
| sentences | 145 | |
| W18-6123 We describe the FrameIt System that provides a workflow for (1) quickly discovering an ontology to model a text corpus and (2) learning an SRL model that extracts the instances of the ontology from ***** sentences ***** in the corpus. | ||
| R19-1047 We create a manually annotated proposition dataset from ***** sentences ***** taken from restaurant reviews that distinguishes between clauses that need to be split and those that do not. | ||
| D18-1107 In the sentence classification task, context formed from ***** sentences ***** adjacent to the sentence being classified can provide important information for classification. | ||
| 2020.fnp-1.17 Next, we build the kernel Matrix L for the intermediate document, which represents the quality of its ***** sentences *****. | ||
| 2021.rocling-1.19 In addition, it is found that conversation behavior types such as “Statement-non-opinion”, “Signal-non-understanding” and “Appreciation” are more related to question ***** sentences *****, while “Wh-Question”, “Yes-No-Question” and “Rhetorical-Question” questions are more related to chat ***** sentences ***** | ||
| essay scoring | 145 | |
| P19-1390 Nevertheless, progress on dimension-specific ***** essay scoring ***** is limited in part by the lack of annotated corpora. | ||
| 2020.bea-1.8 A lot of research in the last decade has dealt with automatic holistic ***** essay scoring ***** - where a machine rates an essay and gives a score for the essay. | ||
| N18-1021 This may affect automated ***** essay scoring ***** models in many ways, as these models are typically designed to model (potentially biased) essay raters. | ||
| D18-1090 In order to address this issue, we propose a reinforcement learning framework for ***** essay scoring ***** that incorporates quadratic weighted kappa as guidance to optimize the scoring system. | ||
| W17-5017 The inputs to neural ***** essay scoring ***** models – ngrams and embeddings – are arguably well-suited to evaluate content in short answer scoring tasks. | ||
| encoded | 144 | |
| W16-5320 So far, we have implemented 100 simple and 500 complex lexical functions, and ***** encoded ***** about 8,000 syntagmatic and 46,000 paradigmatic relations, for the French language. | ||
| 2021.emnlp-main.111 We present the first dataset comprising stereotypical attributes of a range of social groups and propose a method to elicit stereotypes ***** encoded ***** by pretrained language models in an unsupervised fashion. | ||
| P18-1076 We introduce a neural reading comprehension model that integrates external commonsense knowledge, ***** encoded ***** as a key-value memory, in a cloze-style setting. | ||
| W19-4210 We frame this as a sequence generation task and employ a neural encoder-decoder (seq2seq) architecture to generate the sequence of MSD tags given the ***** encoded ***** representation of each token. | ||
| 2021.emnlp-main.116 We experiment with several NLU datasets and known biases, and show that, counter-intuitively, the more a language model is pushed towards a debiased regime, the more bias is actually ***** encoded ***** in its inner representations | ||
| pronouns | 144 | |
| P18-3022 This paper presents a system that automatically generates multiple, natural language questions using relative ***** pronouns ***** and relative adverbs from complex English sentences. | ||
| 2020.lrec-1.447 In particular, ***** pronouns ***** of subject, object, and possessive cases are often omitted in Japanese; these are known as zero ***** pronouns *****. | ||
| W19-4108 We tackle the problem of context reconstruction in Chinese dialogue, where the task is to replace ***** pronouns *****, zero ***** pronouns *****, and other referring expressions with their referent nouns so that sentences can be processed in isolation without context. | ||
| 2020.emnlp-main.157 We report that state-of-the-art parsers consistently failed to identify “hers” and “theirs” as ***** pronouns ***** but identified the masculine equivalent “his”. | ||
| D19-6504 Some target languages need to add or specialize words that are omitted or ambiguous in the source languages (e.g, zero ***** pronouns ***** in translating Japanese to English or epicene ***** pronouns ***** in translating English to French) | ||
| collocations | 144 | |
| L10-1468 Furthermore, I explain the main criteria for the composition of the dictionary, in addition to its integration with a Virtual Learning Environment (VLE), aimed at supporting learning activities on ***** collocations *****. | ||
| C16-2019 In this demo, we present our free on-line multilingual linguistic services which allow to analyze sentences or to extract ***** collocations ***** from a corpus directly on-line, or by uploading a corpus. | ||
| L14-1128 These two aggregation methods are especially well suited for the task, since the results of each individual method naturally form a ranking of ***** collocations *****. | ||
| 2020.lrec-1.538 The dataset can serve as a reliable empirical basis for comparing different theoretical frameworks concerned with ***** collocations ***** or as material for data-driven approaches to the studies of ***** collocations ***** including different machine learning experiments | ||
| W17-1706 In our approach, priority is given to parsing alternatives involving ***** collocations *****, and hence collocational information helps the parser through the maze of alternatives, with the aim to lead to substantial improvements in the performance of both tasks (collocation identification and parsing), and in that of a subsequent task (machine translation). | ||
| Topic | 144 | |
| 2020.findings-emnlp.299 Then we design a ***** Topic ***** Knowledge Graph enhanced decoder | ||
| 2020.emnlp-main.138 *****Topic***** models have been prevailing for many years on discovering latent semantics while modeling long documents. | ||
| 2021.acl-long.299 *****Topic***** models have been widely used to learn text representations and gain insight into document corpora. | ||
| 2021.sigdial-1.17 *****Topic***** diversion occurs frequently with engaging open-domain dialogue systems like virtual assistants. | ||
| D17-1139 *****Topic***** segmentation plays an important role for discourse parsing and information retrieval. | ||
| pronoun | 144 | |
| 2021.acl-long.138 A second (multi-relational) GCN is then applied to the utterance states to produce a discourse relation-augmented representation for the utterances, which are then fused together with token states in each utterance as input to a dropped ***** pronoun ***** recovery layer. | ||
| W18-0710 More generally, it confirms that a computational modeling approach is able to dissociate different dimensions that are involved in the complex process of ***** pronoun ***** resolution in the brain. | ||
| 2021.acl-long.200 Compared to sentence-level ST, context-aware ST obtains better translation quality (+0.18-2.61 BLEU), improves ***** pronoun ***** and homophone translation, shows better robustness to (artificial) audio segmentation errors, and reduces latency and flicker to deliver higher quality for simultaneous translation. | ||
| 2021.crac-1.16 We also highlight the enormous room for improving the linker and show that the rest of its errors mainly involve ***** pronoun ***** resolution. | ||
| D19-1294 With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for ***** pronoun ***** translation, covering multiple source languages and different ***** pronoun ***** errors drawn from real system translations, for English | ||
| learner | 143 | |
| L10-1560 This paper describes the Error-Annotated German Learner Corpus (EAGLE), a corpus of beginning ***** learner ***** German with grammatical error annotation. | ||
| W17-0807 This paper presents a quality assessment scheme for English-to-Japanese translations produced by ***** learner ***** translators at university. | ||
| Q19-1001 We present a corrected and error-tagged corpus of Russian ***** learner ***** writing and develop models that make use of existing state-of-the-art methods that have been well studied for English. | ||
| 2020.conll-1.7 We present a method for classifying syntactic errors in ***** learner ***** language, namely errors whose correction alters the morphosyntactic structure of a sentence | ||
| W19-4447 In view of the influence of the first language on ***** learner *****s, we further propose an effective approach to improve the quality of the suggested sentences. | ||
| labeling | 143 | |
| 2021.naacl-demos.10 We also propose two sentence selection approaches, an embedding-based selection using a dense retrieval model, and a sequence ***** labeling ***** approach for context-aware selection. | ||
| C18-1233 Semantic role ***** labeling ***** (SRL) is to recognize the predicate-argument structure of a sentence, including subtasks of predicate disambiguation and argument ***** labeling *****. | ||
| 2021.acl-srw.7 Furthermore, the identified salient sentences tend to agree with independent human ***** labeling ***** by domain experts. | ||
| 2020.findings-emnlp.235 We evaluated our methods and showed its effectiveness on four intrinsic and extrinsic tasks: word similarity, embedding numeracy, numeral prediction, and sequence ***** labeling *****. | ||
| 2020.argmining-1.6 We further report first promising results using supervised classification (F1: 0.82) and sequence ***** labeling ***** (F1: 0.72) approaches | ||
| twitter | 143 | |
| S19-2119 It consists of 13,240 tweets extracted from ***** twitter ***** and annotated at three levels using crowdsourcing. | ||
| N19-1185 Stance detection in ***** twitter ***** aims at mining user stances expressed in a tweet towards a single or multiple target entities. | ||
| N18-4018 In this paper, we analyze the problem of emotion identification in code-mixed content and present a Hindi-English code-mixed corpus extracted from ***** twitter ***** and annotated with the associated emotion. | ||
| S17-2102 Recently, neural ***** twitter ***** sentiment classification has become one of the state-of-the-art approaches, which relies on less feature engineering work compared with traditional methods. | ||
| L14-1101 Due to the relatively small number of German messages on Twitter, it is possible to collect a virtually complete snapshot of German ***** twitter ***** messages over a period of time | ||
| language inference | 143 | |
| 2020.acl-main.177 As a case study, we perform a series of experiments in the setting of natural ***** language inference ***** (NLI). | ||
| 2021.starsem-1.27 We show that examples that depend critically on a rarer word are more challenging for natural ***** language inference ***** models. | ||
| 2020.findings-emnlp.39 Natural ***** language inference ***** (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). | ||
| 2020.acl-main.645 In this paper, we investigate the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and natural ***** language inference ***** tasks. | ||
| D18-1007 We present a large-scale collection of diverse natural ***** language inference ***** (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. | ||
| Hindi | 142 | |
| 2018.gwc-1.49 We describe our efforts to use pre-existing implementations for WaveNet - a model to generate raw audio using neural nets (Oord et al., 2016) and generate speech for ***** Hindi *****. | ||
| W18-4409 Two conditions were tested: (i) one with standard pre-trained fastText word embeddings where each ***** Hindi ***** word is treated as an OOV token, and (ii) another where word embeddings for ***** Hindi ***** and English are loaded in a common vector space, so ***** Hindi ***** tokens can be assigned a meaningful representation. | ||
| C16-1186 In this paper, we identified the mood taxonomy and prepared multimodal mood annotated datasets for ***** Hindi ***** and Western songs. | ||
| W16-4622 We develop a system based on hierarchical phrase-based SMT for English to ***** Hindi ***** language pair. | ||
| 2021.naacl-main.82 Besides, our agent achieves the new state-of-the-art on Room-Across-Room dataset, which contains instructions in 3 languages (English, ***** Hindi *****, and Telugu) | ||
| named entities | 142 | |
| 2020.lrec-1.852 To extract characters from target books, manually created dictionaries of characters are employed because some characters appear as common nouns not as ***** named entities *****. | ||
| L16-1530 We defined 15 broad categories of biomedical ***** named entities ***** for annotation. | ||
| P17-2068 In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including ***** named entities *****. | ||
| 2021.bsnlp-1.7 In this study, we present an exploratory analysis of a Slovenian news corpus, in which we investigate the association between ***** named entities ***** and sentiment in the news. | ||
| W16-3926 The first evaluation aims at predicting the 10 fine-grained types of ***** named entities *****; while the second evaluation aims at predicting no type of ***** named entities *****. | ||
| prosodic | 141 | |
| L06-1516 In this way; a new paradigm for evaluation of the ***** prosodic ***** component of TTS systems has been successfully demonstrated. | ||
| N18-1007 For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-***** prosodic ***** features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and ***** prosodic ***** features. | ||
| L04-1164 Second, the developed ***** prosodic ***** models for prominence are insufficiently accurate to produce automatic prominence annotations that are as good as the manual ones. | ||
| L16-1311 The current version of the SEA_AP toolkit is capable of analysing Galician, Spanish and Brazilian Portuguese data, and hence the distances between several ***** prosodic ***** linguistic varieties can be measured at present. | ||
| 2021.conll-1.42 For instance, generating speech with fine-grained prosody control (***** prosodic ***** prominence, contextually appropriate emotions) is still an open challenge | ||
| predicates | 140 | |
| W17-2803 Multi-modal grounded language learning connects language ***** predicates ***** to physical properties of objects in the world. | ||
| D18-1548 In experiments on CoNLL-2005 SRL, LISA achieves new state-of-the-art performance for a model using predicted ***** predicates ***** and standard word embeddings, attaining 2.5 F1 absolute higher than the previous state-of-the-art on newswire and more than 3.5 F1 on out-of-domain data, nearly 10% reduction in error. | ||
| E17-3001 In semantic parsing, natural language questions map to expressions in a meaning representation language (MRL) over some fixed vocabulary of ***** predicates *****. | ||
| W17-1803 This algorithm relies on a lexicon of ESPs, specifying how these ***** predicates ***** influence the polarity of their embedded events. | ||
| 2020.pam-1.6 In this paper, I show how the previous formulation gives trivial truth values when a precise quantifier is used with vague ***** predicates ***** | ||
| wordnets | 140 | |
| L12-1186 The queries initiated by a simple or multiword keyword, in Serbian or English, can be expanded by Bibliša, both semantically and morphologically, using different supporting monolingual and multilingual resources, such as ***** wordnets ***** and electronic dictionaries. | ||
| R19-1057 Researchers use ***** wordnets ***** as a knowledge base in many natural language processing tasks and applications, such as question answering, textual entailment, discourse classification, and so forth. | ||
| 2020.lrec-1.598 Its synsets were manually validated and are linked to semantically equivalent synsets of the Princeton WordNet of English, and thus transitively to the many ***** wordnets ***** for other languages that are also linked to this English wordnet. | ||
| 2016.gwc-1.31 Adverbs are seldom well represented in ***** wordnets ***** | ||
| 2020.lrec-1.390 In this paper we discuss the experience of bringing together over 40 different ***** wordnets *****. | ||
| text summarization | 140 | |
| D19-3043 machine translation, ***** text summarization *****, image captioning and video description) usually relies heavily on task-specific metrics, such as BLEU and ROUGE. | ||
| N18-2055 Identifying the most dominant and central event of a document, which governs and connects other foreground and background events in the document, is useful for many applications, such as ***** text summarization *****, storyline generation and text segmentation. | ||
| 2020.aacl-main.52 We finally discuss how we can take advantage of a cascaded pipeline in neural ***** text summarization ***** and shed light on important directions for future research. | ||
| 2021.cl-4.27 In this direction, this work presents a novel framework that combines sequence-to-sequence neural-based ***** text summarization ***** along with structure and semantic-based methodologies. | ||
| 2021.ranlp-1.184 We collected a ***** text summarization ***** dataset of EU legal documents consisting of 1563 documents, in which the mean length of summaries is 424 words. | ||
| irony detection | 140 | |
| 2021.ranlp-1.88 In addition, considering emoji position can further improve the performance for the ***** irony detection ***** task compared to the emoji label prediction. | ||
| S18-1105 The system takes as starting point emotIDM, an ***** irony detection ***** model that explores the use of affective features based on a wide range of lexical resources available for English, reflecting different facets of affect. | ||
| S18-1096 We create a targeted feature set and analyse how different features are useful in the task of ***** irony detection *****, achieving an F1-score of 0.5914. | ||
| P18-2122 This paper addresses the issue of false-alarm hashtags in the self-labeled data for ***** irony detection *****. | ||
| 2020.findings-emnlp.148 Each year, new shared tasks and datasets are proposed, ranging from classics like sentiment analysis to ***** irony detection ***** or emoji prediction. | ||
| random forest | 140 | |
| 2020.wosp-1.10 Our best model on the leaderboard is a ***** random forest ***** classifier using only the citation context text. | ||
| W18-0527 With these features, we trained a ***** random forest ***** classifier to predict errors in early learners of French, Spanish, and English. | ||
| S17-2161 Ensemble of unsupervised models, ***** random forest ***** and linear models are used for candidate keyphrase ranking and keyphrase type classification. | ||
| W18-0533 Experimental results show that feature-based ensemble learning methods (***** random forest ***** and LambdaMART) outperform both the NN-based method and unsupervised baselines. | ||
| 2021.ltedi-1.23 In the first approach, we used contextual embeddings to train classifiers using logistic regression, ***** random forest *****, SVM, and LSTM based models. | ||
| Paraphrase | 139 | |
| 2020.ngt-1.21 This paper describes the University of Maryland's submission to the Duolingo Shared Task on Simultaneous Translation And ***** Paraphrase ***** for Language Education (STAPLE). | ||
| L16-1289 ***** Paraphrase ***** plagiarism is a significant and widespread problem and research shows that it is hard to detect. | ||
| 2021.emnlp-main.480 ***** Paraphrase ***** generation is a longstanding NLP task that has diverse applications on downstream NLP tasks. | ||
| C16-3003 ***** Paraphrase ***** or textual entailment techniques can contribute to the identification of relations across different scientific textual sources. | ||
| W16-4207 ***** Paraphrase ***** generation is important in various applications such as search, summarization, and question answering due to its ability to generate textual alternatives while keeping the overall meaning intact. | ||
| sentence alignment | 138 | |
| 2020.emnlp-main.207 Most publicly available parallel corpora for Bengali are not large enough; and have rather poor quality, mostly because of incorrect ***** sentence alignment *****s resulting from erroneous sentence segmentation, and also because of a high volume of noise present in them. | ||
| W19-4309 Our approach shows promising performance on ***** sentence alignment ***** recovery and the WMT 2018 parallel corpus filtering tasks with only a single model. | ||
| L16-1666 Due to the scarcity of digital resources, we describe the several problems that arose when compiling this corpus: most of our sources were non-digital books, we faced errors when digitizing the sources and there were difficulties in the ***** sentence alignment ***** process, just to mention some. | ||
| D19-1136 We introduce Vecalign, a novel bilingual ***** sentence alignment ***** method which is linear in time and space with respect to the number of sentences being aligned and which requires only bilingual sentence embeddings. | ||
| W19-3650 Due to this, we need to face the errors when digitizing the sources and difficulties in ***** sentence alignment *****, as well as the fact that does not exist a standard orthography. | ||
| evaluations | 137 | |
| 2021.emnlp-main.703 Previous work has shown that human ***** evaluations ***** in NLP are notoriously under-powered. | ||
| P19-1597 We conduct experiments on 5 categories in a benchmark Chess Commentary dataset and achieve inspiring results in both automatic and human ***** evaluations *****. | ||
| 2020.acl-main.8 Increasing model scale yielded similar improvements in human ***** evaluations ***** that measure preference of model samples to the held out target distribution in terms of realism (31% increased to 37% preference), style matching (37% to 42%), grammar and content quality (29% to 42%), and conversation coherency (32% to 40%). | ||
| P19-1565 Quantitative and human ***** evaluations ***** show our system can produce meaningful and effective conversations, significantly improving over other approaches | ||
| 1999.mtsummit-1.31 The first contribution sets out some recent work on creating standards for the design of ***** evaluations ***** | ||
| compositionality | 137 | |
| 2020.acl-main.539 After that, a program-driven module network is further introduced to exploit the hierarchical structure of the program, where semantic ***** compositionality ***** is dynamically modeled along the program structure with a set of function-specific modules. | ||
| 2021.spnlp-1.3 AM dependency parsing is a method for neural semantic graph parsing that exploits the principle of ***** compositionality *****. | ||
| 2021.iwpt-1.14 This talk will describe work that relies on ***** compositionality ***** in semantic parsing and in reading comprehension requiring numerical reasoning. | ||
| L16-1365 Focusing on compound nouns (CN), we then verify in a longitudinal study if there are differences in the distribution and ***** compositionality ***** of CNs in child-directed and child-produced sentences across ages. | ||
| L12-1283 Our approach is based on the hypothesis that ***** compositionality ***** can be related to distributional similarity | ||
| affective | 137 | |
| 2021.acl-short.50 The method exploits the online use of reaction GIFs, which capture complex ***** affective ***** states. | ||
| P17-2022 Informal first-person narratives are a unique resource for computational models of everyday events and people's ***** affective ***** reactions to them. | ||
| 2020.lrec-1.14 The EPA vectors are mapped to an ***** affective ***** influence value and then integrated into Long Short-term Memory (LSTM) models to highlight ***** affective ***** terms. | ||
| N18-2024 Classification experiments relying on a standard similarity model successfully distinguish between four types of shifts, with verb classes boosting the performance, and ***** affective ***** features for abstractness, emotion and sentiment representing the most salient indicators | ||
| S18-1105 The system takes as starting point emotIDM, an irony detection model that explores the use of ***** affective ***** features based on a wide range of lexical resources available for English, reflecting different facets of affect. | ||
| AI | 137 | |
| 2021.ranlp-srw.10 Sentiment analysis of textual information in user comments is a topical task in emotion ***** AI ***** because user comments or reviews are not homogeneous, they contain sparse context behind, and are misleading both for human and computer. | ||
| C18-1038 Answering questions from university admission exams (Gaokao in Chinese) is a challenging ***** AI ***** task since it requires effective representation to capture complicated semantic relations between questions and answers. | ||
| 2020.nli-1.4 Many users communicate with chatbots and ***** AI ***** assistants in order to help them with various tasks. | ||
| J77-4001 ACL Officers 1978; AJCL Editorial Board (John L. Bennett; Wallace Chafe); ACL Annual Meeting 1978 (David L. Waltz); ACL 78 Session Descriptions; TINLAP-2 Proceedings Supplement Canceled (Dr. Donald E. Walker); ACL Membership List: Individuals 1977; ACL United States Institutional Members 1977; ACL Foreign Institutional Members 1977; Vingt Cinq Annees de Recherches en Synthese de la Parole, Michel Chafcouloff (Andre Malecot); Information * Politics: Proceedings of the ASIS Annual Meeting, Vol. 13, compiled by Susan K. Martin (Gerard Salton); Taxonomy of Computer Science (Anthony Ralston); COLING 78 (A. Zampolli); NCC 78; Improving Data Base Utility and Response: Conference (A. Reiter); Upcoming Conferences; Natural Language and ***** AI ***** at Yale; Pattern Recognition and Artificial Intelligence (T. Pavlidis); 1978 Linguistics Institute; AFIPS Washington Report | ||
| 2020.sdp-1.23 To provide ***** AI ***** researchers with modern tools for dealing with the explosive growth of the research literature in their field, we introduce a new platform, ***** AI ***** Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature | ||
| questions | 137 | |
| D18-1241 We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K ***** questions ***** in total). | ||
| 2020.acl-main.119 At each time step, our model performs multiple rounds of attention, reasoning, and composition that aim to answer two critical ***** questions *****: (1) which part of the input sequence to abstract; and (2) where in the output graph to construct the new concept. | ||
| 2021.splurobonlp-1.3 We automatically extract different trigger and zoomer pairs based on the visual property that the ***** questions ***** rely on (e.g. | ||
| P18-3022 Depending upon the input, we generate both factoid and descriptive type ***** questions *****. | ||
| 2021.acl-short.33 A recent study showed that manual summarization of consumer health ***** questions ***** brings significant improvement in retrieving relevant answers. | ||
| exposure bias | 137 | |
| 2020.emnlp-main.702 Sequence generation models trained with teacher-forcing suffer from issues related to ***** exposure bias ***** and lack of differentiability across timesteps. | ||
| 2020.coling-main.138 However, they usually involve sequential interrelated steps and suffer from the problem of ***** exposure bias *****. | ||
| D19-1619 To learn a decoder, supervised learning which maximizes the likelihood of tokens always suffers from the ***** exposure bias *****. | ||
| D18-1396 Neural machine translation usually adopts autoregressive models and suffers from ***** exposure bias ***** as well as the consequent error propagation problem. | ||
| 2021.eacl-main.233 Next, we perform a coupled scheduled sampling to effectively mitigate the ***** exposure bias ***** when learning both policies jointly with imitation learning. | ||
| Natural Language | 137 | |
| 2020.gebnlp-1.6 Recent research in ***** Natural Language ***** Processing has revealed that word embeddings can encode social biases present in the training data which can affect minorities in real world applications. | ||
| L16-1682 Part-of-speech tagging is a basic step in ***** Natural Language ***** Processing that is often essential. | ||
| L10-1213 Automatic content scoring for free-text responses has started to emerge as an application of ***** Natural Language ***** Processing in its own right, much like question answering or machine translation. | ||
| 2020.semeval-1.170 Sentiment Analysis is a well-studied field of ***** Natural Language ***** Processing. | ||
| W17-3502 Poetry generation is becoming popular among researchers of ***** Natural Language ***** Generation, Computational Creativity and, broadly, Artificial Intelligence. | ||
| error correction | 136 | |
| W17-5016 Furthermore, we investigate augmenting our model with ***** error correction *****s and monitor the impact on performance. | ||
| 2020.acl-main.82 Spelling ***** error correction ***** is an important yet challenging task because a satisfactory solution of it essentially needs human-level language understanding ability. | ||
| 2021.acl-long.385 Moreover, besides the typical fix-length ***** error correction ***** datasets, we also construct a variable-length corpus to conduct experiments. | ||
| L14-1231 We propose to combine two automatic procedures to obtain the ***** error correction *****: i) a similarity measure and ii) a translation algorithm based on aligned parallel corpus. | ||
| 2020.lrec-1.835 The lack of large-scale datasets has been a major hindrance to the development of NLP tasks such as spelling correction and grammatical ***** error correction ***** (GEC). | ||
| word embedding | 136 | |
| S17-2031 The first stage deals with constructing neural ***** word embedding *****s, the components of sentence embeddings. | ||
| W17-1411 We investigate whether ***** word embedding *****s offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. | ||
| W18-6230 This paper describes an approach to solve implicit emotion classification with the use of pre-trained ***** word embedding ***** models to train multiple neural networks. | ||
| C16-1121 We present a successful collaboration of ***** word embedding *****s and co-training to tackle in the most difficult test case of semantic role labeling: predicting out-of-domain and unseen semantic frames. | ||
| 2020.vardial-1.6 However, these approaches require cross-lingual information such as seed dictionaries to train the model and find a linear transformation between the ***** word embedding ***** spaces. | ||
| intents | 135 | |
| 2020.acl-srw.17 This classification introduces new sequence-based social ***** intents ***** that traditional taxonomies of speech acts do not capture. | ||
| N19-1380 One of the first steps in the utterance interpretation pipeline of many task-oriented conversational AI systems is to identify user ***** intents ***** and the corresponding slots. | ||
| 2021.dialdoc-1.15 Phishing emails are longer than dialogue utterances and often contain multiple ***** intents *****. | ||
| 2021.teachingnlp-1.2 Our course ***** intents ***** to serve multiple purposes: (i) familirize students with the core concepts and methods in NLP, such as language modelling or word or sentence representations, (ii) show that recent advances, including pre-trained Transformer-based models, are build upon these concepts; (iii) to introduce architectures for most most demanded real-life applications, (iii) to develop practical skills to process texts in multiple languages. | ||
| 2020.coling-main.429 Inspired by this idea, we have manually labeled 500 response ***** intents ***** using a subset of a sizeable empathetic dialogue dataset (25K dialogues) | ||
| regression | 134 | |
| D18-1304 By including multiview embeddings, we obtain an F1 score of 0.82 in the classification task and a mean absolute error of 3.42 in the ***** regression ***** task. | ||
| 2021.acl-long.515 Using negative flip rate as ***** regression ***** measure, we show that ***** regression ***** has a prevalent presence across tasks in the GLUE benchmark. | ||
| 2021.semeval-1.11 Over these features, a supervised random forest ***** regression ***** algorithm was trained. | ||
| S18-1001 valence (sentiment) ***** regression *****, 4. | ||
| 2020.sigdial-1.19 User impression scores were collected from 104 participants recruited via crowdsourcing and then ***** regression ***** analysis was conducted | ||
| reasoning | 134 | |
| 2000.amta-papers.3 A mixed-initiative system is one which allows more interactivity between the system and user, as the system is ***** reasoning *****. | ||
| L10-1615 Data in our terminological knowledge base (TKB) are primarily hosted in a relational database which is now linked to an ontology in order to apply ***** reasoning ***** techniques and enhance user queries. | ||
| 2020.acl-main.770 This seemingly inevitable trade-off may not tell us much about the changes in the ***** reasoning ***** and understanding capabilities of the resulting models on broader types of examples beyond the small subset represented in the out-of-distribution data. | ||
| 2021.emnlp-main.467 Therefore, optimizing the semantic entailment and contradiction ***** reasoning ***** objective alone is inadequate to capture the high-level semantic structure. | ||
| 2021.acl-short.100 However, existing approaches in this area are limited by considering CKGs as a limited set of facts, thus rendering them unfit for ***** reasoning ***** over new unseen situations and events | ||
| statistical | 134 | |
| L12-1088 It is shown experimentally to be well suited for training ***** statistical ***** dependency parsers by comparing the performance of two parsers from different parsing paradigms on the data set of the CoNLL 2009 Shared Task data and our corpus. | ||
| L14-1180 We present the results and evaluate them with respect to accuracy and completeness through ***** statistical ***** comparisons between retrieved and manually constructed reference annotations. | ||
| 2020.acl-main.511 Furthermore, novel insight into argument quality is provided through ***** statistical ***** analysis, and a new aggregation method to infer overall quality from individual quality dimensions is proposed. | ||
| D19-1453 The state of the art in machine translation (MT) is governed by neural approaches, which typically provide superior translation accuracy over ***** statistical ***** approaches. | ||
| 2020.acl-main.146 Word alignment was once a core unsupervised learning task in natural language processing because of its essential role in training ***** statistical ***** machine translation (MT) models | ||
| event extraction | 134 | |
| L14-1091 Here, we extend the distant supervision approach to template-based ***** event extraction *****, focusing on the extraction of passenger counts, aircraft types, and other facts concerning airplane crash events. | ||
| 2021.ranlp-1.40 To the best of our knowledge we present the first ***** event extraction ***** approach that combines an expert-based syntactic parser with a transformer-based classifier for Dutch. | ||
| 2020.emnlp-main.431 Importantly, we also provide first results on biomedical ***** event extraction ***** without gold entity information. | ||
| 2020.lrec-1.362 The existing lexicons blur senses and frames of predicates, which needs to be refined to meet the tasks like word sense disambiguation and ***** event extraction *****. | ||
| W18-1507 We present a novel approach for ***** event extraction ***** and abstraction from movie descriptions. | ||
| knowledge graph | 134 | |
| 2021.emnlp-main.712 Such relation embeddings are appealing because they can, in principle, encode relational knowledge in a more fine-grained way than is possible with ***** knowledge graph *****s. | ||
| 2020.coling-main.369 Furthermore, when working on a specific domain, ***** knowledge graph *****s in its entirety contribute towards extraneous information and noise. | ||
| N18-5004 We present CL Scholar, the ACL Anthology ***** knowledge graph ***** miner to facilitate high-quality search and exploration of current research progress in the computational linguistics community. | ||
| 2020.emnlp-main.99 It performs multi-hop, multi-relational reasoning over subgraphs extracted from external ***** knowledge graph *****s. | ||
| 2020.emnlp-main.595 It has been shown that ***** knowledge graph ***** embeddings encode potentially harmful social biases, such as the information that women are more likely to be nurses, and men more likely to be bankers. | ||
| scientific | 134 | |
| L14-1283 The creation of large-scale multimedia datasets has become a ***** scientific ***** matter in itself. | ||
| L14-1662 Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for ***** scientific ***** research. | ||
| D19-1236 The review and selection process for ***** scientific ***** paper publication is essential for the quality of scholarly publications in a ***** scientific ***** field. | ||
| L10-1030 This paper describes the development of a new Swedish ***** scientific ***** medical corpus. | ||
| L12-1467 The platform will facilitate new linguistic findings by making it possible to manage and analyse primary data and annotations in the petabyte range, while at the same time allowing an undistorted view of the primary linguistic data, and thus fully satisfying the demands of a ***** scientific ***** tool. | ||
| Relation | 133 | |
| K19-1056 ***** Relation ***** extraction is the task of determining the relation between two entities in a sentence. | ||
| C18-1100 ***** Relation ***** classification is an important task in natural language processing fields. | ||
| L08-1305 ***** Relation ***** extraction is the task of finding pre-defined semantic relations between two entities or entity mentions from text. | ||
| P19-1525 ***** Relation ***** Extraction is the task of identifying entity mention spans in raw text and then identifying relations between pairs of the entity mentions. | ||
| N18-2059 ***** Relation ***** classification is an important semantic processing task in the field of natural language processing. | ||
| relations | 133 | |
| W19-4504 Contrary to previous works, we focus on comparing sub-structures and not only ***** relations ***** matches. | ||
| S17-1012 The task is challenging, with major performance differences between ***** relations *****. | ||
| P17-1088 Moreover, the learned associations between ***** relations ***** and concepts, which are represented by sparse attention vectors, can be interpreted easily. | ||
| W18-6001 For both evaluation datasets, the performance of parsers increases, in terms of the standard LAS and UAS measures and of a more focused measure taking into account only ***** relations ***** involved in error patterns, and at the level of individual dependencies. | ||
| 2021.emnlp-main.92 The analysis shows that our few-shot systems are specially effective when discriminating between ***** relations *****, and that the performance difference in low data regimes comes mainly from identifying no-relation cases | ||
| downstream | 132 | |
| D19-5537 We evaluate our method under synthetic noise and natural noise and show that the proposed algorithm can use context information to correct noise text and improve the performance of noisy inputs in several ***** downstream ***** tasks. | ||
| 2020.acl-main.484 Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by ***** downstream ***** models. | ||
| 2021.emnlp-main.45 Compared to the prevalent text infilling objectives for Seq2Seq pre-training, SSR is naturally more consistent with many ***** downstream ***** generation tasks that require sentence rewriting (e.g., text summarization, question generation, grammatical error correction, and paraphrase generation). | ||
| W18-3217 Named Entity Recognition plays a major role in several ***** downstream ***** applications in NLP. | ||
| 2018.iwslt-1.2 Mining parallel sentences from comparable corpora is of great interest for many ***** downstream ***** tasks | ||
| meaning representation | 132 | |
| L12-1164 The resource thus derived provides a ***** meaning representation ***** that complements the relational representation captured in the concept network. | ||
| W18-6554 One of the biggest challenges of end-to-end language generation from ***** meaning representation *****s in dialogue systems is making the outputs more natural and varied. | ||
| W19-3317 From it, we define a ***** meaning representation ***** label set by adapting the English schema and taking into account the specific characteristics of Vietnamese. | ||
| 2020.lrec-1.234 We discuss methodological choices in contrastive and diagnostic evaluation in ***** meaning representation ***** parsing, i.e. | ||
| 2020.conll-shared.3 Prague Tectogrammatical Graphs (PTG) is a ***** meaning representation ***** framework that originates in the tectogrammatical layer of the Prague Dependency Treebank (PDT) and is theoretically founded in Functional Generative Description of language (FGD). | ||
| medical | 132 | |
| W19-5004 In this work, we build a unifying framework for RE, applying this on three highly used datasets (from the general, bio***** medical ***** and clinical domains) with the ability to be extendable to new datasets. | ||
| P17-1028 We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from bio***** medical ***** abstracts. | ||
| D19-6203 The experimental results suggest that dependency-based pooling is the best pooling strategy for RE in the bio***** medical ***** domain, yielding the state-of-the-art performance on two benchmark datasets for this problem. | ||
| 2020.clinicalnlp-1.15 We pre-trained several models of common architectures on this dataset and empirically showed that such pre-training leads to improved performance and convergence speed when fine-tuning on downstream ***** medical ***** tasks. | ||
| L12-1032 Natural language generation in the ***** medical ***** domain is heavily influenced by domain knowledge and genre-specific text characteristics. | ||
| annotation projection | 132 | |
| R17-1054 The naive approach to ***** annotation projection ***** is not effective to project discourse annotations from one language to another because implicit relations are often changed to explicit ones and vice-versa in the translation. | ||
| D19-1056 Unlike ***** annotation projection ***** techniques, our model does not need parallel data during inference time. | ||
| W19-1425 We consider four techniques to perform Ukrainian POS tagging: zero-shot tagging and cross-lingual ***** annotation projection ***** (for the zero-resource scenario), and compare these with self-training and multilingual learning (for the low-resource scenario). | ||
| 2020.emnlp-main.391 Our approach innovates in three ways: 1) a robust approach of selecting training instances via cross-lingual ***** annotation projection ***** that exploits best practices of unsupervised type and token constraints, word-alignment confidence and density of projected POS, 2) a Bi-LSTM architecture that uses contextualized word embeddings, affix embeddings and hierarchical Brown clusters, and 3) an evaluation on 12 diverse languages in terms of language family and morphological typology. | ||
| C18-1071 Moreover, we find that ***** annotation projection ***** works equally well when using either costly human or cheap machine translations. | ||
| achieves | 131 | |
| D19-1241 The experimental results show that our method ***** achieves ***** single model state-of-the-art performance on Math23K, which is the largest dataset on this task. | ||
| P19-1421 Our experiments on the four datasets from Coursera and XuetangX show that the proposed method ***** achieves ***** significant improvements(+0.19 by MAP) over existing methods. | ||
| D18-1043 Extensive experiments on word translation of European and Non-European languages show that our method ***** achieves ***** better performance than recent state-of-the-art deep adversarial approaches and is competitive with the supervised baseline. | ||
| 2021.vardial-1.10 The XGBoost ensemble resulted from combining the power of the aforementioned methods ***** achieves ***** a median distance of 23.6 km on the test data, which places us on the third place in the ranking, at a difference of 6.05 km and 2.9 km from the submissions on the first and second places, respectively. | ||
| 2020.iwdp-1.5 Extensive experiments on this benchmark show that our proposed method ***** achieves ***** a competitive performance on a document-level real-world scenario for CWS | ||
| keyphrases | 131 | |
| S17-2166 We explored semantic similarities and patterns of ***** keyphrases ***** in scientific publications using pre-trained word embedding models. | ||
| 2021.emnlp-main.146 This paper proposes the AttentionRank, a hybrid attention model, to identify ***** keyphrases ***** from a document in an unsupervised manner. | ||
| W18-2304 We propose ***** keyphrases ***** extraction technique to extract important terms from the healthcare user-generated contents. | ||
| K18-1022 With EmbedRank, we also explicitly increase coverage and diversity among the selected ***** keyphrases ***** by introducing an embedding-based maximal marginal relevance (MMR) for new phrases. | ||
| P19-1240 While most existing methods extract words from source posts to form ***** keyphrases *****, we propose a sequence-to-sequence (seq2seq) based neural keyphrase generation framework, enabling absent ***** keyphrases ***** to be created | ||
| MWE | 130 | |
| W19-5117 We propose a deep encoder-decoder architecture generating for every ***** MWE ***** word its corresponding part in the lemma, based on the internal context of the ***** MWE *****. | ||
| L16-1263 The recognition of multiword expressions (***** MWE *****s) in a sentence is important for such linguistic analyses as syntactic and semantic parsing, because it is known that combining an ***** MWE ***** into a single token improves accuracy for various NLP tasks, such as dependency parsing and constituency parsing. | ||
| W19-5119 Our neural MTL architecture utilises the supervision of dependency parsing in lower layers and predicts ***** MWE ***** tags in upper layers. | ||
| L12-1613 For each ***** MWE ***** its basic morphological form and the base forms of its constituents are specified but also each ***** MWE ***** is assigned to a class on the basis of its syntactic structure | ||
| L14-1433 Multiword expressions (MWEs) are quite frequent in languages such as English, but their diversity, the scarcity of individual ***** MWE ***** types, and contextual ambiguity have presented obstacles to corpus-based studies and NLP systems addressing them as a class. | ||
| sign language | 130 | |
| 2020.signlang-1.15 In this paper we survey the state of the art for the anonymisation of ***** sign language ***** corpora. | ||
| 2020.signlang-1.3 The utterance unit is an original concept for segmenting and annotating ***** sign language ***** dialogue referring to signer's native sense from the perspectives of Conversation Analysis (CA) and Interaction Studies. | ||
| L14-1253 The purpose of this project was to increase awareness of ***** sign language ***** as a distinctive language in Japan. | ||
| 2020.signlang-1.4 In this paper, we present a novel approach for measuring lexical similarity across any two ***** sign language *****s using the Global Signbank platform, a lexical database of uniformly coded signs. | ||
| 2020.signlang-1.35 However, parallel corpora consisting of ***** sign language ***** interpreting are rarely explored. | ||
| machine reading | 130 | |
| 2020.coling-main.235 The novel framework shows an interesting perspective on ***** machine reading ***** comprehension and cognitive science. | ||
| 2020.emnlp-main.549 Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging ***** machine reading ***** comprehension task, since it requires both natural language understanding and arithmetic computation. | ||
| 2020.coling-main.219 Question answering over dialogue, a specialized ***** machine reading ***** comprehension task, aims to comprehend a dialogue and to answer specific questions. | ||
| 2020.coling-main.248 Neural models have achieved great success on the task of ***** machine reading ***** comprehension (MRC), which are typically trained on hard labels. | ||
| 2020.findings-emnlp.226 Answer validation in ***** machine reading ***** comprehension (MRC) consists of verifying an extracted answer against an input context and question pair. | ||
| heuristic | 129 | |
| 2021.nodalida-main.12 First, the real estate reports and their associated summaries are automatically labelled using a set of ***** heuristic ***** rules gathered from human experts and aggregated using weak supervision. | ||
| 2020.inlg-1.9 The output of the model is filtered by a simple ***** heuristic ***** and reranked with an off-the-shelf pre-trained language model. | ||
| K17-1039 Instead, we propose a differentiable relaxation that lends itself to gradient-based optimisation, thus bypassing the need for reinforcement learning or ***** heuristic ***** modification of cross-entropy. | ||
| C18-1098 In this paper, we propose a multilevel ***** heuristic ***** approach to regulate rationale extraction to avoid extracting monotonous rationales without compromising classification performance. | ||
| L16-1559 We use time-based alignment with lexical re-synchronisation techniques and BLEU score filters and sort alternative translations into categories using edit distance metrics and ***** heuristic ***** rules | ||
| seq2seq | 129 | |
| 2021.sustainlp-1.6 Copy mechanisms explicitly obtain unchanged tokens from the source (input) sequence to generate the target (output) sequence under the neural ***** seq2seq ***** framework. | ||
| K19-1084 We introduce a class of ***** seq2seq ***** models, GAMs (Global Autoregressive Models), which combine an autoregressive component with a log-linear component, allowing the use of global a priori features to compensate for lack of data. | ||
| 2021.emnlp-main.418 However, recent research has shown that inappropriate language in training samples and well-designed testing cases can induce ***** seq2seq ***** models to output profanity. | ||
| 2020.coling-main.363 Specifically, we study the sequence-to-sequence (Seq2Seq) model in the contexts of two mainstream NLP tasks–machine translation and dialogue response generation–as they both use the ***** seq2seq ***** model. | ||
| 2020.acl-main.39 We empirically support our claim for recurrent ***** seq2seq ***** models with our proposed attention on variants of the Lookup Table task | ||
| monolingual corpora | 129 | |
| 2020.acl-main.318 Unsupervised bilingual lexicon induction is the task of inducing word translations from ***** monolingual corpora ***** of two languages. | ||
| 2020.acl-srw.37 We utilize script mapping (Chinese to Japanese) to increase the similarity (number of cognates) between the ***** monolingual corpora ***** of helping languages and LOI. | ||
| 2021.dravidianlangtech-1.8 Code-mixed texts are abundant, especially in social media, and pose a problem for NLP tools as they are typically trained on ***** monolingual corpora *****. | ||
| 2020.acl-main.152 We present a novel method to extract parallel sentences from two ***** monolingual corpora *****, using neural machine translation | ||
| 2021.acl-long.507 We show that margin-based bitext mining in a multilingual sentence space can be successfully scaled to operate on ***** monolingual corpora ***** of billions of sentences. | ||
| implicit | 129 | |
| W17-0809 Then, TDB 1.1, i.e. enrichments on 10% of the corpus are described (namely, senses for explicit discourse connectives, and new annotations for three discourse relation types - ***** implicit ***** relations, entity relations and alternative lexicalizations). | ||
| N19-1139 In this work, we show that the knowledge ***** implicit ***** in the optimization procedure can be distilled into another more efficient neural network. | ||
| 2020.sigdial-1.24 A particularly interesting phenomenon we observe is that the model picks up ***** implicit ***** meanings by splitting different aspects of the semantics of a single word across multiple attention heads. | ||
| 2021.emnlp-main.143 Rephrase detection is used to identify the rephrases and has long been treated as a task with pairwise input, which does not fully utilize the contextual information (e.g. users' ***** implicit ***** feedback). | ||
| 2021.acl-long.29 Experiments demonstrate the feasibility of the new task and its effectiveness in extracting and describing ***** implicit ***** aspects and ***** implicit ***** opinions | ||
| Aspect | 128 | |
| C16-1219 *****Aspect***** extraction identifies relevant features from a textual description of an entity, e.g., a phone, and is typically targeted to product descriptions, reviews, and other short texts as an enabling task for, e.g., opinion mining and information retrieval. | ||
| P17-1036 *****Aspect***** extraction is an important and challenging task in aspect-based sentiment analysis. | ||
| 2020.aacl-srw.18 *****Aspect***** extraction is a widely researched field of natural language processing in which aspects are identified from the text as a means for information. | ||
| D19-1465 *****Aspect***** words, indicating opinion targets, are essential in expressing and understanding human opinions. | ||
| 2021.ecnlp-1.17 *****Aspect***** extraction is not a well-explored topic in Hindi, with only one corpus having been developed for the task. | ||
| entity linking | 128 | |
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single ***** entity linking ***** model for multiple languages, improving upon individually trained models for each language. | ||
| Q15-1023 We attack this confusion by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for ***** entity linking *****. | ||
| C16-1218 Previous studies have highlighted the necessity for ***** entity linking ***** systems to capture the local entity-mention similarities and the global topical coherence. | ||
| Q14-1037 We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), named entity recognition (coarse semantic typing), and ***** entity linking ***** (matching to Wikipedia entities). | ||
| W18-5520 A simple ***** entity linking ***** approach with text match is used as the document selection component, this component identifies relevant documents for a given claim by using mentioned entities as clues. | ||
| language learning | 128 | |
| P19-1035 This is a crucial step towards generating learner-adaptive exercises for self-directed ***** language learning ***** and preparing language assessment tests. | ||
| W19-1808 Recent work on visually grounded ***** language learning ***** has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation. | ||
| C16-1124 We present a model of visually-grounded ***** language learning ***** based on stacked gated recurrent neural networks which learns to predict visual features given an image description in the form of a sequence of phonemes. | ||
| L12-1465 In this paper we describe the research that was carried out and the resources that were developed within the DISCO (Development and Integration of Speech technology into COurseware for ***** language learning *****) project. | ||
| D18-1274 Grammatical error correction (GEC) systems deployed in *****language learning***** environments are expected to accurately correct errors in learners' writing. | ||
| sequence labeling | 128 | |
| D19-1422 For neural ***** sequence labeling *****, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. | ||
| 2021.acl-long.16 However, existing early-exit mechanisms are specifically designed for sequence-level tasks, rather than ***** sequence labeling *****. | ||
| 2021.acl-short.108 Specifically, we insert small bottleneck layers (i.e., adapter) within each layer of a pretrained model, then fix the pretrained layers and train the adapter layers on the downstream task data, with (1) task-specific unsupervised pretraining and then (2) task-specific supervised training (e.g., classification, ***** sequence labeling *****). | ||
| 2021.semeval-1.43 In our systems, two different frameworks are designed to solve text classification and ***** sequence labeling *****. | ||
| D19-1429 Experiments on three ***** sequence labeling ***** tasks show that our fine-grained knowledge fusion model outperforms strong baselines and other state-of-the-art ***** sequence labeling ***** domain adaptation methods. | ||
| hypotheses | 127 | |
| L08-1110 We describe a set of experiments to explore statistical techniques for ranking and selecting the best translations in a graph of translation ***** hypotheses *****. | ||
| 2021.emnlp-main.298 Many open-domain question answering problems can be cast as a textual entailment task, where a question and candidate answers are concatenated to form ***** hypotheses *****. | ||
| P18-2054 However, as the algorithm produces ***** hypotheses ***** in a monotonic left-to-right order, a hypothesis can not be revisited once it is discarded. | ||
| P19-1073 Third, ***** hypotheses ***** about the cause of errors should be explicitly tested; Errudite supports this via automated counterfactual rewriting. | ||
| P18-4005 For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and ***** hypotheses ***** as questions | ||
| RL | 127 | |
| P19-1208 Extensive experiments on five real-world datasets of different scales demonstrate that our ***** RL ***** approach consistently and significantly improves the performance of the state-of-the-art generative models with both conventional and new evaluation methods. | ||
| 2021.emnlp-main.83 We present an extensive investigation demonstrating that the use of ***** RL ***** via SCST benefits graph and text generation on WebNLG+ 2020 and TekGen datasets. | ||
| D19-1014 This paper proposes a novel framework that alternatively trains a ***** RL ***** policy for image guessing and a supervised seq2seq model to improve dialog generation quality. | ||
| P19-1434 We validate our model on a synthetic dataset (bAbI) as well as real-world large-scale textual QA (TriviaQA) and video QA (TVQA) datasets, on which it achieves significant improvements over rule based memory scheduling policies or an ***** RL ***** based baseline that independently learns the query-specific importance of each memory. | ||
| 2021.emnlp-main.540 Extensive experiments on iSQuAD suggest that graph representations can result in significant performance improvements for ***** RL ***** agents | ||
| lemmas | 127 | |
| 2020.lt4hala-1.19 Word forms plus the outputted POS labels are used to feed a seq2seq algorithm implemented in Keras to predict ***** lemmas *****. | ||
| 2020.acl-main.736 The morphological features can be lexicalized, like ***** lemmas ***** and diacritized forms, or non-lexicalized, like gender, number, and part-of-speech tags, among others. | ||
| L12-1619 This dataset contains parser-generated dependency structures (with POS tags and ***** lemmas *****) for all FrameNet 1.5 sentences, with nodes automatically associated with FrameNet annotations. | ||
| 2016.gwc-1.60 We test cross-domain performance by gathering ***** lemmas ***** and synsets from three corpora: website privacy policies, Wikipedia articles, and Wikibooks textbooks. | ||
| 2018.gwc-1.46 Transliteration is used to bridge alphabet differences and match ***** lemmas ***** in the closest phonological way | ||
| pairwise | 127 | |
| 2020.sdp-1.8 One remedy is a blocking function that reduces the number of ***** pairwise ***** similarity calculations. | ||
| 2021.emnlp-main.186 Dominant sentence ordering models can be classified into ***** pairwise ***** ordering models and set-to-sequence models. | ||
| P18-1020 We contrast direct assessment (annotators assign scores to items directly), online ***** pairwise ***** ranking aggregation (scores derive from annotator comparison of items), and a hybrid approach (EASL: | ||
| D18-1135 The first approach is based on interpreting the ***** pairwise ***** string kernel similarities between samples in the training set and samples in the test set as features. | ||
| P19-1242 We build a dataset of 12,594 hashtags split into individual segments and propose a set of approaches for hashtag segmentation by framing it as a ***** pairwise ***** ranking problem between candidate segmentations | ||
| spatial | 127 | |
| 2021.splurobonlp-1.4 In this work, we discuss our attempt at modeling ***** spatial ***** senses of prepositions in English using a combination of rule-based and statistical learning approaches. | ||
| 2020.emnlp-main.314 To mitigate this, we propose Form2Seq, a novel sequence-to-sequence (Seq2Seq) inspired framework for structure extraction using text, with a specific focus on forms, which leverages relative ***** spatial ***** arrangement of structures. | ||
| 2021.reinact-1.8 We applied this system to a particular task of characterizing ***** spatial ***** configurations of blocks in a simple physical Blocks World (BW) domain using natural locative expressions, as well as generating justifications for the proposed ***** spatial ***** descriptions by indicating the factors that the system used to arrive at a particular conclusion. | ||
| 2020.emnlp-tutorials.5 We discuss the recent results on the above-mentioned applications –that need ***** spatial ***** language learning and reasoning – and highlight the research gaps and future directions. | ||
| W18-1401 The challenge for computational models of ***** spatial ***** descriptions for situated dialogue systems is the integration of information from different modalities. | ||
| grammar | 127 | |
| L08-1321 In this paper we will mainly discuss the most important parts, ***** grammar ***** management and validation systems, which are directly related to a CCG lexicon construction. | ||
| 2021.emnlp-main.157 We combine a finite state implementation of a published ***** grammar ***** with a partial lexicon, and apply this to a noisy phone representation of the signal. | ||
| W18-4907 Our second experiment also suggests that the same methodology might be used for extracting more schematic or abstract constructions, thereby providing evidence for the statistical foundation of construction ***** grammar *****. | ||
| L08-1489 The source speech recognition ***** grammar ***** is used to generate phrases, which are translated by a common translation service. | ||
| 1963.earlymt-1.33 A ***** grammar ***** of this system is basically independent of any ***** grammar ***** of Russian | ||
| abusive language detection | 127 | |
| 2020.acl-main.380 For example, texts containing some demographic identity-terms (e.g., “gay”, “black”) are more likely to be abusive in existing ***** abusive language detection ***** datasets. | ||
| W19-3508 We propose an experimental study that has three aims: 1) to provide us with a deeper understanding of current data sets that focus on different types of abusive language, which are sometimes overlapping (racism, sexism, hate speech, offensive language, and personal attacks); 2) to investigate what type of attention mechanism (contextual vs. self-attention) is better for ***** abusive language detection ***** using deep learning architectures; and 3) to investigate whether stacked architectures provide an advantage over simple architectures for this task. | ||
| 2021.socialnlp-1.10 In this paper, we investigate the effectiveness of several Unsupervised Domain Adaptation (UDA) approaches for the task of cross-corpora ***** abusive language detection *****. | ||
| W18-5112 While analysis of online explicit ***** abusive language detection ***** has lately seen an ever-increasing focus, implicit abuse detection remains a largely unexplored space. | ||
| 2021.woah-1.9 Current ***** abusive language detection ***** systems have demonstrated unintended bias towards sensitive features such as nationality or gender. | ||
| unsupervised domain adaptation | 127 | |
| 2020.coling-main.603 Motivated by the latest advances, in this survey we review neural ***** unsupervised domain adaptation ***** techniques which do not require labeled target domain data. | ||
| 2021.adaptnlp-1.2 In ***** unsupervised domain adaptation *****, we aim to train a model that works well on a target domain when provided with labeled source samples and unlabeled target samples. | ||
| P19-1591 Pivot Based Language Modeling (PBLM) (Ziser and Reichart, 2018a), combining LSTMs with pivot-based methods, has yielded significant progress in ***** unsupervised domain adaptation *****. | ||
| 2020.emnlp-main.497 On six ***** unsupervised domain adaptation ***** tasks involving named entity recognition, our method strongly outperforms the random masking strategy and achieves up to +1.64 F1 score improvements. | ||
| 2020.acl-main.370 In this paper, we investigate how to efficiently apply the pre-training language model BERT on the ***** unsupervised domain adaptation *****. | ||
| context word | 127 | |
| 2020.coling-main.608 A limitation of CBOW is that it equally weights the *****context words***** when making a prediction, which is inefficient, since some words have higher predictive value than others. | ||
| W19-1302 In this paper, we propose a soft label approach to target-level sentiment classification task, in which a history-based soft labeling model is proposed to measure the possibility of a *****context word***** as an opinion word. | ||
| D18-1427 The model copies the *****context words***** that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer. | ||
| D18-1174 Disambiguated skip-gram jointly estimates a skip-gram-like *****context word***** prediction model and a word sense disambiguation model. | ||
| D17-1190 Specifically, our approach first extracts a sequence of *****context words***** that indicates the temporal relation between two events, which well align with the dependency path between two event mentions. | ||
| gradient | 126 | |
| I17-1034 It extracts structural information on text, and uses Long Short-Term Memory (LSTM) cell to prevent ***** gradient ***** vanish. | ||
| 2020.findings-emnlp.98 For unlabeled data, we leverage a self-critical policy ***** gradient ***** method with the difference between the scores obtained by Monte-Carlo sampling and greedy decoding as the reward function, while the scores are the negative K-L divergence between output distributions of original video data and augmented video data. | ||
| 2021.adaptnlp-1.8 Our algorithm, TreeMAML, adapts the model to each task with a few ***** gradient ***** steps, but the adaptation follows the hierarchical tree structure: in each step, ***** gradient *****s are pooled across tasks clusters and subsequent steps follow down the tree. | ||
| P18-1063 We use a novel sentence-level policy ***** gradient ***** method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. | ||
| 2020.ngt-1.4 We also propose to use an error-feedback mechanism during retraining, to preserve the compressed model as a stale ***** gradient ***** | ||
| KG | 126 | |
| D19-1265 Experiments on a real-world ***** KG ***** updating dataset show that our model can effectively broadcast the news information to the ***** KG ***** structures and perform necessary link-adding or link-deleting operations to ensure the ***** KG ***** up-to-date according to news snippets. | ||
| S18-2027 In this paper, we propose a multimodal translation-based approach that defines the energy of a ***** KG ***** triple as the sum of sub-energy functions that leverage both multimodal (visual and linguistic) and structural ***** KG ***** representations. | ||
| P19-1026 Despite of their successful performances, existing bilinear forms overlook the modeling of relation compositions, resulting in lacks of interpretability for reasoning on ***** KG *****. | ||
| 2021.acl-long.82 We develop a deep convolutional network that utilizes textual entity representations and demonstrate that our model outperforms recent ***** KG ***** completion methods in this challenging setting. | ||
| 2020.findings-emnlp.105 To explore the type information for any ***** KG *****, we develop a novel ***** KG *****E framework with Automated Entity TypE Representation (AutoETER), which learns the latent type embedding of each entity by regarding each relation as a translation operation between the types of two entities with a relation-aware projection mechanism | ||
| lemmatization | 126 | |
| K18-2017 It performs sentence splitting, tokenization, compound word expansion, ***** lemmatization *****, tagging and parsing. | ||
| W19-4009 The interoperability between lemmatized corpora of Latin and other resources that use the lemma as indexing key is hampered by the multiple ***** lemmatization ***** strategies that different projects adopt. | ||
| L16-1680 UDPipe, a pipeline processing CoNLL-U-formatted files, performs tokenization, morphological analysis, part-of-speech tagging, ***** lemmatization ***** and dependency parsing for nearly all treebanks of Universal Dependencies 1.2 (namely, the whole pipeline is currently available for 32 out of 37 treebanks). | ||
| W19-4211 Our core approach focuses on the morphological tagging task; part-of-speech tagging and ***** lemmatization ***** are treated as secondary tasks | ||
| 2021.iwpt-1.19 This year the official evaluation metrics was ELAS, therefore dependency parsing might have been avoided as well as other pipeline stages like POS tagging and *****lemmatization*****. | ||
| salient | 126 | |
| L08-1301 In this paper, we introduce two techniques for extracting informative expressions from documents: the extraction of related words that are not only taxonomically related but also thematically related, and the acquisition of ***** salient ***** terms and phrases. | ||
| 2020.sltu-1.19 Computer vision module is for detecting ***** salient ***** objects or extracting features of images and Natural Language Processing (NLP) module is for generating correct syntactic and semantic image captions. | ||
| D19-5402 As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the strategy of extracting ***** salient ***** sentences from a document first and then paraphrasing the selected ones to generate a summary. | ||
| 2020.coling-main.279 In this paper, we propose an Interactive key-value Memory- augmented Attention model for image Paragraph captioning (IMAP) to keep track of the attention history (***** salient ***** objects coverage information) along with the update-chain of the decoder state and therefore avoid generating repetitive or incomplete image descriptions. | ||
| P19-1205 This mechanism models a Gaussian focal bias on attention scores to enhance the perception of local context, which contributes to producing ***** salient ***** and informative summaries | ||
| connectives | 126 | |
| 2021.codi-main.8 The lexicon shows that the majority of Nigerian Pidgin ***** connectives ***** are borrowed from its English lexifier, but that there are also some ***** connectives ***** that are unique to Nigerian Pidgin. | ||
| P19-1411 In this work, we explore this property in a multi-task learning framework for IDRR in which the relations and the ***** connectives ***** are simultaneously predicted, and the mapping is leveraged to transfer knowledge between the two prediction tasks via the embeddings of relations and ***** connectives *****. | ||
| W17-5501 In this paper, we present an approach to exploit phrase tables generated by statistical machine translation in order to map French discourse ***** connectives ***** to discourse relations. | ||
| W18-4906 The paper defines primary and secondary ***** connectives *****, and explains why it is possible to build a lexicon for the compositional ones and how it could be organized. | ||
| I17-1049 In this paper, we address this problem by procuring additional training data from parallel corpora: When humans translate a text, they sometimes add ***** connectives ***** (a process known as explicitation) | ||
| induction | 126 | |
| P19-1228 Experiments on English and Chinese show the effectiveness of our approach compared to recent state-of-the-art methods for grammar ***** induction ***** from words with neural language models. | ||
| 2020.lrec-1.431 SDEC-AD outperforms the state-of-the-art methods in both steps of the frame ***** induction ***** process. | ||
| W18-5452 We find that this model represents the first empirical success for neural network latent tree learning, and that neural language modeling warrants further study as a setting for grammar ***** induction ***** | ||
| D18-1160 In experiments we instantiate our approach with both Markov and tree-structured priors, evaluating on two tasks: part-of-speech (POS) ***** induction *****, and unsupervised dependency parsing without gold POS annotation. | ||
| L16-1524 Based on the assumption, we propose a constraint-based bilingual lexicon ***** induction ***** for closely related languages by extending constraints and translation pair candidates from recent pivot language approach. | ||
| terminology | 126 | |
| W16-4702 The present paper explores a novel method that integrates efficient distributed representations with ***** terminology ***** extraction. | ||
| L14-1743 A cloud-based, user-oriented, collaborative, portable, interoperable, and multilingual platform offers such ***** terminology ***** services as ***** terminology ***** project creation and sharing, data collection for translation lookup, user document upload and management, ***** terminology ***** extraction customisation and execution, raw terminological data management, validated terminological data export and reuse, and other ***** terminology ***** services. | ||
| L14-1130 We hope that presented guidelines and approach in evaluation will be useful to ***** terminology ***** institutions, regulative authorities and researchers in different countries that are involved in the national ***** terminology ***** work. | ||
| L06-1208 In addition to TermDB, a database used for ***** terminology ***** management and storage, we present the following modules that are used to populate the database: TerMine (recognition, extraction and normalisation of terms from literature), AcroTerMine (extraction and clustering of acronyms and their long forms), AnnoTerm (annotation and classification of terms), and ClusTerm (extraction of term associations and clustering of terms) | ||
| 2005.mtsummit-wpt.9 The workflow consists of the stage for setting lexical goals and the semi- automatic ***** terminology ***** construction stage. | ||
| spoken | 126 | |
| L14-1584 While methods for collecting data on ***** spoken ***** or written communication, backed up by computational techniques, are evolving, the actual data being collected remain largely the same. | ||
| W16-4806 Statistical systems for all language pairs and translation directions are trained using parallel texts from different domains, however mainly on ***** spoken ***** language i.e. subtitles. | ||
| P19-2043 We focus on ***** spoken ***** language domains, namely colloquial and speech languages. | ||
| P17-1057 We present a visually grounded model of speech perception which projects ***** spoken ***** utterances and images to a joint semantic space. | ||
| W17-4610 Practical applications on ***** spoken ***** language, however, would rely on automatically predicted prosodic information | ||
| Terminology | 125 | |
| L08-1283 In this paper we describe the methodology and the first steps for the creation of WNTERM (from WordNet and ***** Terminology *****), a specialized lexicon produced from the merger of the EuroWordNet-based Multilingual Central Repository (MCR) and the Basic Encyclopaedic Dictionary of Science and Technology (BDST). | ||
| 2021.wmt-1.82 This paper describes Charles University submission for *****Terminology***** translation Shared Task at WMT21. | ||
| 2021.wmt-1.42 This paper describes Charles University submission for *****Terminology***** translation shared task at WMT21. | ||
| R19-1052 *****Terminology***** translation plays a critical role in domain-specific machine translation (MT). | ||
| L08-1155 *****Terminology***** extraction commonly includes two steps: identification of term-like units in the texts, mostly multi-word phrases, and the ranking of the extracted term-like units according to their domain representativity. | ||
| question generation | 125 | |
| P19-1415 We also present a way to construct training data for our ***** question generation ***** models by leveraging the existing reading comprehension dataset. | ||
| W18-6536 In this work we present a new Attentional Encoder–Decoder Recurrent Neural Network model for automatic ***** question generation *****. | ||
| D19-5809 These results established that our research direction may be promising, but at the same time revealed that the identification of question patterns is a challenging issue, and it has to be largely refined to achieve a better quality in the end-to-end automatic ***** question generation *****. | ||
| P17-1123 We study automatic ***** question generation ***** for sentences from text passages in reading comprehension. | ||
| 2020.emnlp-main.729 We present a novel task of ***** question generation ***** given a query path in the knowledge graph constructed from the input text. | ||
| unsupervised machine translation | 125 | |
| W19-2307 Latent space based GAN methods and attention based sequence to sequence models have achieved impressive results in text generation and ***** unsupervised machine translation ***** respectively. | ||
| 2020.acl-main.658 Finally, we provide a unified outlook for different types of research in this area (i.e., cross-lingual word embeddings, deep multilingual pretraining, and ***** unsupervised machine translation *****) and argue for comparable evaluation of these models. | ||
| P19-1019 Together, we obtain large improvements over the previous state-of-the-art in ***** unsupervised machine translation *****. | ||
| 2020.acl-srw.34 We first produce a synthetic parallel corpus using ***** unsupervised machine translation *****, and use it to fine-tune a pretrained cross-lingual masked language model (XLM) to derive the multilingual sentence representations. | ||
| 2021.naacl-main.420 Inspired by ***** unsupervised machine translation *****, we investigate if a strong V&L representation model can be learned through unsupervised pre-training without image-caption corpora. | ||
| phrase table | 125 | |
| 2014.amta-workshop.3 The experiments conducted in the course of this work provide evidence to the contrary: without loss in translation quality, the sampling ***** phrase table ***** ranks second out of four in terms of speed, being slightly slower than hash table look-up (Junczys-Dowmunt, 2012) and considerably faster than current implementations of the approach suggested by Zens and Ney (2007). | ||
| 2011.iwslt-evaluation.22 We also investigated coupling WFST based ASR to a simple WFST based translation decoder and found it was crucial to perform ***** phrase table ***** expansion to avoid OOV problems. | ||
| D18-1399 Our method profits from the modular architecture of SMT: we first induce a ***** phrase table ***** from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. | ||
| L14-1052 As part of this project, all ***** phrase table *****s produced in the experiments will also be made freely available. | ||
| L16-1354 Furthermore, since our scoring function uses Moses ***** phrase table *****s directly we avoid the need to translate the texts to be aligned, which is time-consuming and a potential source of alignment errors. | ||
| Attention | 124 | |
| D19-1002 A recent paper claims that `***** Attention ***** is not Explanation' (Jain and Wallace, 2019). | ||
| 2020.coling-main.281 We integrate the planning mechanism to the attention based caption model and propose the High-level Semantic PLanning based ***** Attention ***** Network (HS-PLAN). | ||
| D19-5624 *****Attention***** models have become a crucial component in neural machine translation (NMT). | ||
| D18-1245 *****Attention***** mechanism is often used in deep neural networks for distantly supervised relation extraction (DS-RE) to distinguish valid from noisy instances. | ||
| D17-1048 *****Attention***** models are proposed in sentiment analysis because some words are more important than others. | ||
| meaning | 124 | |
| 2020.sigdial-1.16 The contributions of this work include a semantic parser that maps spatial questions into logical forms consistent with a general approach to ***** meaning ***** representation, a dialogue manager based on a schema representation, and a constraint solver for spatial questions that provides answers in agreement with human perception. | ||
| L14-1692 Word sense annotation is a challenging task where annotators distinguish which ***** meaning ***** of a word is present in a given context. | ||
| 2006.bcs-1.12 Then, sentences of the target language are generated from those ***** meaning ***** representations. | ||
| L06-1150 Thus, the goal of our experiments was to explore dictionary users requirements and to study what services an intelligent dictionary interface should be able to supply to help solving access by ***** meaning ***** problems | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual com- position), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise ***** meaning ***** components responsible for Levin's classification. | ||
| data augmentation | 124 | |
| 2020.sltu-1.7 Overall, we show that the proposed multilingual graphemic hybrid ASR with various ***** data augmentation ***** can not only recognize any within training set languages, but also provide large ASR performance improvements. | ||
| C18-1105 In this paper, we study the problem of ***** data augmentation ***** for language understanding in task-oriented dialogue system. | ||
| P19-1555 In this paper, we present a novel ***** data augmentation ***** method for neural machine translation. Different from previous augmentation methods that randomly drop, swap or replace words with other words in a sentence, we softly augment a randomly chosen word in a sentence by its contextual mixture of multiple related words. | ||
| 2021.eacl-main.159 We introduce a ***** data augmentation ***** technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks. | ||
| 2021.semeval-1.98 We solve the problem as a binary classification problem and also experiment with ***** data augmentation ***** and adversarial training techniques. | ||
| event coreference resolution | 124 | |
| L14-1646 The ECB corpus is one of the data sets used for evaluation of the task of ***** event coreference resolution *****. | ||
| 2020.aespen-1.7 A separate literature in computational linguistics on ***** event coreference resolution ***** attempts to link known events to one another within (and across) documents. | ||
| Q15-1037 Experiments on the ECB+ corpus show that our model outperforms state-of-the-art methods for both within- and cross-document ***** event coreference resolution *****. | ||
| 2020.aacl-main.66 We present two extensions to a state-of-the-art joint model for ***** event coreference resolution *****, which involve incorporating (1) a supervised topic model for improving trigger detection by providing global context, and (2) a preprocessing module that seeks to improve event coreference by discarding unlikely candidate antecedents of an event mention using discourse contexts computed based on salient entities. | ||
| L16-1631 In this paper, we investigate this challenge, proposing the first multi-pass sieve approach to ***** event coreference resolution *****. | ||
| unlabeled text | 124 | |
| 2020.emnlp-main.354 In this work, we study visually grounded grammar induction and learn a constituency parser from both ***** unlabeled text ***** and its visual groundings. | ||
| P18-1146 To the best of our knowledge, this is the first attempt at inducing higher-order relation schemata from ***** unlabeled text *****. | ||
| 2020.coling-main.485 Compared with supervised learning methods which require a large corpus of labeled documents, our method aims to make it possible to classify ***** unlabeled text ***** with few labeled data. | ||
| D18-1217 Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of ***** unlabeled text *****. | ||
| L10-1371 From the ***** unlabeled text ***** we derive distributional word clusters. | ||
| discourse connective | 124 | |
| 2020.lrec-1.138 We present DiMLex-Bangla, a newly developed lexicon of *****discourse connectives***** in Bangla. | ||
| D18-1079 In this paper, we follow Rutherford and Xue (2015) to expand the training data set using the corpus of explicitly-related arguments, by arbitrarily dropping the overtly presented *****discourse connectives*****. | ||
| W18-2802 A number of different *****discourse connectives***** can be used to mark the same discourse relation, but it is unclear what factors affect connective choice. | ||
| W19-2711 We discover that most of the important features for rhetorical relation classification are related to *****discourse connectives***** derived from the connectives lexicon for Russian and from other sources. | ||
| W17-5501 In this paper, we present an approach to exploit phrase tables generated by statistical machine translation in order to map French *****discourse connectives***** to discourse relations. | ||
| IE | 123 | |
| L14-1157 In order to extend Open ***** IE ***** to extract relationships that are not expressed by verbs, we present a novel Open ***** IE ***** approach that extracts relations expressed in noun compounds (NCs), such as (oil, extracted from, olive) from olive oil, or in adjective-noun pairs (ANs), such as (moon, that is, gorgeous) from gorgeous moon. | ||
| D19-1067 We propose a novel supervised open information extraction (Open ***** IE *****) framework that leverages an ensemble of unsupervised Open ***** IE ***** systems and a small amount of labeled data to improve system performance. | ||
| L10-1110 We treat this as a text classification problem and apply first information extraction (***** IE *****) techniques (voting using keywords weight according to their context), then machine learning (ML), and finally a combined approach in which ML has priority over weighted keywords, but the latter can still make up categorizations for services for which ML does not produce enough. | ||
| C18-1195 Moreover, we show that existing Open ***** IE ***** approaches can benefit from the transformation process of our framework. | ||
| 2021.acl-long.489 In order to evaluate the impact of our approach on real-world problems that involve topic-specific fine-grained knowledge elements, we have also created a new ontology and annotated corpus for entity and event extraction for the COVID-19 scientific literature, which can serve as a new benchmark for the biomedical ***** IE ***** community. | ||
| BiLSTM | 123 | |
| R19-1133 Previous work on using ***** BiLSTM ***** models for PoS tagging has primarily focused on small tagsets. | ||
| D19-1151 The task is typically modeled as a sequence labeling problem and currently Bidirectional Long Short Term Memory (***** BiLSTM *****) models provide state-of-the-art results. | ||
| P17-1044 We use a deep highway ***** BiLSTM ***** architecture with constrained decoding, while observing a number of recent best practices for initialization and regularization. | ||
| 2020.findings-emnlp.312 Experimental results on Reddit data show the performance gain of our method when compared to standard text classification methods based on ***** BiLSTM *****, and BERT. | ||
| D19-1562 To demonstrate this hypothesis, unlike previous models with complicated architectures, we limit our base model to a simple ***** BiLSTM ***** with attention classifier, and instead focus on how and where the attributes should be incorporated in the model. | ||
| biases | 123 | |
| 2021.gebnlp-1.6 We show that existing algorithms' inconsistent results are consequences of prior research's inconsistent definitions of ***** biases ***** themselves. | ||
| W17-4209 Experimenting with a dataset of approximately 1.6M user comments from a Greek news sports portal, we explore how a state of the art RNN-based moderation method can be improved by adding user embeddings, user type embeddings, user ***** biases *****, or user type ***** biases *****. | ||
| 2021.acl-long.330 Motivated by a lack of studies on ***** biases ***** from decoding techniques, we also conduct experiments to quantify the effects of these techniques. | ||
| 2021.emnlp-main.135 This analysis gives us a simple statistical test for dataset artifacts, which we use to show more subtle ***** biases ***** than were described in prior work, including demonstrating that models are inappropriately affected by these less extreme ***** biases *****. | ||
| L12-1475 The ultimate goal of this larger study is to produce a detailed enumeration of the primary ***** biases ***** online, and identify sampling strategies which control and minimise unwanted effects of document attrition. | ||
| correlations | 122 | |
| L08-1573 It then selects translation candidates that have the highest ***** correlations ***** with a certain percentage or more of the associated words. | ||
| 2020.coling-main.511 We also devise objective functions that exploit label ***** correlations ***** in the training data explicitly. | ||
| 2020.emnlp-main.186 Finally, our measures capture complementary information to typologically driven language distance measures, and the combination of measures from the two families yields even higher task performance ***** correlations *****. | ||
| 2004.amta-papers.8 We present results from the statistical analysis of 20,000 words of MT output, manually annotated using our classification scheme, and describe ***** correlations ***** between error frequencies and human scores for fluency and adequacy. | ||
| 2020.coling-main.517 Our study empirically analyses the effectiveness of the induced emotion lexicons by measuring translation precision and ***** correlations ***** with existing emotion lexicons, along with measurements on a downstream task of sentence emotion prediction. | ||
| Offensive | 122 | |
| 2021.dravidianlangtech-1.34 This paper describes the submission of the team Amrita_CEN_NLP to the shared task on ***** Offensive ***** Language Identification in Dravidian Languages at EACL 2021. | ||
| 2021.dravidianlangtech-1.47 This paper describes the models submitted by the team MUCS for ***** Offensive ***** Language Identification in Dravidian Languages-EACL 2021 shared task that aims at identifying and classifying code-mixed texts of three language pairs namely, Kannada-English (Kn-En), Malayalam-English (Ma-En), and Tamil-English (Ta-En) into six predefined categories (5 categories in Ma-En language pair). | ||
| S19-2108 SemEval 2019 Task 6 was OffensEval: Identifying and Categorizing *****Offensive***** Language in Social Media. | ||
| S19-2104 This paper describes the system we developed for SemEval 2019 on Identifying and Categorizing *****Offensive***** Language in Social Media (OffensEval - Task 6). | ||
| S19-2010 We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing *****Offensive***** Language in Social Media (OffensEval). | ||
| knowledge bases | 122 | |
| 2021.emnlp-main.292 We avoid crucial assumptions of previous work that do not transfer well to real-world settings, including exploiting knowledge of the fixed number of retrieval steps required to answer each question or using structured metadata like ***** knowledge bases ***** or web links that have limited availability. | ||
| 2018.jeptalnrecital-court.37 Entity linking systems typically rely on encyclopedic ***** knowledge bases ***** such as DBpedia or Freebase. | ||
| 2021.eacl-main.153 Pretrained language models have been suggested as a possible alternative or complement to structured ***** knowledge bases *****. | ||
| 2020.lrec-1.694 In this paper we present an approach to validate terminological data retrieved from open encyclopaedic ***** knowledge bases *****. | ||
| 2021.bionlp-1.12 We start from pre-trained conditional generative language models, use ***** knowledge bases ***** to help correct input errors, and rerank single system outputs to boost coverage. | ||
| latent dirichlet allocation | 122 | |
| 2020.emnlp-main.234 To scale non-parametric extensions of probabilistic topic models such as *****Latent Dirichlet allocation***** to larger data sets, practitioners rely increasingly on parallel and distributed systems. | ||
| P17-2084 Topical PageRank (TPR) uses latent topic distribution inferred by *****Latent Dirichlet Allocation***** (LDA) to perform ranking of noun phrases extracted from documents. | ||
| 2021.clpsych-1.10 To this end, we design SHTM, a Self-Harm Topic Model that combines *****Latent Dirichlet Allocation***** with a self-harm dictionary for modeling daily tweets of users. | ||
| 2021.adaptnlp-1.6 Specifically, we use *****Latent Dirichlet Allocation***** (LDA), with word and character N-grams. | ||
| S19-1011 Our evaluation compares our topic modeling approach to *****Latent Dirichlet Allocation***** (LDA) on three metrics: 1) qualitative topic match, measured using evaluations by Amazon Mechanical Turk (MTurk) workers, 2) performance on classification tasks using each topic model as a sparse feature representation, and 3) topic coherence. | ||
| visual | 122 | |
| 2020.emnlp-main.162 Trained with these contextually generated vokens, our ***** visual *****ly-supervised language models show consistent improvements over self-supervised alternatives on multiple pure-language tasks such as GLUE, SQuAD, and SWAG. | ||
| 2020.emnlp-main.355 We applied ALICE in two ***** visual ***** recognition tasks, bird species classification and social relationship classification. | ||
| 2021.acl-srw.8 As a lot of these models are based on Transformers, several studies on the attention mechanisms used by the models to learn to associate phrases with their ***** visual ***** grounding in the image have been conducted. | ||
| 2021.splurobonlp-1.3 We automatically extract different trigger and zoomer pairs based on the ***** visual ***** property that the questions rely on (e.g. | ||
| 2020.semeval-1.99 Information on social media comprises of various modalities such as textual, ***** visual ***** and audio. | ||
| XML | 121 | |
| 2014.amta-researchers.5 We compare two embedding methods that can be easily used at run-time without altering the normal activity of an SMT system: ***** XML ***** markup and the cache-based model. | ||
| N19-1289 Extreme Multi-label classification (***** XML *****) is an important yet challenging machine learning task, that assigns to each instance its most relevant candidate labels from an extremely large label collection, where the numbers of labels, features and instances could be thousands or millions. | ||
| L08-1321 It was designed to streamline and speed-up the lexicon building process, and to free the linguists from writing ***** XML ***** files which is both cumbersome and error-prone. | ||
| W19-5212 Our experiments show that learning to translate with the ***** XML ***** tags improves translation accuracy, and the beam search accurately generates ***** XML ***** structures | ||
| L10-1230 The current study presents a conversion and unification of the Penn Discourse TreeBank 2.0 (PDTB) and the Penn TreeBank (PTB) under *****XML***** format. | ||
| phrase | 121 | |
| L14-1666 It consists of genuine dependency annotations, i. e. they have not been transformed from ***** phrase ***** structures. | ||
| D19-1082 In this work, we present multi-granularity self-attention (Mg-Sa): a neural network that combines multi-head self-attention and ***** phrase ***** modeling. | ||
| 2020.lrec-1.847 Experimental results using the standard dataset for ***** phrase ***** alignment evaluation show that SAPPHIRE outperforms the previous method and establishes the state-of-the-art performance. | ||
| C16-1169 Two syntactic subtree matching rules based on ***** phrase ***** structure grammar are proposed to filter the translation hypotheses more strictly | ||
| 2010.amta-papers.6 In this paper, we present the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a standard ***** phrase *****-based SMT system. | ||
| semantic role | 121 | |
| C16-1121 We present a successful collaboration of word embeddings and co-training to tackle in the most difficult test case of ***** semantic role ***** labeling: predicting out-of-domain and unseen semantic frames. | ||
| L08-1428 We discuss evaluation results of the defined concepts for ***** semantic role ***** annotation concerning the redundancy and completeness of the tagset and the reliability of annotations in terms of inter-annotator agreement. | ||
| L12-1201 Both noun and verb terms are explored in context in order to identify and represent the ***** semantic role *****s held by their participants (arguments and circumstants), and therefore explore some of the relations established by these terms. | ||
| L16-1601 In the resource's current status, there are 98 frames, 662 frame evoking words, 872 senses, and about 13000 annotated frames, with their ***** semantic role *****s assigned to portions of text. | ||
| S17-1018 Frame-semantic parsing and *****semantic role***** labelling, that aim to automatically assign semantic roles to arguments of verbs in a sentence, have become an active strand of research in NLP. | ||
| knowledge graph embedding | 121 | |
| D18-1358 Most existing researches are focusing on ***** knowledge graph embedding ***** (KGE) models. | ||
| P18-1186 We then build a deep zeroshot multimodal network for MNED that 1) extracts contexts from both text and image, and 2) predicts correct entity in the ***** knowledge graph embedding *****s space, allowing for zeroshot disambiguation of entities unseen in training set as well. | ||
| P18-1094 We evaluate our proposal on learning word embeddings, order embeddings and ***** knowledge graph embedding *****s and observe both faster convergence and improved results on multiple metrics. | ||
| 2020.emnlp-main.667 Little is known about the trustworthiness of predictions made by ***** knowledge graph embedding ***** (KGE) models. | ||
| P17-1162 To model both structured knowledge and unstructured language, we propose a neural model with dynamic ***** knowledge graph embedding *****s that evolve as the dialogue progresses. | ||
| penn discourse treebank | 121 | |
| P19-1411 We propose several techniques to enable such knowledge transfer that yield the state-of-the-art performance for IDRR on several settings of the benchmark dataset (i.e., the *****Penn Discourse Treebank***** dataset). | ||
| W17-5502 To this end, we address a significant gap in the inter-sentential discourse relations annotated in the *****Penn Discourse Treebank***** (PDTB), namely the class of cross-paragraph implicit relations, which account for 30% of inter-sentential relations in the corpus. | ||
| 2020.lrec-1.131 In our study, we translate the English *****Penn Discourse TreeBank***** into German and experiment with various methods of annotation projection to arrive at the German counterpart of the PDTB. | ||
| W17-0903 The resulting classifier outperforms strong baselines on two datasets (the *****Penn Discourse Treebank***** and the CSTNews corpus) annotated with different schemes and containing examples in two languages, English and Portuguese. | ||
| L08-1093 We present the second version of the *****Penn Discourse Treebank*****, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. | ||
| TM | 120 | |
| 2021.blackboxnlp-1.19 We achieve this by extracting semantically related words from pre-trained word representations as input features to the ***** TM *****. | ||
| 2001.mtsummit-eval.1 The scenario here focuses on the software localisation industry, which already uses ***** TM ***** systems and looks to further streamline the overall translation process by integrating Machine Translation (MT). | ||
| 2010.amta-papers.19 Rather than relegate SMT to a last-resort status where it is only used should the ***** TM ***** system fail to produce the desired output, for us SMT is an integral part of the translation process that we rely on to obtain high-quality results. | ||
| 2002.amta-papers.11 In the present paper, we discuss the opening of SMT to examples automatically extracted from a Translation Memory (***** TM *****). | ||
| 2012.amta-government.11 We are also examining summary statistics produced by *****TM***** systems to test the degree to which material from each domain serves as a useful vault for translating material from each of the other domains, as well as the degree to which vault size improves the number and quality of proposed matches. | ||
| transcribed | 120 | |
| W03-3019 In the specific task of determining grammatical relations (such as subjects and objects) in ***** transcribed ***** spoken language, we show that a combination of rule-based and corpus-based approaches, where a rule-based system is used as the teacher (or an automatic data annotator) to a corpus-based system, outperforms either system in isolation. | ||
| W18-4107 We explore the use of natural language processing and machine learning for detecting evidence of Parkinson's disease from ***** transcribed ***** speech of subjects who are describing everyday tasks. | ||
| D18-2016 In our experiments, we demonstrate that a single-core version of the crawler can obtain around 150 hours of ***** transcribed ***** speech within a day, containing an estimated 3.5% word error rate in the transcriptions. | ||
| L14-1586 So far 250 CLP speakers were manually ***** transcribed *****, 120 of these were analyzed by a speech therapist and 27 of them by four additional therapists | ||
| D19-1125 Dialogue systems benefit greatly from optimizing on detailed annotations, such as *****transcribed***** utterances, internal dialogue state representations and dialogue act labels. | ||
| typological | 120 | |
| 2021.eacl-main.38 We verify this hypothesis by blinding a model to ***** typological ***** information, and investigate how cross-lingual sharing and performance is impacted. | ||
| 2021.eacl-main.302 The distributions of orthographic word types are very different across languages due to ***** typological ***** characteristics, different writing traditions and potentially other factors. | ||
| 2020.sigtyp-1.3 This paper describes the NEMO submission to SIGTYP 2020 shared task (Bjerva et al., 2020) which deals with prediction of linguistic ***** typological ***** features for multiple languages using the data derived from World Atlas of Language Structures (WALS). | ||
| 2021.acl-long.38 Languages vary in many ***** typological ***** dimensions, and it is difficult to single out one or two to investigate without the others acting as confounders. | ||
| 2020.ldl-1.4 Language catalogues and *****typological***** databases are two important types of resources containing different types of knowledge about the world's natural languages. | ||
| toolkit | 120 | |
| 2020.acl-demos.31 The final Python-based implementation of our ***** toolkit ***** is flexible, easy to use, and easy to extend not only for technically experienced users, such as machine learning researchers, but also for less technically experienced users, such as linguists or cognitive scientists, thereby providing a flexible platform for collaborative research. | ||
| 2020.wmt-1.95 Our system is based on a Transformer model with TensorFlow Model Garden ***** toolkit *****. | ||
| L08-1134 These results are compared to a system based on the freely available Moses ***** toolkit *****. | ||
| D19-3019 We present Joey NMT, a minimalist neural machine translation ***** toolkit ***** based on PyTorch that is specifically designed for novices. | ||
| L16-1612 The ***** toolkit ***** is open source, includes working examples and can be found on http://github.com/jorispelemans/scale. | ||
| inflected | 120 | |
| W18-4506 Exploratory techniques based on locating and counting words may, therefore, lead to conclusions that reinforce culturally ***** inflected ***** boundaries. | ||
| 1995.iwpt-1.24 It can be applied to any language whose morphology is fully described by a finite state transducer, or with a word list comprising all ***** inflected ***** forms with very large word lists of root and ***** inflected ***** forms (some containing well over 200,000 forms), generating all candidate solutions within 10 to 45 milliseconds (with edit distance 1) on a SparcStation 10/41. | ||
| 2021.mtsummit-research.19 We extend the seq2seq architecture with a character-level decoder that takes the lemma of a user-specified term and the words generated from the word-level decoder to output a correct ***** inflected ***** form of the lemma. | ||
| 2020.acl-main.598 Given only raw text and a lemma list, the task consists of generating the morphological paradigms, i.e., all ***** inflected ***** forms, of the lemmas. | ||
| L14-1331 Finally, the surface analysis conducted through a Levensthein distance analysis, highlighted that the most common distance is of 2 characters and mainly involves differences between ***** inflected ***** forms of a unique item. | ||
| ontological | 120 | |
| L06-1199 Recent work has aimed at discovering ***** ontological ***** relations from text corpora. | ||
| L08-1271 The key to success of knowledge sharing in the field of agriculture is using and sharing agreed terminologies such as ***** ontological ***** knowledge especially in multiple languages. | ||
| L08-1381 We then describe the Balanced Distance Metric (BDM) which takes ***** ontological ***** similarity into account. | ||
| L06-1368 We also describe the process of building of corresponding ***** ontological ***** resources and their application for semi–automatic generation of scientific portals. | ||
| L10-1574 The relationship between the abstract and the concrete, which is at the basis of the Conceptual Metaphor perspective, can be considered strictly related to the variation of the ***** ontological ***** values found in our analysis of the PNs and their belonging classes which are codified in the ItalWordNet database. | ||
| rhetorical structure theory | 120 | |
| U19-1010 In this paper, we propose to use neural discourse representations obtained from a ***** rhetorical structure theory ***** (RST) parser to enhance document representations. | ||
| L16-1167 We present the first corpus of texts annotated with two alternative approaches to discourse structure, *****Rhetorical Structure Theory***** (Mann and Thompson, 1988) and Segmented Discourse Representation Theory (Asher and Lascarides, 2003). | ||
| 2021.ranlp-srw.29 Therefore, given that deceiving actions require advanced cognitive development that honesty simply does not require, as well as people's cognitive mechanisms have promising guidance for deception detection, in this Ph.D. ongoing research, we propose to examine discourse structure patterns in multilingual deceptive news corpora using the *****Rhetorical Structure Theory***** framework. | ||
| P17-1092 We show that discourse structure, as defined by *****Rhetorical Structure Theory***** and provided by an existing discourse parser, benefits text categorization. | ||
| W19-2719 *****Rhetorical Structure Theory***** (RST) has been commonly used in the analysis of discourse organization of written texts; however, limited research has been conducted to date on RST annotation and parsing of spoken language, in particular, non-native spontaneous speech. | ||
| math word problem | 120 | |
| D17-1084 It first retrieves a few relevant equation system templates and aligns numbers in ***** math word problem *****s to those templates for candidate equation generation. | ||
| 2020.coling-main.38 Recently, to address the ***** math word problem *****-solving task, researchers have applied the encoder-decoder architecture, which is mainly used in machine translation tasks. | ||
| 2021.acl-long.456 Previous ***** math word problem ***** solvers following the encoder-decoder paradigm fail to explicitly incorporate essential math symbolic constraints, leading to unexplainable and unreasonable predictions. | ||
| U19-1024 We have experimented our model on the tasks of semantic parsing and ***** math word problem ***** solving. | ||
| I17-3017 DILTON uses a Deep Neural based model to solve *****math word problems*****. | ||
| named | 119 | |
| 2021.bsnlp-1.12 The results demonstrate good generalization, even in ***** named ***** entities with weak regularity, such as book titles, or entities that were never seen during the training. | ||
| L06-1284 However, there are still some ***** named ***** entity phenomena that present problems for existing techniques; in particular, relatively little work has explored the disambiguation of conjunctions appearing in candidate ***** named ***** entity strings. | ||
| L12-1104 The most common sources of failure were limitations on inference, errors in coreference (particularly with nominal anaphors), and errors in ***** named ***** entity recognition. | ||
| 2020.lrec-1.90 Several models have been published achieving promising results in all the major NLP applications, from question answering to text classification, passing through ***** named ***** entity recognition. | ||
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual ***** named ***** entity recognition. | ||
| knowledge graph completion | 119 | |
| 2021.naacl-main.202 Moreover, we investigate the effect of the temporal dataset's time granularity on temporal ***** knowledge graph completion *****. | ||
| P19-1431 State-of-the-art models for ***** knowledge graph completion ***** aim at learning a fixed embedding representation of entities in a multi-relational graph which can generalize to infer unseen entity relationships at test time. | ||
| 2020.wnut-1.36 We introduce a system which uses contextualized ***** knowledge graph completion ***** to classify relations and events between known entities in a noisy text environment. | ||
| 2020.coling-main.153 Recently, a new ***** knowledge graph completion ***** method using a pre-trained language model, such as KG-BERT, is presented and showed high performance. | ||
| 2020.coling-main.327 Besides, CoLAKE achieves surprisingly high performance on our synthetic task called word-*****knowledge graph completion*****, which shows the superiority of simultaneously contextualizing language and knowledge representation. | ||
| Discriminative | 118 | |
| S18-1159 This paper describes the participation of the ELiRF-UPV team at Task 10, Capturing *****Discriminative***** Attributes, of SemEval-2018. | ||
| S18-1117 This paper describes the SemEval 2018 Task 10 on Capturing *****Discriminative***** Attributes. | ||
| 2012.amta-papers.3 *****Discriminative***** training for MT usually involves numerous features and requires a large-scale training set to reach reliable parameter estimation. | ||
| S18-1162 Luminoso participated in the SemEval 2018 task on Capturing *****Discriminative***** Attributes with a system based on ConceptNet, an open knowledge graph focused on general knowledge. | ||
| S18-1170 We describe the University of Maryland's submission to SemEval-2018 Task 10, Capturing *****Discriminative***** Attributes: given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2. | ||
| Generative | 118 | |
| D18-1056 This paper presents a challenge to the community: ***** Generative ***** adversarial networks (GANs) can perfectly align independent English word embeddings induced using the same algorithm, based on distributional information alone, but fail to do so for two different embedding algorithms. | ||
| 2021.naacl-main.449 ***** Generative ***** models for dialog systems have gained much interest because of the recent success of RNN and Transformer based models in tasks like question answering and summarization. | ||
| D19-1048 *****Generative***** classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama et al., 2017; Lewis and Fan, 2019). | ||
| P17-2019 *****Generative***** models defining joint distributions over parse trees and sentences are useful for parsing and language modeling, but impose restrictions on the scope of features and are often outperformed by discriminative models. | ||
| 2020.emnlp-main.134 *****Generative***** models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document's language model, were very successful in various IR tasks in the past. | ||
| lexical semantics | 118 | |
| W16-5303 Regular polysemy was extensively investigated in ***** lexical semantics *****, but this phenomenon has been very little studied in distributional semantics. | ||
| 2020.cl-2.3 LESSLEX has been tested on three tasks relevant to ***** lexical semantics *****: conceptual similarity, contextual similarity, and semantic text similarity. | ||
| 2020.conll-1.17 This paper investigates contextual language models, which produce token representations, as a resource for ***** lexical semantics ***** at the word or type level. | ||
| 2017.lilt-15.4 This paper presents a new classification of verbs of change and modification, proposing a dynamic interpretation of the ***** lexical semantics ***** of the predicate and its arguments | ||
| L14-1499 The annotation will serve as training and test data for classifiers for CMCs, and the CMC definitions developed throughout this study will be used in extending VerbNet to handle representations of sentences in which a verb is used in a syntactic context that is atypical for its ***** lexical semantics *****. | ||
| transfer | 118 | |
| 2021.bionlp-1.28 We show that both ***** transfer ***** learning methods combined achieve the highest ROUGE scores. | ||
| P17-1190 These improve LSTMs in both ***** transfer ***** learning and supervised settings. | ||
| D19-5631 To address these challenges, we propose to leverage data from both tasks and do ***** transfer ***** learning between MT, NLG, and MT with source-side metadata (MT+NLG). | ||
| D19-1148 We deploy our methods on a state-of-the-art unsupervised discriminative parser and evaluate it on both ***** transfer ***** grammar induction and bilingual grammar induction. | ||
| 2020.acl-main.639 Experimental results show that our proposed model achieves state-of-the-art performance in terms of both ***** transfer ***** accuracy and content preservation | ||
| English | 118 | |
| L12-1122 We describe SUTIME, a temporal tagger for recognizing and normalizing temporal expressions in *****English***** text. | ||
| L16-1021 This paper presents WikiCoref, an *****English***** corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia. | ||
| S17-2060 This paper describes the systems we submitted to task 3 (Community Question Answering) in SemEval 2017, which contains three subtasks on *****English***** corpora, i.e., subtask A: Question-Comment Similarity, subtask B: Question-Question Similarity, and subtask C: Question-External Comment Similarity. | ||
| 2020.wnut-1.68 In this system paper, we present a transformer-based approach to the detection of informativeness in *****English***** tweets on the topic of the current COVID-19 pandemic. | ||
| L10-1490 In this paper, we report on our attempt at assigning semantic information from the *****English***** FrameNet to lexical units in the Bulgarian valency lexicon. | ||
| Assessing | 117 | |
| 2020.readi-1.4 ***** Assessing ***** reading skills is an important task teachers have to perform at the beginning of a new scholastic year to evaluate the starting level of the class and properly plan next learning activities. | ||
| 2020.semeval-1.129 In this paper we describe our system submitted to SemEval 2020 Task 7: *****Assessing***** Humor in Edited News Headlines. | ||
| 2020.semeval-1.140 We describe the UTFPR system for SemEval-2020's Task 7: *****Assessing***** Humor in Edited News Headlines. | ||
| 2020.semeval-1.135 This paper presents two different systems for the SemEval shared task 7 on *****Assessing***** Humor in Edited News Headlines, sub-task 1, where the aim was to estimate the intensity of humor generated in edited headlines. | ||
| 2020.semeval-1.101 This paper describes the winning system for SemEval-2020 task 7: *****Assessing***** Humor in Edited News Headlines. | ||
| morphological analyzer | 117 | |
| D17-1073 However, adding learning features from a ***** morphological analyzer ***** to model the space of possible analyses provides additional improvement. | ||
| 2021.conll-1.47 ARETA employs a large Arabic ***** morphological analyzer *****, but is completely unsupervised otherwise. | ||
| L08-1611 With the aid of the same large-scale Arabic ***** morphological analyzer ***** and PoS tagger in the runtime, the possible senses of virtually any given Arabic word are retrievable. | ||
| L10-1629 Given a ***** morphological analyzer *****, it is even possible to extract novel roots from words. | ||
| 2010.iwslt-evaluation.15 Next, we present our solution for disambiguating the output of an Arabic ***** morphological analyzer ***** | ||
| neural models | 117 | |
| P17-2025 Recent work has proposed several generative ***** neural models ***** for constituency parsing that achieve state-of-the-art results. | ||
| W18-6112 Meanwhile, there is plenty of evidence to the effectiveness of character-based ***** neural models ***** in mitigating this OOV problem. | ||
| W19-4411 We examine this claim in ***** neural models ***** for content scoring. | ||
| D18-1263 Beyond SDP, our linearization technique opens the door to integration of graph-based semantic representations as features in ***** neural models ***** for downstream applications. | ||
| P17-2059 While natural languages are compositional, how state-of-the-art ***** neural models ***** achieve compositionality is still unclear. | ||
| multilingual nmt | 117 | |
| 2021.emnlp-main.2 Furthermore, with much less training computation cost and training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong *****multilingual NMT***** baselines. | ||
| 2020.lrec-1.450 In the context of under-resourced neural machine translation (NMT), transfer learning from an NMT model trained on a high resource language pair, or from a *****multilingual NMT***** (M-NMT) model, has been shown to boost performance to a large extent. | ||
| 2018.iwslt-1.8 The parameter transfer mechanism is evaluated in two scenarios: i) to adapt a trained single language NMT system to work with a new language pair and ii) to continuously add new language pairs to grow to a *****multilingual NMT***** system. | ||
| 2021.americasnlp-1.29 Our *****multilingual NMT***** models reached the first rank on all language pairs in track 1, and first rank on nine out of ten language pairs in track 2. | ||
| 2021.wmt-1.62 This paper proposes a technique for adding a new source or target language to an existing *****multilingual NMT***** model without re-training it on the initial set of languages. | ||
| Motivated | 116 | |
| 2020.coling-main.142 ***** Motivated ***** by them, we propose a two-phase prototypical network with prototype attention alignment and triplet loss to dynamically recognize the novel relations with a few support instances meanwhile without catastrophic forgetting. | ||
| C16-1090 ***** Motivated ***** by the intuition that the history of users should impact the recommendation procedure, in this work, we extend end-to-end memory networks to perform this task. | ||
| 2021.acl-long.137 ***** Motivated ***** by the recent finding that models trained with random negative samples are not ideal in real-world scenarios, we propose a hierarchical curriculum learning framework that trains the matching model in an “easy-to-difficult” scheme. | ||
| 2016.iwslt-1.10 ***** Motivated ***** by the fact that speech disfluencies are commonly observed throughout different languages, we investigate the potential of multilingual disfluency modeling. | ||
| 2020.lrec-1.244 ***** Motivated ***** by these limitations, we present the first cross-domain study of edge detection for biomedical event extraction | ||
| compositional generalization | 116 | |
| 2021.acl-short.81 First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires ***** compositional generalization *****. | ||
| 2021.blackboxnlp-1.9 This provides experimental evidence that the ***** compositional generalization ***** assessed in SCAN is particularly useful in resource-starved and domain-shifted scenarios. | ||
| 2020.findings-emnlp.225 We analyze a wide variety of models and propose multiple extensions to the attention module of the semantic parser, aiming to improve ***** compositional generalization *****. | ||
| 2021.eacl-main.48 (2019) introduced a dataset to assess ***** compositional generalization ***** in image captioning, where models are evaluated on their ability to describe images with unseen adjective–noun and noun–verb compositions | ||
| 2021.acl-long.74 In this work, we posit that a span-based parser should lead to better ***** compositional generalization *****. | ||
| sequence | 116 | |
| 2021.acl-long.152 Our experiments focus on ***** sequence ***** labeling tasks, with potential applicability on other cross-lingual and multi-lingual tasks. | ||
| 2020.findings-emnlp.410 Pooling-based recurrent neural architectures consistently outperform their counterparts without pooling on ***** sequence ***** classification tasks. | ||
| 2021.naacl-main.264 Instead of relying on more general pretraining objectives from prior work (e.g., language modeling, response selection), ConVEx's pretraining objective, a novel pairwise cloze task using Reddit data, is well aligned with its intended usage on ***** sequence ***** labeling tasks. | ||
| W18-6310 Embedding and projection matrices are commonly used in neural language models (NLM) as well as in other ***** sequence ***** processing networks that operate on large vocabularies. | ||
| 2021.emnlp-main.318 In detail, we introduce a dual-encoder design, in which a pair encoder especially focuses on candidate aspect-opinion pair classification, and the original encoder keeps attention on ***** sequence ***** labeling | ||
| spoken language translation | 116 | |
| 2014.iwslt-papers.3 In the past, this task has been treated separately in ASR or MT contexts and we propose here a joint estimation of word confidence for a ***** spoken language translation ***** (SLT) task involving both ASR and MT. | ||
| 2018.iwslt-1.28 A ***** spoken language translation ***** (ST) system consists of at least two modules: an automatic speech recognition (ASR) system and a machine translation (MT) system. | ||
| 2007.iwslt-1.28 Our focus was threefold: using hierarchical phrase-based models in ***** spoken language translation *****, the incorporation of sub-lexical information in model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and the use of n-gram sequence models for source-side punctuation prediction. | ||
| 2017.iwslt-1.11 Punctuation and segmentation is crucial in ***** spoken language translation *****, as it has a strong impact to translation performance. | ||
| 2020.iwslt-1.11 This report summarizes the Air Force Research Laboratory (AFRL) submission to the offline ***** spoken language translation ***** (SLT) task as part of the IWSLT 2020 evaluation campaign. | ||
| parses | 115 | |
| 1984.bcs-1.13 For example, a grammar writer can write a pattern that recognizes and ***** parses ***** an arbitrary number of sub-trees. | ||
| 1993.iwpt-1.2 Since there can be exponentially many ***** parses *****, comparing all of them is not efficient. | ||
| 2021.naacl-main.231 It is popular that neural graph-based models are applied in existing aspect-based sentiment analysis (ABSA) studies for utilizing word relations through dependency ***** parses ***** to facilitate the task with better semantic guidance for analyzing context and aspect words. | ||
| K17-3002 Our system uses relatively simple LSTM networks to produce part of speech tags and labeled dependency ***** parses ***** from segmented and tokenized sequences of words. | ||
| W89-0241 The use of TD prediction, which in the Earley algorithm is allowed to hypothesize new parse paths, is here restricted to confirming initial ***** parses ***** produced BU, and specializing these according to future (feature) expectations | ||
| manually annotated | 115 | |
| 2020.mwe-1.4 However, based on a corpus study examining new Slavic language material and a binomial logistic regression modelling of the ***** manually annotated ***** data, we argue that two separate analyses are needed to account for these constructions, namely a scalar analysis for the N-BY-N construction and a mereological one for the NUM-BY-NUM construction. | ||
| 2021.bea-1.15 We release the ***** manually annotated ***** learner dataset, used for testing, for general use. | ||
| 2020.cmlc-1.2 We show that these probability estimates are highly correlated with the actual attachment scores on a ***** manually annotated ***** test set. | ||
| 2020.bucc-1.7 We work on a ***** manually annotated ***** subset obtained from a French comparable corpus and show how we can drastically reduce the number of sentence pairs that have to be fed to a classifier so that the results can be manually handled. | ||
| 2021.ranlp-1.108 Labels are provided by a multi-label CamemBERT classifier trained and checked on a ***** manually annotated ***** subset of the corpus, while the tweets are selected to avoid undesired biases | ||
| dialogue system | 115 | |
| 2020.acl-main.54 The goal-oriented ***** dialogue system ***** needs to be optimized for tracking the dialogue flow and carrying out an effective conversation under various situations to meet the user goal. | ||
| P17-1120 Recently emerged intelligent assistants on smartphones and home electronics (e.g., Siri and Alexa) can be seen as novel hybrids of domain-specific task-oriented spoken ***** dialogue system *****s and open-domain non-task-oriented ones. | ||
| W18-1401 The challenge for computational models of spatial descriptions for situated ***** dialogue system *****s is the integration of information from different modalities. | ||
| C18-1105 In this paper, we study the problem of data augmentation for language understanding in task-oriented ***** dialogue system *****. | ||
| W17-5503 We test state of the art ***** dialogue system *****s for their behaviour in response to user-initiated sub-dialogues, i.e. | ||
| variational inference | 115 | |
| N19-1123 Comprehensive experiments are conducted examining both continuous and discrete action types and two different optimization methods based on stochastic ***** variational inference *****. | ||
| 2020.coling-main.102 Recent work proposed to cooperate ***** variational inference ***** on a target-related latent variable to introduce the diversity. | ||
| D19-1225 Specifically, a ***** variational inference ***** procedure factors each training glyph into the combination of a character-specific content embedding and a latent font-specific style variable. | ||
| 2021.emnlp-main.743 For joint training, we use amortized ***** variational inference ***** and policy gradient methods. | ||
| D18-1495 We present an amortized ***** variational inference ***** method for GraphBTM. | ||
| ablation | 114 | |
| 2020.lrec-1.149 We characterize the performance of various classification algorithms on this dataset and perform ***** ablation ***** studies to understand the nature of the linguistic models suitable for capturing the nuances of the embedded discourse structures in the presented corpus. | ||
| D19-1507 Moreover, we performed a comprehensive analysis with ***** ablation ***** study to figure out the importance of each component. | ||
| 2020.lrec-1.691 We discuss the Byte Pair Encoding (BPE) used in the pre-processing phase and suggest feature ***** ablation ***** in relation to the granularity of syntactic and semantic annotations. | ||
| W19-5211 We furthermore evaluate different ways to integrate lexical connections into the transformer architecture and present ***** ablation ***** experiments exploring the effect of proposed shortcuts on model behavior. | ||
| 2020.coling-main.166 In addition, detailed ***** ablation ***** experiments are conducted to deepen our understanding of the proposed framework | ||
| diachronic | 114 | |
| 2020.parlaclarin-1.11 We present a case study focusing on lexical items associated with political parties in two ***** diachronic ***** corpora of Austrian German, namely a ***** diachronic ***** media corpus (AMC) and a corpus of parliamentary records (ParlAT), and measure the cross-temporal stability of lexical usage over a period of 20 years. | ||
| 2020.lrec-1.117 Additionally, an overview of the UD-style structure of the treebank is given, and some ***** diachronic ***** aspects of the transition from Latin to Romance languages are highlighted. | ||
| 2020.vardial-1.8 This work is part of a more general ongoing project for the construction of a morphosyntactically annotated historical corpus of Basque called Basque in the Making (BIM): A Historical Look at a European Language Isolate, whose main objective is the systematic and ***** diachronic ***** study of a number of grammatical features. | ||
| 2021.eacl-main.10 We optimize existing models by (i) pre-training on large corpora and refining on ***** diachronic ***** target corpora tackling the notorious small data problem, and (ii) applying post-processing transformations that have been shown to improve performance on synchronic tasks. | ||
| 2020.wanlp-1.17 We measure the presence of biases across several dimensions, namely: embedding models (Skip-Gram, CBOW, and FastText) and vector sizes, types of text (encyclopedic text, and news vs. user-generated content), dialects (Egyptian Arabic vs. Modern Standard Arabic), and time (***** diachronic ***** analyses over corpora from different time periods) | ||
| terminological | 114 | |
| L12-1273 We describe the ***** terminological ***** database and the way in which the idiomatic expressions can be organised within the system, so that, similarly to the other synsets, they are connected to other concepts represented in the database, but at the same time continue to belong to a group of particular linguistic expressions. | ||
| W18-6448 To create our training data, we concatenated several parallel corpora, both from in-domain and out-of-domain sources, as well as ***** terminological ***** resources from UMLS. | ||
| L16-1058 Moreover, by performing a ***** terminological ***** comparison over a period of time it is possible to trace the presence of obsolete words in outdated research areas as well as of neologisms in the most recent fields. | ||
| L10-1276 This paper presents a new algorithm for automatic summarization of specialized texts combining ***** terminological ***** and semantic resources: a term extractor and an ontology | ||
| 2020.computerm-1.1 The first step of any *****terminological***** work is to set up a reliable, specialized corpus composed of documents written by specialists and then to apply automatic term extraction (ATE) methods to this corpus in order to retrieve a first list of potential terms. | ||
| counterfactual | 114 | |
| D19-1509 Finally, we evaluate the ***** counterfactual ***** rewriting capacities of several competitive baselines based on pretrained language models, and assess whether common overlap and model-based automatic metrics for text generation correlate well with human scores for ***** counterfactual ***** rewriting. | ||
| 2021.naacl-main.305 (2) For ***** counterfactual ***** bias, we focus on substituting demographic tokens (e.g., gender, race) and measure the shift of the expected prediction among constructed sentences. | ||
| D19-5624 Compared to a state of the art attention model, our ***** counterfactual ***** attention models produce 68% of function words and 21% of content words in our German-English dataset. | ||
| W19-0601 Under the standard approach to ***** counterfactual *****s, to determine the meaning of a ***** counterfactual ***** sentence, we consider the “closest” possible world(s) where the antecedent is true, and evaluate the consequent | ||
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic ***** counterfactual *****s, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic ***** counterfactual ***** error analysis by revealing behaviors easily missed by human experts. | ||
| subword segmentation | 114 | |
| 2020.wmt-1.134 For both tasks, our efforts concentrate on efficient use of monolingual and related bilingual corpora with scheduled multi-task learning as well as an optimized ***** subword segmentation ***** with sampling. | ||
| 2020.wmt-1.9 Additionally, our experiments show that using a linguistically motivated ***** subword segmentation ***** technique (Ataman et al., 2017) does not consistently outperform the more widely used, non-linguistically motivated SentencePiece algorithm (Kudo and Richardson, 2018), despite the agglutinative nature of Tamil morphology. | ||
| 2020.acl-main.648 The model is based on ***** subword segmentation *****, two language models, as well as a method for mapping between subword sequences. | ||
| 2021.ranlp-1.93 The experimental results show that the model can be successfully applied for texts in a non-English language, and that adding non-lexical features to tweet representations significantly improves performance, while ***** subword segmentation ***** has a moderate but positive effect on model accuracy. | ||
| 2020.coling-main.378 While existing ***** subword segmentation ***** methods tokenize a sentence without considering its translation, the proposed method tokenizes a sentence by using subword units induced from bilingual sentences; this method could be more favorable to machine translation | ||
| quality | 114 | |
| P18-1153 Automatic and human evaluations show that our models are able to generate homographic puns of good readability and ***** quality *****. | ||
| P18-1020 We describe a novel method for efficiently eliciting scalar annotations for dataset construction and system ***** quality ***** estimation by human judgments. | ||
| W19-4447 In view of the influence of the first language on learners, we further propose an effective approach to improve the ***** quality ***** of the suggested sentences. | ||
| R19-1021 The ability to produce high-***** quality ***** publishable material is critical to academic success but many Post-Graduate students struggle to learn to do so. | ||
| D19-1236 The review and selection process for scientific paper publication is essential for the ***** quality ***** of scholarly publications in a scientific field. | ||
| clinical | 114 | |
| W19-5004 In this work, we build a unifying framework for RE, applying this on three highly used datasets (from the general, biomedical and ***** clinical ***** domains) with the ability to be extendable to new datasets. | ||
| 2020.clinicalnlp-1.19 In addition, we apply temperature scaling, a simple but efficient model calibration method, to produce more reliable predictions. | ||
| 2020.clinicalnlp-1.15 We pre-trained several models of common architectures on this dataset and empirically showed that such pre-training leads to improved performance and convergence speed when fine-tuning on downstream medical tasks. | ||
| W18-2313 Our method, evaluated using the TREC 2016 ***** clinical ***** decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines using human expert–assigned concept tags for the queries, run on top of a standard Okapi BM25–based document retrieval system. | ||
| L14-1334 In this paper, we consider the importance of identifying the change of state for events - in particular, ***** clinical ***** events that measure and compare the multiple states of a patient's health across time. | ||
| transformer encoder | 114 | |
| 2021.nlp4prog-1.2 We also introduce baselines based on *****transformer encoder*****-decoders, and study the effects of including syntactic information and context. | ||
| 2020.coling-main.327 CoLAKE is pre-trained on large-scale WK graphs with the modified *****Transformer encoder*****. | ||
| 2021.bionlp-1.11 Our system is built upon a pre-trained *****Transformer encoder*****-decoder architecture, i.e., PEGASUS, deployed with an additional domain adaptation module to particularly handle the transfer and generalization issue. | ||
| 2020.nlptea-1.5 Our system is built on the model of multi-layer bidirectional *****transformer encoder***** and ResNet is integrated into the encoder to improve the performance. | ||
| 2021.inlg-1.39 Moreover, the persona information is encoded by a different *****Transformer encoder*****, along with the dialogue history, is fed to the decoder for generating responses. | ||
| dialectal | 113 | |
| W18-3212 This paper describes our system submission to the CALCS 2018 shared task on named entity recognition on code-switched data for the language variant pair of Modern Standard Arabic and Egyptian ***** dialectal ***** Arabic. | ||
| W19-4616 DIWAN is a ***** dialectal ***** word annotation tool, but we upgraded it by adding a new tag-set that is based on traditional Arabic grammar and by adding the roots and morphological patterns of nouns and verbs. | ||
| P17-2033 We propose a simple yet effective text-based user geolocation model based on a neural network with one hidden layer, which achieves state of the art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting ***** dialectal ***** terms. | ||
| 2020.lrec-1.360 Furthermore, we present some difficulties regarding ***** dialectal ***** and orthographic variations. | ||
| 2021.nodalida-main.51 Norway has a large amount of *****dialectal***** variation, as well as a general tolerance to its use in the public sphere. | ||
| constructions | 113 | |
| 2020.framenet-1.1 These lexically-evoked frames, however, do not reflect pragmatic properties of ***** constructions ***** (LUs and other types of ***** constructions *****), such as expressing illocutions or being considered polite or very informal. | ||
| 1991.iwpt-1.9 The data, originally collected by Bach, Brown and Marslen-Wilson (1986), concern the comprehensibility of verb dependency ***** constructions ***** in Dutch and German: right-branching, center-embedded, and cross-serial dependencies of one to four levels deep. | ||
| 2020.dmr-1.9 We demonstrate this representation on argument structure ***** constructions ***** with Transfer of Possession verbs and test the viability of this scheme with an annotation exercise. | ||
| 2020.acl-main.775 Despite their special status and prevalence, current dependency-annotation schemes require treating such flat structures as if they had internal syntactic heads, and most current parsers handle them in the same fashion as headed ***** constructions ***** | ||
| W89-0222 The probabilities provide a natural mechanism for exploring more common grammatical ***** constructions ***** first. | ||
| representation | 113 | |
| S18-1049 Our approach in these task was to come up with a model on count based ***** representation ***** and use machine learning techniques for regression and classification related tasks. | ||
| W19-2916 Visual identification tasks show human speakers can exhibit considerable variation in their understanding, ***** representation ***** and verification of certain quantifiers. | ||
| W89-0232 Thoughts also go in the direction of an integration of the coarse-grained parallelism with knowledge ***** representation ***** in a fine-grained parallel (connectionist) way. | ||
| 2020.semeval-1.254 In character level based ***** representation ***** we implemented a hyper CNN and LSTM model | ||
| D19-1212 Multi-view learning algorithms are powerful ***** representation ***** learning tools, often exploited in the context of multimodal problems. | ||
| automatic speech | 113 | |
| 2020.coling-main.314 We introduce dual-decoder Transformer, a new model architecture that jointly performs ***** automatic speech ***** recognition (ASR) and multilingual speech translation (ST). | ||
| 2020.acl-main.215 Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via ***** automatic speech ***** recognition. | ||
| L14-1277 Speech disfluency is one of the most challenging tasks to deal with in ***** automatic speech ***** processing. | ||
| 2021.winlp-1.1 We present a state-of-the-art ***** automatic speech ***** recognition (ASR) model for Fon, and a benchmark ASR model result for Igbo. | ||
| L14-1558 We also present an add-on for the GOS corpus, which enables its usage for ***** automatic speech ***** recognition. | ||
| recurrent unit | 113 | |
| 2020.sustainlp-1.7 We conduct an empirical study that stacks various approaches and demonstrates that combination of replacing decoder self-attention with simplified ***** recurrent unit *****s, adopting a deep encoder and a shallow decoder architecture and multi-head attention pruning can achieve up to 109% and 84% speedup on CPU and GPU respectively and reduce the number of parameters by 25% while maintaining the same translation quality in terms of BLEU. | ||
| C18-1250 Without changing the ***** recurrent unit *****s, SRNNs are 136 times as fast as standard RNNs and could be even faster when we train longer sequences. | ||
| P17-1013 Different from conventional approaches (LSTM unit and GRU), LAUs use linear associative connections between the input and output of the ***** recurrent unit *****, which allows unimpeded information flow through both space and time. The model is quite simple, but it is surprisingly effective. | ||
| N18-1193 In particular, CMN uses multimodal approach comprising audio, visual and textual features with gated ***** recurrent unit *****s to model past utterances of each speaker into memories. | ||
| 2020.semeval-1.110 We use BERT, FastText, Elmo, and Word2Vec to encode these titles then pass them to a bidirectional gated ***** recurrent unit ***** (BiGRU) with attention. | ||
| MSA | 112 | |
| 2020.wanlp-1.5 We report a significant improvement in F-measure for the AQMAR and the NEWS datasets, which are written in Modern Standard Arabic (***** MSA *****), and competitive results for the TWEETS dataset, which contains tweets that are mostly in the Egyptian dialect and contain many mistakes or misspellings. | ||
| L16-1116 This is part of a larger work to create a completely annotated and segmented speech corpus for ***** MSA *****. | ||
| 2021.wassa-1.25 In this paper, we extract ONE corpus, and we propose ONE algorithm to automatically construct ONE training corpus using ONE classification model architecture for sentiment analysis ***** MSA ***** and different dialects. | ||
| L16-1175 In this paper, we present the Dialectal Arabic Linguistic Learning Assistant (DALILA), a Chrome extension that utilizes cutting-edge Arabic dialect NLP research to assist learners and non-native speakers in understanding text written in either ***** MSA ***** or DA. | ||
| 2021.winlp-1.2 Compared to other Arabic dialects which are mostly based on ***** MSA *****, the Tunisian dialect is a combination of many other languages like ***** MSA *****, Tamazight, Italian and French | ||
| unseen | 112 | |
| K19-1053 We address this by augmenting a prior state of the art model with multiple sources of external knowledge so as to enable prediction on ***** unseen ***** politicians. | ||
| W18-6248 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on ***** unseen ***** data. | ||
| 2021.emnlp-main.342 We empirically show that, despite pre-training on large open-domain text, performance of models degrades significantly when they are evaluated on ***** unseen ***** topics. | ||
| 2021.acl-long.341 RADDLE also includes a diagnostic checklist that facilitates detailed robustness analysis in aspects such as language variations, speech errors, ***** unseen ***** entities, and out-of-domain utterances. | ||
| 2020.emnlp-main.105 We instantiate this frame- work with a new English language dataset, ZEST, structured for task-oriented evaluation on ***** unseen ***** tasks | ||
| structured | 112 | |
| D18-1406 Maximum-likelihood estimation (MLE) is one of the most widely used approaches for training ***** structured ***** prediction models for text-generation based natural language processing applications. | ||
| 2020.lrec-1.253 To achieve this, we ***** structured ***** all statements in assembly minutes. | ||
| 2021.naacl-main.5 Most recent work models these two subtasks jointly, either by casting them in one ***** structured ***** prediction framework, or performing multi-task learning through shared representations. | ||
| 2020.emnlp-main.388 Here, we extend the Energy based model framework (Krishna et al., 2020), proposed for several ***** structured ***** prediction tasks in Sanskrit, in 2 simple yet significant ways. | ||
| 2020.acl-main.188 In this study, we propose a general calibration scheme for output entities of interest in neural network based ***** structured ***** prediction models | ||
| keyphrase generation | 112 | |
| 2020.coling-main.462 Recent advances in neural natural language generation have made possible remarkable progress on the task of ***** keyphrase generation *****, demonstrated through improvements on quality metrics such as F1-score. | ||
| 2021.acl-long.111 To overcome this limitation, we propose SEG-Net, a neural ***** keyphrase generation ***** model that is composed of two major components, (1) a selector that selects the salient sentences in a document and (2) an extractor-generator that jointly extracts and generates keyphrases from the selected sentences. | ||
| P17-1054 We name it as deep ***** keyphrase generation ***** since it attempts to capture the deep semantic meaning of the content with a deep learning method. | ||
| 2020.emnlp-main.645 We introduce a new ***** keyphrase generation ***** approach using Generative Adversarial Networks (GANs). | ||
| N19-1070 Most of the proposed supervised and unsupervised methods for ***** keyphrase generation ***** are unable to produce terms that are valuable but do not appear in the text. | ||
| qualitative | 111 | |
| 2021.eacl-main.197 This paper reports ***** qualitative ***** and empirical insights into the most common and challenging types of refinements that a voice-based conversational search system must support. | ||
| 2020.smm4h-1.12 We then apply state-of-the-art classification models to this dataset, providing a competitive set of baselines alongside ***** qualitative ***** error analysis. | ||
| C16-1262 The overall low reliability we observe, nevertheless, casts doubt on the suitability of word neighborhoods in embedding spaces as a basis for ***** qualitative ***** conclusions on synchronic and diachronic lexico-semantic matters, an issue currently high up in the agenda of Digital Humanities. | ||
| D18-1521 Quantitative and ***** qualitative ***** experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model. | ||
| 2020.acl-main.706 Neural networks lack the ability to reason about ***** qualitative ***** physics and so cannot generalize to scenarios and tasks unseen during training. | ||
| polysemy | 111 | |
| 2016.gwc-1.17 State-of-the-art ***** polysemy ***** approaches identify several ***** polysemy ***** types in WordNet | ||
| L06-1140 However, since a word is determined to belong to only one cluster that represents a concept, Markov clusters cannot show the ***** polysemy ***** or semantic indetermination among the properties of natural language. | ||
| 2020.coling-main.258 A diachronic check using contextualized embeddings with the WordNet Sense Inventory also demonstrates the possible role of the ***** polysemy ***** of lexical roots across diachronic settings. | ||
| Q18-1036 This approach allows us to seamlessly incorporate linguistic intuitions — including ***** polysemy ***** and the existence of multiword lexical items — into our language model | ||
| D17-1118 Our analysis shows that the effects reported in recent literature must be substantially revised: (i) the proposed negative correlation between meaning change and word frequency is shown to be largely an artefact of the models of word representation used; (ii) the proposed negative correlation between meaning change and prototypicality is shown to be much weaker than what has been claimed in prior art; and (iii) the proposed positive correlation between meaning change and ***** polysemy ***** is largely an artefact of word frequency. | ||
| communicative | 111 | |
| 2020.lrec-1.212 Starting from a seed list of labelled formulaic expressions, we retrieved new sentences from scholarly papers in the ACL Anthology and asked multiple human evaluators to label ***** communicative ***** functions. | ||
| W19-4012 The empirical study of these patterns relies on the availability of data about the actual use of argumentation in ***** communicative ***** practice. | ||
| L12-1266 Two maps for the map-tasks and one emergency diapix were designed to prompt semi-spontaneous dialogues simulating stress and natural ***** communicative ***** situations. | ||
| 2020.lrec-1.43 The Common European Framework of Reference for Languages (CEFR) defines six levels of learner proficiency, and links them to particular ***** communicative ***** abilities. | ||
| L04-1196 A unified language for the ***** communicative ***** acts between agents is essential for the design of multi-agent architectures. | ||
| Commonsense | 111 | |
| D19-1282 ***** Commonsense ***** reasoning aims to empower machines with the human ability to make presumptions about ordinary situations in our daily life. | ||
| N19-1094 ***** Commonsense ***** reasoning is fundamental to natural language understanding. | ||
| 2020.coling-main.182 ***** Commonsense ***** generation aims at generating plausible everyday scenario description based on a set of provided concepts | ||
| S18-1175 This paper describes the system we submitted to Task 11 in SemEval 2018, i.e., Machine Comprehension using ***** Commonsense ***** Knowledge. | ||
| 2020.coling-main.467 ***** Commonsense ***** reasoning refers to the ability of evaluating a social situation and acting accordingly. | ||
| linguistics | 111 | |
| 2021.eacl-main.176 We argue that this insight is valuable for multi-task learning, ***** linguistics ***** and interpretability research and can lead to exciting new findings in all three domains. | ||
| 2021.eacl-main.215 Further examining the characteristics that our classifiers rely on, we find that features such as passive voice, animacy and case strongly correlate with classification decisions, suggesting that mBERT does not encode subjecthood purely syntactically, but that subjecthood embedding is continuous and dependent on semantic and discourse factors, as is proposed in much of the functional ***** linguistics ***** literature. | ||
| 2020.cllrd-1.2 Citizen ***** linguistics ***** can help to create language resources and annotate language resources, not only for the improvement of language technologies, such as machine translation but also for the advancement of linguistic research. | ||
| C16-1336 We systematize the properties of such networks and show their relevance for ***** linguistics ***** | ||
| D17-2003 Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and learning of ***** linguistics *****. | ||
| domain shift | 111 | |
| Q13-1035 One is a macro-level analysis that measures how ***** domain shift ***** affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. | ||
| 2021.emnlp-main.463 The paradigm of pre-training followed by finetuning has become a standard procedure for NLP tasks, with a known problem of ***** domain shift ***** between the pre-training and downstream corpus. | ||
| 2021.emnlp-main.669 The systems trained with our approach rely more on the source tokens, are more robust against ***** domain shift ***** and suffer less hallucinations. | ||
| D19-6102 We highlight the differences of these approaches in terms of unlabeled data requirements and capability to overcome additional ***** domain shift ***** in the data. | ||
| 2021.ranlp-1.72 Recently, ***** domain shift *****, which affects accuracy due to differences in data between source and target domains, has become a serious issue when using machine learning methods to solve natural language processing tasks. | ||
| CNNs | 110 | |
| 2020.sdp-1.10 We present DeepPaperComposer, a simple solution for preparing highly accurate (100%) training data without manual labeling to extract content from scholarly articles using convolutional neural networks (***** CNNs *****). | ||
| D18-1444 Convolutional neural networks (***** CNNs *****) have met great success in abstractive summarization, but they cannot effectively generate summaries of desired lengths. | ||
| W19-6140 The taggers all rely on different machine learning mechanisms: decision trees, hidden Markov models (HMMs), conditional random fields (CRFs), long-short term memory networks (LSTMs), and convolutional neural networks (***** CNNs *****). | ||
| W18-5408 ***** CNNs ***** used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs ***** CNNs ***** remain a mystery. | ||
| Q16-1019 We propose three attention schemes that integrate mutual influence between sentences into ***** CNNs *****; thus, the representation of each sentence takes into consideration its counterpart | ||
| ABSA | 110 | |
| N19-1035 Aspect-based sentiment analysis (***** ABSA *****), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA). | ||
| 2021.emnlp-main.727 In this paper, we consider the unsupervised cross-lingual transfer for the ***** ABSA ***** task, where only labeled data in the source language is available and we aim at transferring its knowledge to the target language having no labeled data. | ||
| 2021.naacl-main.146 In this paper, we firstly compare the induced trees from PTMs and the dependency parsing trees on several popular models for the ***** ABSA ***** task, showing that the induced tree from fine-tuned RoBERTa (FT-RoBERTa) outperforms the parser-provided tree. | ||
| 2020.coling-main.19 We demonstrate how basic constituents of emotions can be mapped to the VAD model, along with their interactions respecting the polarized context in ***** ABSA ***** settings using biased key-concepts (e.g., “stop Brexit” vs. “support Brexit”). | ||
| 2020.acl-main.296 Aspect terms extraction and opinion terms extraction are two key problems of fine-grained Aspect Based Sentiment Analysis (***** ABSA *****) | ||
| Wordnet | 110 | |
| 2020.globalex-1.6 Having in mind known issues related to such words in language translation, and further motivated by false friend-related issues on the alignment of a Portuguese wordnet with Princeton ***** Wordnet *****, we aim to widen this discussion, while suggesting preliminary ideas of how wordnets could benefit from this kind of research. | ||
| 2016.gwc-1.36 Our work tries to address this problem by providing an algorithm for automatic building of a frequency based dictionary of noun-CL pairs, mapped to concepts in the Chinese Open ***** Wordnet ***** (Wang and Bond, 2013), an open machine-tractable dictionary for Chinese. | ||
| 2021.emnlp-main.654 When applied to all of WordNet, our model predicts that 1,118 synsets in English ***** Wordnet ***** (1.4%) are BLC, far fewer than existing methods, and with a precision improvement of over 200% over these. | ||
| L08-1040 Cornetto is a two-year Stevin project (project number STE05039) in which a lexical semantic database is built that combines ***** Wordnet ***** with Framenet-like information for Dutch. | ||
| L08-1077 In this paper we describe the construction of an illustrated Japanese ***** Wordnet ***** | ||
| dialects | 110 | |
| 2010.amta-government.5 COLABA enables MSA users to interpret ***** dialects ***** correctly. | ||
| 2020.vardial-1.14 Hence, the dataset is not only a valuable resource for studying the diachronic evolution of Italian and the differences between its ***** dialects *****, but it is also useful to investigate stylistic aspects between single authors. | ||
| W17-8102 This distance, as well as the mutual understanding between the speakers, is the correct criterion for the classification of idioms as different languages, or as ***** dialects *****, or as regional variants close to the standard. | ||
| L16-1517 The case study at hand is that of the curation of 42 fascicles of the Dictionaries of the Brabantic and Limburgian ***** dialects *****, and 6 fascicles of the Dictionary of ***** dialects ***** in Gelderland. | ||
| W19-4632 We firstly build a coarse identification model to classify each sentence into one out of six ***** dialects *****, then use this label as a feature for the fine-grained model that classifies the sentence among 26 ***** dialects ***** from different Arab cities, after that we apply ensemble voting classifier on both sub-systems | ||
| transformers | 110 | |
| 2021.mmsr-1.5 The problem of interpretation of knowledge learned by multi-head self-attention in ***** transformers ***** has been one of the central questions in NLP. | ||
| 2021.wassa-1.27 We find that RobBERT clearly outperforms BERTje, but that directly adding lexicon information to ***** transformers ***** does not improve performance. | ||
| 2021.acl-long.331 We demonstrate that ***** transformers ***** obtain impressive performance even when some of the layers are randomly initialized and never updated. | ||
| 2021.case-1.17 Similarly in the event extraction task, our transformer-LSTM-CRF architecture outperforms regular ***** transformers ***** significantly. | ||
| 2021.acl-long.163 It is a common belief that training deep ***** transformers ***** from scratch requires large datasets | ||
| method | 110 | |
| 2021.naacl-main.353 We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional ***** method *****s applied to this task so far. | ||
| Q14-1005 We propose a new ***** method ***** that projects model expectations rather than labels, which facilities transfer of model uncertainty across language boundaries. | ||
| P17-1078 Results show that such pretraining significantly improves the model, leading to accuracies competitive to the best ***** method *****s on six benchmarks. | ||
| 2020.aacl-main.29 To resolve the cold start problem in training, we propose a ***** method ***** using a pseudo data generator which generates pseudo texts and KB triples for learning an initial model. | ||
| P18-1020 We describe a novel ***** method ***** for efficiently eliciting scalar annotations for dataset construction and system quality estimation by human judgments. | ||
| multilingual machine translation | 110 | |
| 2021.acl-long.21 Existing ***** multilingual machine translation ***** approaches mainly focus on English-centric directions, while the non-English directions still lag behind. | ||
| 2020.emnlp-main.187 Sparse language vectors from linguistic typology databases and learned embeddings from tasks like ***** multilingual machine translation ***** have been investigated in isolation, without analysing how they could benefit from each other's language characterisation. | ||
| 2020.acl-main.754 When training ***** multilingual machine translation ***** (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others. | ||
| 2021.wmt-1.48 In this paper, we focus on the task of ***** multilingual machine translation ***** for African languages and describe our contribution in the 2021 WMT Shared Task: Large-Scale Multilingual Machine Translation. | ||
| 2020.loresmt-1.8 Prior works have demonstrated that a low-resource language pair can benefit from ***** multilingual machine translation ***** (MT) systems, which rely on many language pairs' joint training. | ||
| temporal relation extraction | 110 | |
| 2021.emnlp-main.636 In the second one, we devise an end-to-end architecture composed of hyperbolic neural units tailored for the ***** temporal relation extraction ***** task. | ||
| W18-5607 We explore to adapt the tree-based LSTM-RNN model proposed by Miwa and Bansal (2016) to ***** temporal relation extraction ***** from clinical text, obtaining a five point improvement over the best 2016 Clinical TempEval system and two points over the state-of-the-art. | ||
| 2021.emnlp-main.815 Event time is one of the most important features for event-event ***** temporal relation extraction *****. | ||
| L14-1129 Our goals in this paper are to (1) manually annotate certain types of missing links that cannot be automatically recovered in the i2b2 Clinical Temporal Relations Challenge Corpus, one of the recently released evaluation corpora for ***** temporal relation extraction *****; and (2) empirically determine the usefulness of these additional annotations. | ||
| S17-2098 We used a neural network based approach for entity and ***** temporal relation extraction *****, and experimented with two domain adaptation strategies. | ||
| english wikipedia | 110 | |
| 2021.acl-long.70 We describe a series of experiments that measure usable information by selectively ablating lexical and structural information in transformer language models trained on ***** English Wikipedia *****. | ||
| P19-1436 Moreover, by representing phrases as pointers to their start and end tokens, our model indexes phrases in the entire ***** English Wikipedia ***** (up to 60 billion phrases) using under 2TB. | ||
| 2020.lrec-1.132 This paper describes a corpus of annotated typed lambda calculus translations for approximately 2,000 sentences in Simple ***** English Wikipedia *****, which is assumed to constitute a broad-coverage domain for precise, complex descriptions. | ||
| 2021.eacl-main.201 Pre-training language models on ***** English Wikipedia ***** table data further improves performance. | ||
| 2021.naacl-main.198 We apply this methodology to the ***** English Wikipedia ***** and extract our large-scale WEC-Eng dataset. | ||
| multitask | 109 | |
| N18-1008 We explore ***** multitask ***** models for neural translation of speech, augmenting them in order to reflect two intuitive notions. | ||
| 2020.coling-main.41 In this paper, we explore three ways of leveraging an auxiliary task to shape the latent variable distribution: via pre-training, to obtain an informed prior, and via ***** multitask ***** learning. | ||
| 2020.lrec-1.642 In addition, we show that the accuracy of this parser can be improved by using a ***** multitask ***** learning architecture that makes it possible to train the parser on additional treebanks that use other annotation models. | ||
| 2020.codi-1.17 In particular, starting with a strong baseline neural parser unaware of any coreference information, we compare a parser which utilizes only the output of a neural coreference resolver, with a more sophisticated model, where discourse parsing and coreference resolution are jointly learned in a neural ***** multitask ***** fashion. | ||
| 2021.acl-long.328 We conduct a detailed analysis to understand the impact of the auxiliary task on the primary task within the ***** multitask ***** learning framework | ||
| sequential | 109 | |
| 2020.louhi-1.11 In this work we addressed the problem of capturing ***** sequential ***** information contained in longitudinal electronic health records (EHRs). | ||
| D19-1145 Experimental results on NIST Chinese-to-English and WMT14 English-to-German translation tasks show that the proposed approach consistently boosts performance over both the absolute and relative ***** sequential ***** position representations. | ||
| W18-3217 In specific, we trained character level models and word level models based on Bidirectional LSTMs (Bi-LSTMs) to perform ***** sequential ***** tagging. | ||
| W17-5308 We present a simple ***** sequential ***** sentence encoder for multi-domain natural language inference. | ||
| W16-4911 The proposed fluctuation smoothing approach, based on classical ***** sequential ***** pattern mining, exploits lexical overlap in students' answers to any typical question | ||
| answer | 109 | |
| 2020.coling-main.457 An essential task of most Question Answering (QA) systems is to re-rank the set of ***** answer ***** candidates, i.e., Answer Sentence Selection (AS2). | ||
| 2021.emnlp-main.756 This paper investigates whether models can learn to find evidence from a large corpus, with only distant supervision from ***** answer ***** labels for model training, thereby generating no additional annotation cost. | ||
| D19-5823 We introduce a heuristic extractive version of the data set, which allows us to approach the more feasible problem of ***** answer ***** extraction (rather than generation). | ||
| D19-1511 Thus, we devise two methods to further enhance semantic coherence between post and question under the guidance of ***** answer *****. | ||
| K19-1065 In this paper, we focus on extracting evidence sentences that can explain or support the ***** answer *****s of multiple-choice MRC tasks, where the majority of ***** answer ***** options cannot be directly extracted from reference documents | ||
| texts | 109 | |
| R19-1118 In this paper, we describe a new approach to distant supervision for extracting sentiment attitudes between named entities mentioned in ***** texts *****. | ||
| 2020.lrec-1.530 In this paper, we propose “Event Appearance” labels that show the relationship between events mentioned in ***** texts ***** and those happening in the real world. | ||
| L14-1256 While most existing systems focus on news ***** texts ***** and extract explicit temporal information exclusively, we show that this approach is not feasible for narratives. | ||
| 2020.fnp-1.16 The experiments have shown BERT (Large) performed the best, giving a F1 score of 0.958, in the task of detecting the causality of sentences in financial ***** texts ***** and reports. | ||
| W19-3717 The resulted ***** texts ***** are then converted to vectors by averaging the vectorial representation of words derived from a pre-trained Word2Vec English model | ||
| user | 109 | |
| P19-1359 It is desirable for dialog systems to have the capability to express specific emotions during a conversation, which has a direct, quantifiable impact on improvement of their usability and ***** user ***** satisfaction. | ||
| 2021.eval4nlp-1.9 Learning authors' representations from their textual productions is now widely used to solve multiple downstream tasks, such as classification, link prediction or ***** user ***** recommendation. | ||
| 2021.sigdial-1.23 Dialogue State Tracking (DST) is a sub-task of task-based dialogue systems where the ***** user ***** intention is tracked through a set of (domain, slot, slot-value) triplets. | ||
| 2020.coling-demos.7 We demonstrate the functionalities of the new ***** user ***** interface for CogniVal. | ||
| 2020.aacl-main.89 Explainable recommendation is a good way to improve ***** user ***** satisfaction. | ||
| syntactically | 108 | |
| L14-1704 We also investigate the influence of further linguistic factors, such as the ambiguity and the overall frequency of the verbs and a ***** syntactically ***** separate occurrences of verbs and particles that causes difficulties for the correct lemmatization of Particle Verbs. | ||
| L10-1230 The converted corpus allows for a simultaneous search for ***** syntactically ***** specified discourse information based on the XQuery standard, which is illustrated with a simple example in the article. | ||
| D18-1113 We study the automatic generation of syntactic paraphrases using four different models for generation: data-to-text generation, text-to-text generation, text reduction and text expansion. We derive training data for each of these tasks from the WebNLG dataset and we show (i) that conditioning generation on syntactic constraints effectively permits the generation of ***** syntactically ***** distinct paraphrases for the same input and (ii) that exploiting different types of input (data, text or data+text) further increases the number of distinct paraphrases that can be generated for a given input. | ||
| 2020.coling-main.357 English verb alternations allow participating verbs to appear in a set of ***** syntactically ***** different constructions whose associated semantic frames are systematically related. | ||
| 2020.lrec-1.542 In contrast to most previous compositionality datasets we also consider ***** syntactically ***** complex constructions and publish a formal specification of each expression | ||
| Grammatical | 108 | |
| D19-1435 We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like ***** Grammatical ***** error correction (GEC). | ||
| L12-1319 The Berkeley FrameNet Project (BFN, https://framenet.icsi.berkeley.edu/fndrupal/) created descriptions of 73 non-core grammatical constructions, annotation of 50 of these constructions and about 1500 example sentences in its one-year project Beyond the Core: A Pilot Project on Cataloging ***** Grammatical ***** Constructions and Multiword Expressions in English supported by the National Science Foundation. | ||
| L14-1060 We present the creation of an English-Swedish FrameNet-based grammar in ***** Grammatical ***** Framework. | ||
| 2021.sigtyp-1.9 ***** Grammatical ***** gender may be determined by semantics, orthography, phonology, or could even be arbitrary. | ||
| L10-1172 ***** Grammatical ***** approaches to language technology are often considered less optimal than statistical approaches in multilingual settings, where large-scale portability becomes an important issue. | ||
| visual question answering | 108 | |
| W19-1808 Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as ***** visual question answering ***** and multimodal machine translation. | ||
| W19-4806 Focusing on the FiLM ***** visual question answering ***** model, our experiments indicate that a form of approximate number system emerges whose performance declines with more difficult scenes as predicted by Weber's law. | ||
| 2021.acl-long.564 However, we uncover a striking contrast to this promise: across 5 models and 4 datasets on the task of ***** visual question answering *****, a wide variety of active learning approaches fail to outperform random selection. | ||
| 2021.eacl-main.240 In fact, this is the case with most existing ***** visual question answering ***** (VQA) datasets where they assume only one ground-truth answer for each question. | ||
| 2021.emnlp-main.517 Knowledge-based ***** visual question answering ***** (VQA) requires answering questions with external knowledge in addition to the content of images. | ||
| grammatical error | 108 | |
| 2014.amta-wptp.1 The PE output was analyzed taking into account accuracy errors (mistranslations and omissions) as well as language errors (***** grammatical error *****s and syntax errors). | ||
| W17-5907 Detection and correction of Chinese ***** grammatical error *****s have been two of the major challenges for Chinese automatic ***** grammatical error ***** diagnosis. This paper presents an N-gram model for automatic detection and correction of Chinese ***** grammatical error *****s in the NLPTEA 2017 task. | ||
| 2020.lrec-1.835 The lack of large-scale datasets has been a major hindrance to the development of NLP tasks such as spelling correction and ***** grammatical error ***** correction (GEC). | ||
| 2021.ranlp-1.94 The proposed IP approach optimizes the selection of a single best system for each ***** grammatical error ***** type present in the data. | ||
| 2020.acl-srw.5 Recently, several studies have focused on improving the performance of ***** grammatical error ***** correction (GEC) tasks using pseudo data. | ||
| discriminator | 107 | |
| D19-1027 A byproduct of the ***** discriminator ***** is that the features generated by the learned ***** discriminator ***** network allow the visualization of the extracted events. | ||
| D17-1230 In this generative adversarial network approach, the outputs from the ***** discriminator ***** are used to encourage the system towards more human-like dialogue. | ||
| 2021.emnlp-main.24 Our approach has three characteristics: 1) the generator automatically generates massive and diverse antonymous sentences; 2) the ***** discriminator ***** contains an original-side sentiment predictor and an antonymous-side sentiment predictor, which jointly evaluate the quality of the generated sample and help the generator iteratively generate higher-quality antonymous samples; 3) the ***** discriminator ***** is directly used as the final sentiment classifier without the need to build an extra one. | ||
| W19-2301 We also explore two approaches to accomplish the conditional ***** discriminator *****: (1) phredGANa, a system that passes the attribute representation as an additional input into a traditional adversarial ***** discriminator *****, and (2) phredGANd, a dual ***** discriminator ***** system which in addition to the adversarial ***** discriminator *****, collaboratively predicts the attribute(s) that generated the input utterance. | ||
| 2021.naacl-main.298 In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a ***** discriminator ***** and (2) the inclusion of a novel triplet-loss term | ||
| augmented | 107 | |
| 2021.acl-short.79 Specifically, we perform extensive analysis to measure the efficacy of few-shot approaches ***** augmented ***** with automatic translations and permutations of context-question-answer pairs. | ||
| C16-1047 The sentiment ***** augmented ***** optimized vector obtained at the end is used for the training of SVM for sentiment classification. | ||
| P19-1154 To alleviate such inconveniences, we propose a neural P2C conversion model ***** augmented ***** by an online updated vocabulary with a sampling mechanism to support open vocabulary learning during IME working. | ||
| 2020.emnlp-main.113 We further utilize this ***** augmented ***** data for pretraining and leverage it for the task of disfluency detection. | ||
| 2020.conll-1.50 We find that a highly ***** augmented ***** model shows highest accuracy in predicting held-out forms, and investigate other properties of interest learned by our models' representations | ||
| OOV | 107 | |
| 2020.emnlp-main.631 In our paper, we study a pre-trained multilingual BERT model and analyze the ***** OOV ***** rate on downstream tasks, how it introduces information loss, and as a side-effect, obstructs the potential of the underlying model. | ||
| 2020.lrec-1.492 Knowing the constituent structure of a word form makes it possible to generate the optimal split for a given task, e.g., a full split for subword tokenization, or, in the case of part-of-speech tagging, splitting an ***** OOV ***** word until the largest known morphological head is found. | ||
| 2020.osact-1.15 To deal with the issues associated with ***** OOV *****, we generated a character-level embeddings model, which was trained on a massive data collected carefully. | ||
| 2010.iwslt-papers.6 Combining all these processes, we reduced 10% of the singletons, 2% ***** OOV ***** words, and obtained 1.5 absolute (7% relative) BLEU improvement on the WMT 2010 German to English News translation task. | ||
| L08-1044 Then, we demonstrate the relevance of the Web for the ***** OOV ***** word retrieval | ||
| extrinsic | 107 | |
| N18-1148 In this paper, we identify and differentiate between two relevant data generating scenarios (intrinsic vs. ***** extrinsic ***** labels), introduce a simple but novel method which emphasizes the importance of calibration, and then analyze and experimentally validate the appropriateness of various methods for each of the two scenarios. | ||
| 2020.coling-main.416 These proof-of-concept results reveal the potential of emergent communication pretraining for both natural language processing tasks in resource-poor settings and ***** extrinsic ***** evaluation of artificial languages. | ||
| 2020.coling-main.106 We evaluate word prisms in comparison to other meta-embedding methods on six ***** extrinsic ***** evaluations and observe that word prisms offer improvements in performance on all tasks. | ||
| 2021.emnlp-demo.23 Finally, we present an innovative cross-lingual word-guessing game as an ***** extrinsic ***** evaluation metric to measure end-to-end system performance. | ||
| N19-1395 We present an alternative, ***** extrinsic *****, evaluation metric for this task, Answering Performance for Evaluation of Summaries | ||
| Vector | 107 | |
| N19-1033 The ***** Vector ***** of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated to the respective codeword. | ||
| S18-1061 Our best performing system used a Bag of Words model with a Linear Support ***** Vector ***** Machine as its classifier. | ||
| 2021.calcs-1.9 We experiment with word uni-grams, word n-grams, character n-grams, Viterbi Decoding, Latent Dirichlet Allocation, Support ***** Vector ***** Machine and Logistic Regression. | ||
| I17-4024 It is based on a traditional Support ***** Vector ***** Machine classifier exploiting multilingual word embeddings and character n-grams. | ||
| L16-1723 In this paper, we claim that *****Vector***** Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting such intersection according to the rank of the shared contexts in the dependency ranked lists. | ||
| Contextual | 107 | |
| D18-1179 ***** Contextual ***** word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks | ||
| 2020.emnlp-main.504 *****Contextual***** embeddings have proven to be overwhelmingly effective for the task of Word Sense Disambiguation (WSD) compared with other sense representation techniques. | ||
| 2020.coling-main.326 *****Contextual***** embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years. | ||
| 2020.acl-main.734 *****Contextual***** features always play an important role in Chinese word segmentation (CWS). | ||
| N18-1196 *****Contextual***** influences on language often exhibit substantial cross-lingual regularities; for example, we are more verbose in situations that require finer distinctions. | ||
| Distributional | 107 | |
| 2021.emnlp-main.654 For both English and Mandarin, we test three methods of generating such features for any synset within Wordnet (WN): extraction of textual features from Wikipedia pages, ***** Distributional ***** Memory (DM) and BART. | ||
| W19-0410 ***** Distributional ***** semantics has had enormous empirical success in Computational Linguistics and Cognitive Science in modeling various semantic phenomena, such as semantic similarity, and distributional models are widely used in state-of-the-art Natural Language Processing systems. | ||
| W16-4102 In this paper, we introduce for the first time a *****Distributional***** Model for computing semantic complexity, inspired by the general principles of the Memory, Unification and Control framework (Hagoort, 2013; Hagoort, 2016). | ||
| K18-1026 *****Distributional***** models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. | ||
| L14-1496 *****Distributional***** thesauri have been applied for a variety of tasks involving semantic relatedness. | ||
| Metaphor | 107 | |
| L10-1574 The relationship between the abstract and the concrete, which is at the basis of the Conceptual ***** Metaphor ***** perspective, can be considered strictly related to the variation of the ontological values found in our analysis of the PNs and their belonging classes which are codified in the ItalWordNet database. | ||
| W18-0907 We report on the shared task on metaphor identification on the VU Amsterdam ***** Metaphor ***** Corpus conducted at the NAACL 2018 | ||
| 2020.lrec-1.726 We present new results on *****Metaphor***** Detection by using text from visual datasets. | ||
| K17-1006 *****Metaphor***** detection has been both challenging and rewarding in natural language processing applications. | ||
| 2020.figlang-1.16 This paper reports a linguistically-enriched method of detecting token-level metaphors for the second shared task on *****Metaphor***** Detection. | ||
| emoji prediction | 107 | |
| S18-1104 The CNN is trained on the ***** emoji prediction ***** task. | ||
| D17-1169 Through ***** emoji prediction ***** on a dataset of 1246 million tweets containing one of 64 common emojis we obtain state-of-the-art performance on 8 benchmark datasets within emotion, sentiment and sarcasm detection using a single pretrained model. | ||
| 2020.findings-emnlp.148 Each year, new shared tasks and datasets are proposed, ranging from classics like sentiment analysis to irony detection or ***** emoji prediction *****. | ||
| N18-2107 In this paper we extend recent advances in ***** emoji prediction ***** by putting forward a multimodal approach that is able to predict emojis in Instagram posts. | ||
| W19-1311 Through the human ratings that we obtained, we also argue for preference metric to better evaluate the usefulness of an ***** emoji prediction ***** system. | ||
| IR | 106 | |
| L10-1242 Several baseline ***** IR ***** experiments report the effect of using video-associated metadata on retrieval effectiveness. | ||
| P19-1612 We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from ***** IR *****. | ||
| 2020.acl-main.652 The good results carry over into the more challenging ***** IR ***** scenario. | ||
| C18-1025 The latter idea accounts better for the ***** IR ***** task and is favored by recent research works, which is the one we will follow in this paper. | ||
| 2021.eacl-main.47 We propose to extend the ***** IR ***** approach by treating the problem as an instance of positive-unlabeled (PU) learning—i.e., learning binary classifiers from only positive (the query documents) and unlabeled (the results of the ***** IR ***** engine) data | ||
| Sequence | 106 | |
| W18-2321 ***** Sequence ***** labeling of biomedical entities, e.g., side effects or phenotypes, was a long-term task in BioNLP and MedNLP communities. | ||
| D19-1626 In this work, we propose an aggregation method to combine the Bidirectional Encoder Representations from Transformer (BERT) with a MatchLSTM layer for *****Sequence***** Matching. | ||
| 2021.emnlp-main.807 *****Sequence***** models are a critical component of modern NLP systems, but their predictions are difficult to explain. | ||
| 2020.acl-main.193 *****Sequence***** labeling is a fundamental task for a range of natural language processing problems. | ||
| D18-1238 *****Sequence***** encoders are crucial components in many neural architectures for learning to read and comprehend. | ||
| SQL | 106 | |
| N18-2093 This requires a system that understands users' questions and converts them to ***** SQL ***** queries automatically. | ||
| P19-1444 Finally, IRNet deterministically infers a ***** SQL ***** query from the synthesized SemQL query with domain knowledge. | ||
| 2020.coling-main.33 To effectively capture historical information of ***** SQL ***** query and reuse the previous ***** SQL ***** query tokens, we use a hybrid pointer-generator network as decoder to copy tokens from the previous ***** SQL ***** query via pointer, the generator part is utilized to generate new tokens. | ||
| D18-1193 We evaluate Syntax***** SQL *****Net on a new large-scale text-to-***** SQL ***** corpus containing databases with multiple tables and complex ***** SQL ***** queries containing multiple ***** SQL ***** clauses and nested queries. | ||
| P19-1448 In spider, a recently-released text-to-***** SQL ***** dataset, new and complex DBs are given at test time, and so the structure of the DB schema can inform the predicted ***** SQL ***** query | ||
| word analogy | 106 | |
| P17-1007 In recent years word-embedding models have gained great popularity due to their remarkable performance on several tasks, including *****word analogy***** questions and caption generation. | ||
| R19-1147 Second, for evaluation, we analyse the quality of pre-trained embeddings using an input *****word analogy***** list. | ||
| 2020.lrec-1.310 We present a new *****word analogy***** test set considering the original English Word2vec analogy test set and some specific linguistic aspects of the Greek language as well. | ||
| 2020.lrec-1.501 We present a collection of such datasets for the *****word analogy***** task in nine languages: Croatian, English, Estonian, Finnish, Latvian, Lithuanian, Russian, Slovenian, and Swedish. | ||
| P17-1187 We conduct experiments on two tasks including word similarity and *****word analogy*****, and our models significantly outperform baselines. | ||
| hyperpartisan news | 106 | |
| S19-2183 This paper describes our system for the SemEval 2019 Task 4 on *****hyperpartisan news***** detection. | ||
| S19-2158 This paper summarizes our contribution to the *****Hyperpartisan News***** Detection task in SemEval 2019. | ||
| S19-2164 We use various natural processing and machine learning methods to perform the *****Hyperpartisan News***** Detection task. | ||
| S19-2185 This paper describes our system submitted to the formal run of SemEval-2019 Task 4: *****Hyperpartisan news***** detection. | ||
| S19-2165 We investigate the recently developed Bidirectional Encoder Representations from Transformers (BERT) model (Devlin et al. 2018) for the *****hyperpartisan news***** detection task. | ||
| typology | 105 | |
| 2020.lrec-1.709 We use the ***** typology ***** to annotate a corpus of 520 sentence pairs in English and we demonstrate that unlike previous typologies, SHARel can be applied to all relations of interest with a high inter-annotator agreement. | ||
| 2020.eamt-1.17 In this study, we designed an error ***** typology ***** based on the error types that were typically generated by NMT systems and might cause significant impact in technical translations: “Addition,” “Omission,” “Mistranslation,” “Grammar,” and “Terminology.” | ||
| 2021.wnut-1.42 In particular, we focus on language pairs where transfer learning is difficult for mBERT: those where source and target languages are different in script, vocabulary, and linguistic ***** typology *****. | ||
| D18-1468 In the experiments, we analyze the features of linguistic ***** typology *****, with a special focus on the order of subject, object and verb. | ||
| 2021.sigtyp-1.2 Research in linguistic ***** typology ***** has shown that languages do not fall into the neat morphological types (synthetic vs. analytic) postulated in the 19th century | ||
| GLUE | 105 | |
| 2021.emnlp-main.831 We demonstrate that through a combination of software optimizations, design choices, and hyperparameter tuning, it is possible to produce models that are competitive with BERT-base on ***** GLUE ***** tasks at a fraction of the original pretraining cost. | ||
| 2021.acl-short.27 We show that the token representations and self-attention activations within BERT are surprisingly resilient to shuffling the order of input tokens, and that for several ***** GLUE ***** language understanding tasks, shuffling only minimally degrades performance, e.g., by 4% for QNLI. | ||
| 2021.emnlp-main.232 Different from most prior work that focuses on a particular modality, comprehensive empirical evidence on 11 natural language understanding and cross-modal tasks illustrates that CAPT is applicable for both language and vision-language tasks, and obtains surprisingly consistent improvement, including 0.6% absolute gain on ***** GLUE ***** benchmarks and 0.8% absolute increment on NLVR2. | ||
| 2020.acl-main.747 Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the ***** GLUE ***** and XNLI benchmarks. | ||
| 2021.acl-long.86 On the ***** GLUE ***** test set our 6 layer RoBERTa based model outperforms BERT-large | ||
| visualization | 105 | |
| L12-1368 ANALEC allows researchers to dynamically build their own annotation scheme and use the possibilities of scheme revision, data querying and graphical ***** visualization ***** during the annotation process. | ||
| 2020.acl-main.492 This has motivated the development of methods for interpreting such models, e.g., via gradient-based saliency maps or the ***** visualization ***** of attention weights. | ||
| W19-2011 We use ***** visualization ***** and nearest neighbor analysis to show that better encoding of entity-type and relational information leads to this superiority. | ||
| 2021.emnlp-main.543 Such information is even more important for story ***** visualization ***** since its inputs have an explicit narrative structure that needs to be translated into an image sequence (or visual story) | ||
| 2021.acl-demo.9 We present MT-Telescope, a *****visualization***** platform designed to facilitate comparative analysis of the output quality of two Machine Translation (MT) systems. | ||
| morphological analysis | 105 | |
| I17-1094 The experimental results show that our method can extract widely covered variants from large Twitter data and improve the recall of normalization without degrading the overall accuracy of Japanese ***** morphological analysis *****. | ||
| L08-1297 We have performed a set of experiments made to investigate the utility of ***** morphological analysis ***** to improve retrieval of documents written in languages with relatively large morphological variation in a practical commercial setting, using the SiteSeeker search system developed and marketed by Euroling Ab. | ||
| L14-1056 With the help of better ***** morphological analysis *****, we present the best labelled dependency parsing scores to date on Turkish. | ||
| L12-1411 Our tool produces lemmatisation and ***** morphological analysis ***** reaching accuracy that is considerably higher compared to the existing alternative tools: 83.6% relative error reduction on lemmatisation and 8.1% relative error reduction on ***** morphological analysis *****. | ||
| 2004.amta-papers.23 For these reasons, we can consider that we can translate Japanese into Uighur in such a manner as word-by-word aligning after ***** morphological analysis ***** of the input sentences without complicated syntactical analysis. | ||
| dialog state tracking | 105 | |
| W19-5932 We formulate ***** dialog state tracking ***** as a reading comprehension task to answer the question what is the state of the current dialog? | ||
| 2021.naacl-main.206 Experiments on two NLP applications: few-shot text classification and multi-domain ***** dialog state tracking ***** demonstrate the superior performance of our proposed method. | ||
| 2020.inlg-1.35 To generate a dataset for SG-NLG we re-purpose an existing dataset for another task: ***** dialog state tracking *****, which includes a large and rich schema spanning multiple different attributes, including information about the domain, user intent, and slot descriptions. | ||
| 2021.teachingnlp-1.12 This paper describes a class project for a recently introduced undergraduate NLP course that gives computer science students the opportunity to explore the data of *****Dialog State Tracking***** Challenge 2 (DSTC 2). | ||
| N19-1372 In experiments on the benchmark dataset used in *****Dialog State Tracking***** Challenge 4, the proposed models achieved significantly higher F1 scores than the state-of-the-art contextual models. | ||
| learners | 104 | |
| W18-0534 The dataset includes 1,868 student essays written by ***** learners ***** of European Portuguese, native speakers of the following L1s: Chinese, English, Spanish, German, Russian, French, Japanese, Italian, Dutch, Tetum, Arabic, Polish, Korean, Romanian, and Swedish. | ||
| 2020.lrec-1.595 We compared the performances of classical machine ***** learners ***** where features comprised sentence representations obtained from a pretrained embedding model (Universal Sentence Encoder) vs. neural classifiers in which sentence embedding vector representations are adapted or fine-tuned while training for the absorption recognition task. | ||
| L16-1511 We present the COPLE2 corpus, a learner corpus of Portuguese that includes written and spoken texts produced by ***** learners ***** of Portuguese as a second or foreign language. | ||
| 2020.lrec-1.38 The experiment allowed to gather more than 12,000 answers from ***** learners ***** on different question types. | ||
| 2020.nlptea-1.4 This paper presents the NLPTEA 2020 shared task for Chinese Grammatical Error Diagnosis (CGED) which seeks to identify grammatical error types, their range of occurrence and recommended corrections within sentences written by ***** learners ***** of Chinese as a foreign language | ||
| semantic annotation | 104 | |
| L10-1270 Different from existing formalizations, our purpose is to extend ontologies by ***** semantic annotation ***** rules whose complexity increases along two dimensions: the linguistic complexity and the rule syntactic complexity. | ||
| L16-1629 This work explores new approaches to detecting inconsistency in ***** semantic annotation *****. | ||
| L08-1188 In this work we present a novel web browser extension which combines several features coming from the worlds of terminology and information extraction, ***** semantic annotation ***** and knowledge management, to support users in the process of both keeping track of interesting information they find on the web, and organizing its associated content following knowledge representation standards offered by the Semantic Web | ||
| L08-1186 It introduces a ***** semantic annotation ***** scheme for spoken information access requests, specifically derived from Question Answering (QA) research. | ||
| W19-3301 Developers of cross-linguistic ***** semantic annotation ***** schemes face a number of issues not encountered in monolingual annotation. | ||
| french | 104 | |
| L10-1116 We set out our system and we evaluate how much time can be saved when looking for a sign in a ***** french ***** sign language video. | ||
| L06-1387 Experiments have been carried out on the corpus used during ESTER, the ***** french ***** evaluation campaign. | ||
| W19-2908 The Evolex project aims at proposing a new data-based inductive method for automatically characterising the relation between pairs of ***** french ***** words collected in psycholinguistics experiments on lexical access. | ||
| C16-2023 In this paper, we introduce an original Python implementation of datetime resolution in ***** french *****, which we make available as open-source library. | ||
| L10-1117 In this paper, we apply a standard extraction procedure to a 100 millions words parsed corpus of ***** french ***** and obtain rather poor results. | ||
| active learning | 104 | |
| C16-2028 TextPro-AL is a web-based application integrating four components: a machine learning based NLP pipeline, an annotation editor for task definition and text annotations, an incremental re-training procedure based on ***** active learning ***** selection from a large pool of unannotated data, and a graphical visualization of the learning status of the system. | ||
| 2021.eacl-main.145 We conduct an extensive empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the ***** active learning ***** framework and find the best combinations for different types of models. | ||
| K18-1015 We study the application of ***** active learning ***** techniques to the translation of unbounded data streams via interactive neural machine translation. | ||
| L10-1447 We also explore ***** active learning ***** techniques to determine the suitable size for a corpus of questions in order to achieve adequate accuracy while minimizing the annotation efforts. | ||
| 2021.ranlp-1.26 This paper presents an *****active learning***** approach that aims to reduce the human effort required during the annotation of natural language corpora composed of entities and semantic relations. | ||
| neural language | 104 | |
| W19-4819 Here we present a suite of experiments probing whether ***** neural language ***** models trained on linguistic data induce these stack-like data structures and deploy them while incrementally predicting words. | ||
| 2021.ranlp-1.48 Character-aware ***** neural language ***** models can capture the relationship between words by exploiting character-level information and are particularly effective for languages with rich morphology. | ||
| 2021.acl-short.19 Our results indicate that saliency could be a cognitively more plausible metric for interpreting ***** neural language ***** models. | ||
| 2020.acl-main.47 We examine a methodology using ***** neural language ***** models (LMs) for analyzing the word order of language. | ||
| 2021.nodalida-main.1 In this paper, we introduce a simple, fully automated pipeline for creating language-specific BERT models from Wikipedia data and introduce 42 new such models, most for languages up to now lacking dedicated deep ***** neural language ***** models. | ||
| complex word identification | 104 | |
| W18-0540 Our results show that deep neural networks are able to perform as well as traditional machine learning methods using manually engineered features for the task of ***** complex word identification ***** in English. | ||
| C18-1292 The system trains a ***** complex word identification ***** model on this set, and then applies the model to find texts that contain the desired proportion of new, challenging, and familiar vocabulary. | ||
| N18-1019 We propose a ***** complex word identification ***** (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. | ||
| W17-5910 This paper revisits the problem of ***** complex word identification ***** (CWI) following up the SemEval CWI shared task. | ||
| W18-0537 In this paper, we describe our experiments for the Shared Task on *****Complex Word Identification***** (CWI) 2018 (Yimam et al., 2018), hosted by the 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at NAACL 2018. | ||
| knowledge base completion | 104 | |
| 2021.eacl-main.217 Once pre-trained, these models become applicable to multiple entity-centric tasks such as ranked retrieval, ***** knowledge base completion *****, question answering, and more. | ||
| P17-1088 We evaluate ITransF on two benchmark datasets—WN18 and FB15k for ***** knowledge base completion ***** and obtains improvements on both the mean rank and Hits@10 metrics, over all baselines that do not use additional information. | ||
| P18-2013 State-of-the-art ***** knowledge base completion ***** (KBC) models predict a score for every known or unknown fact via a latent factorization over entity and relation embeddings. | ||
| E17-2083 In this paper we present a cross-lingual extension of a neural tensor network model for ***** knowledge base completion *****. | ||
| D19-5320 In this paper we discuss a simple approach to automatically build and rank paths between a source and target entity pair with learned embeddings using a ***** knowledge base completion ***** model (KBC). | ||
| opinion | 104 | |
| L10-1465 This paper describes an annotation scheme for argumentation in ***** opinion *****ated texts such as newspaper editorials, developed from a corpus of approximately 500 English texts from Nepali and international newspaper sources. | ||
| 2021.eacl-main.229 Since the number of reviews for each target can be prohibitively large, neural network-based methods follow a two-stage approach where an extractive step first pre-selects a subset of salient ***** opinion *****s and an abstractive step creates the summary while conditioning on the extracted subset. | ||
| D18-1403 We present a neural framework for ***** opinion ***** summarization from online product reviews which is knowledge-lean and only requires light supervision (e.g., in the form of product domain labels and user-provided ratings). | ||
| 2021.nodalida-main.16 This article studies register classification of documents from the unrestricted web, such as news articles or ***** opinion ***** blogs, in a multilingual setting, exploring both the benefit of training on multiple languages and the capabilities for zero-shot cross-lingual transfer. | ||
| D19-1342 If a real-world sentiment classification system ignores the existence of conflict ***** opinion *****s when it is designed, it will incorrectly mix conflict ***** opinion *****s into other sentiment polarity categories in action. | ||
| conditioned | 103 | |
| P19-1082 In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code ***** conditioned ***** on the class environment. | ||
| D17-1239 Recent neural models have shown significant progress on the problem of generating short descriptive texts ***** conditioned ***** on a small number of database records. | ||
| 2021.naacl-main.392 The 3 tasks jointly optimize the same pre-trained Transformer – ***** conditioned ***** dialogue generation task on the labeled dialogue data, ***** conditioned ***** language encoding task and ***** conditioned ***** language generation task on the labeled text data. | ||
| 2020.emnlp-main.22 Recently, advances in neural language models (LMs) enable us to directly generate cover text ***** conditioned ***** on the secret message. | ||
| C18-1089 The task of data-to-text generation aims to generate descriptive texts ***** conditioned ***** on a number of database records, and recent neural models have shown significant progress on this task | ||
| weights | 103 | |
| N19-1357 For example, learned attention ***** weights ***** are frequently uncorrelated with gradient-based measures of feature importance, and one can identify very different attention distributions that nonetheless yield equivalent predictions. | ||
| 2020.acl-main.312 Many existing approaches focused on examining whether the local attention ***** weights ***** could reflect the importance of input representations. | ||
| 2020.acl-main.385 We propose two methods for approximating the attention to input tokens given attention ***** weights *****, attention rollout and attention flow, as post hoc methods when we use attention ***** weights ***** as the relative relevance of the input tokens. | ||
| 2021.acl-long.103 We increase the attention ***** weights ***** assigned to the indispensable tokens, whose removal leads to a dramatic performance decrease. | ||
| E17-1003 These novel architectures differ from standard approaches in that they use external resources to compute attention ***** weights ***** and preserve sequence information | ||
| OCR | 103 | |
| 2021.emnlp-main.680 The goal of this study is to understand if a pretrained language model (LM) can be used in an unsupervised way to reconcile the different ***** OCR ***** views such that their combination contains fewer errors than each individual view. | ||
| L16-1155 The crowdsourced ***** OCR ***** gold standard and the corresponding original ***** OCR ***** recognition results from Abby FineReader 7 for each page are available as a resource. | ||
| 2021.bionlp-1.19 document triage) and biomedical expression ***** OCR *****. | ||
| 2021.ranlp-1.164 Two steps are usually taken to correct ***** OCR ***** errors: detection and correction | ||
| L12-1623 In this paper we deal with named entity detection on data acquired via an *****OCR***** process on documents dating from 1890. | ||
| synonyms | 103 | |
| W18-3003 (ii) Transferring semantic knowledge by averaging each representative values of ***** synonyms ***** and filling them in the expanded dimension(s). | ||
| P18-2056 We illustrate the interest of Pseudofit for acquiring ***** synonyms ***** and study several variants of Pseudofit according to this perspective. | ||
| C18-1218 The ***** synonyms ***** come from an existing lexical network and they have been semantically disambiguated and refined. | ||
| 2021.acl-short.71 Discrimination between antonyms and ***** synonyms ***** is an important and challenging NLP task | ||
| W17-1915 This paper compares two approaches to word sense disambiguation using word embeddings trained on unambiguous ***** synonyms *****. | ||
| tagset | 103 | |
| L14-1565 These text types contain linguistic phenomena that are missing from or are only suboptimally covered by STTS; in a community effort, German NLP researchers have therefore proposed additions to and modifications of the ***** tagset ***** that will handle these phenomena more appropriately. | ||
| L08-1347 The paper describes some of the background motivation, the ODL language itself, and concludes with a short example of how lexical values expressed in ODL can be mapped to an existing ***** tagset ***** together with some speculations about future work. | ||
| 2020.emnlp-main.454 We experiment with a multi-label dataset of movie synopses and a ***** tagset ***** representing various attributes of stories (e.g., genre, type of events). | ||
| L12-1552 While annotating this richer ***** tagset ***** is more complicated than annotating the base ***** tagset *****, it is much easier than annotating treebank data. | ||
| L10-1153 In addition to an accuracy metric capturing the internal quality of a ***** tagset *****, we introduce a way to evaluate the external quality of ***** tagset ***** mappings so that we can ensure that the mapping retains linguistically important information from the original ***** tagset ***** | ||
| Textual | 103 | |
| 2021.eacl-main.122 ***** Textual ***** information extraction is a typical research topic in the NLP community. | ||
| 2020.cogalex-1.10 *****Textual***** definitions constitute a fundamental source of knowledge when seeking the meaning of words, and they are the cornerstone of lexical resources like glossaries, dictionaries, encyclopedias or thesauri. | ||
| C16-1199 *****Textual***** information is of critical importance for automatic user classification in social media. | ||
| L10-1450 *****Textual***** information is an important communication medium containing rich expressions of emotion, and emotion recognition on text has wide applications. | ||
| 2020.fever-1.3 *****Textual***** patterns (e.g., Country's president Person) are specified and/or generated for extracting factual information from unstructured data. | ||
| argument | 103 | |
| W16-4903 The Score Assignment phase trains models to classify relations between ***** argument ***** units (Support, Attack or Neutral). | ||
| 2021.sigdial-1.40 We discuss the connection between ***** argument ***** structure and check-worthy statements and develop several baseline models for detecting check-worthy statements in the climate change domain. | ||
| W17-5115 The segmentation of an ***** argument *****ative text into ***** argument ***** units and their non-***** argument *****ative counterparts is the first step in identifying the ***** argument *****ative structure of the text. | ||
| C18-1046 (1) We mine the most correlated word pairs from two discourse ***** argument *****s to model pair specific clues, and integrate them as interactive attention into ***** argument ***** representations produced by the bidirectional long short-term memory network. | ||
| 2021.argmining-1.4 This paper shows how to integrate images into ***** argument ***** mining research, specifically into ***** argument ***** retrieval | ||
| image | 103 | |
| 2020.coling-main.210 While novel metrics are proposed every year, a few popular metrics remain as the de facto metrics to evaluate tasks such as ***** image ***** captioning and machine translation, despite their known limitations. | ||
| 2021.semeval-1.7 The evaluation results for the third subtask confirmed the importance of both modalities, the text and the ***** image *****. | ||
| E17-1019 Moreover, we explore the utilization of the recently proposed Word Mover's Distance (WMD) document metric for the purpose of ***** image ***** captioning. | ||
| 2020.semeval-1.159 To utilise both text and ***** image ***** data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for positive, negative and neutral category predictions. | ||
| 2021.dravidianlangtech-1.43 As memes are in ***** image *****s forms with embedded text, it can quickly spread hate, offence and violence. | ||
| frame semantic | 103 | |
| 2020.acl-main.83 To bridge the gap, we proposed a novel Frame-based Sentence Representation (FSR) method, which employs ***** frame semantic ***** knowledge to facilitate sentence modelling. | ||
| L06-1195 In this paper we discuss the annotation framework (***** frame semantic *****s) and its cross-lingual applicability, problems arising from exhaustive annotation, strategies for quality control, and possible applications. | ||
| W19-8704 We propose a metric for machine translation evaluation based on ***** frame semantic *****s which does not require the use of reference translations or human corrections, but is aimed at comparing original and translated output directly. | ||
| W16-4412 In this paper, we propose a ***** frame semantic *****s-based semantic parsing approach as KB-independent question pre-processing. | ||
| W19-4514 We also propose a verb-centric ***** frame semantic *****s with an effective set of semantic roles in order to support the analysis. | ||
| inferring | 102 | |
| 2020.calcs-1.2 Natural Language Inference (NLI) is the task of ***** inferring ***** the logical relationship, typically entailment or contradiction, between a premise and hypothesis. | ||
| W18-5202 Argument Mining (AM) is a relatively recent discipline, which concentrates on extracting claims or premises from discourses, and ***** inferring ***** their structures. | ||
| D18-1004 To answer this, we create (i) a new dataset and method for identifying supportive replies and (ii) new methods for ***** inferring ***** gender from text and name. | ||
| Q18-1015 Second, our novel taxonomy guided, submodular, active learning method for collecting annotations about rare entities (e.g., oriole, a bird) is 6x more effective at ***** inferring ***** further new facts about them than multiple active learning baselines. | ||
| P18-1076 Instead of relying only on document-to-question interaction or discrete features as in prior work, our model attends to relevant external knowledge and combines this knowledge with the context representation before ***** inferring ***** the answer. | ||
| lemma | 102 | |
| L10-1413 Also ***** lemma ***** information was used to introduce new factors to the corpora and to use this information for better word alignment or for alternative path back-off translation. | ||
| Q14-1019 But while the two tasks are pretty similar, they differ in a fundamental respect: in EL the textual mention can be linked to a named entity which may or may not contain the exact mention, while in WSD there is a perfect match between the word form (better, its ***** lemma *****) and a suitable word sense. | ||
| W19-4207 Our models combine sparse sequence-to-sequence models with a two-headed attention mechanism that learns separate attention distributions for the ***** lemma ***** and inflectional tags. | ||
| L10-1066 From other side the change of word forms reduces the frequency of words with the same ***** lemma *****; and the number of words belonging to a specific tag reduces as well. | ||
| L08-1110 We compare two language modelling toolkits, the CMU and the SRI toolkit and arrive at three results: 1) models of ***** lemma *****-based feature functions produce better results than token-based models, 2) adding PoS-tag feature function to the ***** lemma ***** models improves the output and 3) weights for lexical translations are suited if the training material is similar to the texts to be translated | ||
| pronunciation | 102 | |
| 2020.mmw-1.7 We describe on-going work consisting in adding ***** pronunciation ***** information to wordnets, as such information can indicate specific senses of a word. | ||
| L06-1012 Finally, we also propose in this paper a method for detecting ***** pronunciation ***** variants and possible ***** pronunciation ***** mistakes by non-native speakers. | ||
| L14-1212 In Dialäkt Äpp, launched in 2013, the user provides his or her own ***** pronunciation ***** through buttons, while the Voice Äpp, currently in development, asks users to pronounce the word and uses speech recognition techniques to identify the variants and localize the user. | ||
| L14-1372 Because of the dominance of non-standard Arabic in conversational speech, a graphemic ***** pronunciation ***** model (PM) is utilized. | ||
| 2020.acl-main.696 We trained our model on a newly-compiled ***** pronunciation ***** lexicon extracted from various online dictionaries. | ||
| pretrained language | 102 | |
| 2021.alta-1.26 Our empirical experiments reveal that these modern ***** pretrained language ***** models suffer from high variance, and the ensemble method can improve the model performance. | ||
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of ***** pretrained language ***** models for emotion recognition in conversations, which is to consider not only previous utterances, but also conversation-related information such as speakers, speech acts and topics. | ||
| 2021.sustainlp-1.16 Recent work has explored approaches to adapt ***** pretrained language ***** models to new domains by incorporating additional pretraining on domain-specific corpora and task data. | ||
| 2021.ranlp-srw.3 With the recent success of large ***** pretrained language ***** models, we explore the possibility of using multilingual pretrained transformers like mBART and mT5 for exploring one such task of code-mixed Hinglish to English machine translation. | ||
| 2021.emnlp-main.111 In this paper, we investigate what types of stereotypical information are captured by ***** pretrained language ***** models. | ||
| expression generation | 102 | |
| 2020.lrec-1.13 We are releasing this dataset to encourage research in the field of coreference resolution, referring ***** expression generation ***** and identification within realistic, deep dialogs involving multiple domains. | ||
| W19-8645 We suggest four extensions to that framework: (1) we introduce a trainable neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model's ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts; (4) we incorporate a simple but effective referring ***** expression generation ***** module. | ||
| L12-1032 The ontology is used in particular for concept generalizations during referring ***** expression generation *****. | ||
| W18-6540 This task presents two advantages: many of the mechanisms already available for static contexts may be applied with small adaptations, and it introduces the concept of changing conditions into the task of referring ***** expression generation *****. | ||
| 2020.inlg-1.16 A previous approach, called Perceptual Cost Pruning, modeled human QRE production using a preference-based referring ***** expression generation ***** algorithm, first removing facts from the input knowledge base based on a model of perceptual cost. | ||
| reranking | 101 | |
| 2020.semeval-1.46 We analyse the metrics used in the evaluation and we propose an additional score based on model from subtask B, which correlates well with our manual ranking, as well as ***** reranking ***** method based on the same principle. | ||
| W19-5333 We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model ***** reranking *****. | ||
| 2016.iwslt-1.22 The attention-based approach has been used for ***** reranking ***** the n-best lists for both phrasebased and hierarchical setups. | ||
| R17-1101 We propose a neural ***** reranking ***** system for named entity recognition (NER), which leverages recurrent neural network models to learn sentence-level patterns that involve named entity mentions. | ||
| 2021.emnlp-main.292 We employ a single multi-task transformer model to perform all the necessary subtasks—retrieving supporting facts, ***** reranking ***** them, and predicting the answer from all retrieved documents—in an iterative fashion. | ||
| VQA | 101 | |
| N18-1201 We propose four categories of auxiliary features for ensembling for ***** VQA *****. | ||
| 2021.naacl-main.248 To address this, we first present a gradient-based interpretability approach to determine the questions most strongly correlated with the reasoning question on an image, and use this to evaluate ***** VQA ***** models on their ability to identify the relevant sub-questions needed to answer a reasoning question. | ||
| W19-4812 Deep neural networks have enabled significant progress on many challenging problems such as visual question answering (***** VQA *****). | ||
| 2021.naacl-main.289 We then modify the best existing ***** VQA ***** methods and propose baseline solvers for this task. | ||
| W19-1801 Visual question answering (VQA) models have been shown to over-rely on linguistic biases in ***** VQA ***** datasets, answering questions blindly without considering visual context. | ||
| informative | 101 | |
| D19-1323 In this paper, we introduce SENECA, a novel System for ENtity-drivEn Coherent Abstractive summarization framework that leverages entity information to generate ***** informative ***** and coherent abstracts. | ||
| 2021.emnlp-main.184 In this paper, we propose an Entity-Agnostic Representation Learning (EARL) method to introduce knowledge graphs to ***** informative ***** conversation generation. | ||
| 2021.deelio-1.4 Experimental results manifest that our model can properly schedule conversational topics and pick suitable knowledge to generate ***** informative ***** responses comparing to several strong baselines. | ||
| 2021.ccl-1.87 Based on these ***** informative ***** knowledge triples, we design two auxiliary tasks to incorporate commonsense knowledge into the main QG model, where one task is Concept Relation Classification and the other is Tail Concept Generation. | ||
| C16-1244 The experiments show that ***** informative ***** and interpretable anecdotes can be recognized. | ||
| language technology | 101 | |
| L16-1526 For this, we use the WebLicht ***** language technology ***** infrastructure. | ||
| W16-4016 In this paper we describe how the complexity of human communication can be analysed with the help of ***** language technology *****. | ||
| L14-1089 The DARPA RATS program was established to foster development of ***** language technology ***** systems that can perform well on speaker-to-speaker communications over radio channels that evince a wide range in the type and extent of signal variability and acoustic degradation. | ||
| P17-5005 Our target audience are researchers and practitioners in machine learning, parsing (syntactic and semantic) and ***** language technology *****, not necessarily experts in MWEs, who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication. | ||
| L14-1529 This data is suitable for use in further lexicographic work and in various ***** language technology ***** projects. | ||
| answering | 101 | |
| L10-1362 Question ***** answering ***** (QA) systems aim at retrieving precise information from a large collection of documents. | ||
| 2021.wnut-1.1 In this work, we investigate the effect of text simplification in the task of question-***** answering ***** using a comprehension context. | ||
| 2021.emnlp-main.293 Information seeking is an essential step for open-domain question ***** answering ***** to efficiently gather evidence from a large corpus. | ||
| 2021.naacl-main.193 Comprehensive experiments on three video-and-language tasks (text-to-video retrieval, video captioning, and video question ***** answering *****) across five datasets demonstrate that our approach outperforms previous state-of-the-art methods. | ||
| C16-1191 Natural language generation (NLG) is an important component of question ***** answering *****(QA) systems which has a significant impact on system quality. | ||
| spoken dialogue | 101 | |
| P17-1120 Recently emerged intelligent assistants on smartphones and home electronics (e.g., Siri and Alexa) can be seen as novel hybrids of domain-specific task-oriented ***** spoken dialogue ***** systems and open-domain non-task-oriented ones. | ||
| L10-1351 We describe an experimental Wizard-of-Oz setup for the integration of emotional strategies into ***** spoken dialogue ***** management. | ||
| L08-1493 Regulus is an Open Source platform that supports construction of rule-based medium-vocabulary ***** spoken dialogue ***** applications. | ||
| W18-6539 This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in ***** spoken dialogue ***** systems. | ||
| D18-1417 Spoken Language Understanding (SLU), which typically involves intent determination and slot filling, is a core component of ***** spoken dialogue ***** systems. | ||
| adversarial network | 101 | |
| 2020.findings-emnlp.218 We build our model based on the conditional generative ***** adversarial network *****, and propose to incorporate a simple yet effective diversity loss term into the model in order to improve the diversity of outputs. | ||
| D18-1387 In particular, we investigate context-aware and context-agnostic models for predicting vague words, and explore auxiliary-classifier generative ***** adversarial network *****s for characterizing sentence vagueness. | ||
| 2021.naacl-industry.30 We propose OodGAN, a sequential generative ***** adversarial network ***** (SeqGAN) based model for OOD data generation. | ||
| N18-1133 Inspired by generative ***** adversarial network *****s (GANs), we use one knowledge graph embedding model as a negative sample generator to assist the training of our desired model, which acts as the discriminator in GANs. | ||
| 2021.eacl-srw.23 Although several generative ***** adversarial network *****s (GANs) have been proposed thus far, these models still suffer from mode-collapsing if the models are not pre-trained. | ||
| neural machine translation (NMT | 101 | |
| P19-1352 Word embedding is central to ***** neural machine translation (NMT *****), which has attracted intensive research interest in recent years. | ||
| D19-6503 Document-level context has received lots of attention for compensating ***** neural machine translation (NMT *****) of isolated sentences. | ||
| 2020.emnlp-main.82 Despite the improvement of translation quality, ***** neural machine translation (NMT *****) often suffers from the lack of diversity in its generation. | ||
| D19-1331 We report on search errors and model errors in ***** neural machine translation (NMT *****). | ||
| 2020.emnlp-main.216 As a sequence-to-sequence generation task, ***** neural machine translation (NMT *****) naturally contains intrinsic uncertainty, where a single sentence in one language has multiple valid counterparts in the other. | ||
| cosine | 100 | |
| 2021.emnlp-main.372 Standard representational similarity measures such as ***** cosine ***** similarity and Euclidean distance have been successfully used in static word embedding models to understand how words cluster in semantic space. | ||
| L10-1367 Similarity is computed as the ***** cosine ***** of the probability distributions for each word over WordNet. | ||
| N19-1100 On the latter, we show that even the simplest averaged word vectors compared by rank correlation easily rival the strongest deep representations compared by ***** cosine ***** similarity. | ||
| 2019.icon-1.20 For every sentence, the semantic relatedness between the words from sentence and a set of emotion-specific words is calculated using ***** cosine ***** similarity. | ||
| W19-2001 Furthermore, we show that it produces considerably less entropic concept activation profiles than the popular ***** cosine ***** similarity | ||
| RST | 100 | |
| J18-2001 We analyze ***** RST ***** discourse parsing as dependency parsing by adapting to ***** RST ***** a recent proposal in syntactic parsing that relies on head-ordered dependency trees, a representation isomorphic to headed constituency trees. | ||
| W19-2705 ***** RST ***** relations are categorized as either mononuclear (comprising a nucleus and a satellite span) or multinuclear (comprising two or more nuclei spans). | ||
| J17-4001 In particular, we show that (i) all aspects of the ***** RST ***** tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation ***** RST ***** tree to the reference ***** RST ***** tree is positively correlated with translation quality. | ||
| 2020.aacl-main.67 We demonstrate this through our tree-recursive neural model, namely ***** RST *****-Recursive, which takes advantage of the text's ***** RST ***** features produced by a state of the art ***** RST ***** parser. | ||
| W19-2706 A finding of this analysis is that the nuclearity and the relational identification of attribution structures are shown to depend on the writer's intended effect, such that attributional relations cannot be considered as a single relation, but rather as attributional instances of other ***** RST ***** relations. | ||
| taggers | 100 | |
| L08-1368 Experimental results show significant differences of performance between the ***** taggers *****. | ||
| C16-1044 Our approach is based on Recurrent Neural Networks (RNN) and has the following advantages: (a) it does not use word alignment information, (b) it does not assume any knowledge about foreign languages, which makes it applicable to a wide range of resource-poor languages, (c) it provides truly multilingual ***** taggers *****. | ||
| 2021.nodalida-main.31 We then mask UPOS tags based on errors made by ***** taggers ***** to tease away the contribution of UPOS tags that ***** taggers ***** succeed and fail to classify correctly and the impact of tagging errors. | ||
| 1995.iwpt-1.25 The scheme also enables using the same information sources as the Constraint Grammar approach, and the hope is that it can improve on the performance of both statistical ***** taggers ***** and surface-syntactic analyzers. | ||
| L06-1372 The idea is to present the human ***** taggers ***** a pre-tagged version of the corpus | ||
| variational | 100 | |
| P18-1245 To enable posterior inference over the latent variables, we derive an efficient ***** variational ***** inference procedure based on the wake-sleep algorithm. | ||
| N19-1113 While both objectives are subject to noise in gradient updates, we show through analysis and experiments that the ***** variational ***** lower bound is robust whereas the generalized Brown objective is vulnerable. | ||
| P18-5003 This tutorial offers a general introduction to ***** variational ***** inference followed by a thorough and example-driven discussion of how to use ***** variational ***** methods for training DGMs. | ||
| P17-2019 We provide interpretations of the framework based on expectation maximization and ***** variational ***** inference, and show that it enables parsing and language modeling within a single implementation. | ||
| L16-1477 While there is a large interest in linguistics to increase the quantitative aspect of such studies, it requires training in both ***** variational ***** linguistics and computational methods, a combination that is still not common. | ||
| tokenization | 100 | |
| 2021.sustainlp-1.16 In datasets from four disparate domains, we find adaptive ***** tokenization ***** on a pretrained RoBERTa model provides greater than 85% of the performance benefits of domain specific pretraining. | ||
| W19-5419 We also report the effect of ***** tokenization ***** on translation model performance. | ||
| L14-1317 The multilayered annotation of the corpus involves a XML-TEI encoding followed by a ***** tokenization ***** step where each token is univocally identified through a CTS urn notation and then associated to a part-of-speech and a lemma. | ||
| L08-1408 The current version of the tool suite provides functions ranging from ***** tokenization ***** to chunking and Named Entity Recognition (NER). | ||
| L10-1310 It is implemented as a UIMA (Unstructured Information Management Architecture) annotator and is highly configurable: concepts can come from standardised or proprietary terminologies; arbitrary attributes can be associated with dictionary entries, and those attributes can then be associated with the named entities in the output; numerous search strategies and search options can be specified; any tokenizer packaged as a UIMA annotator can be used to tokenize the dictionary, so the same ***** tokenization ***** can be guaranteed for the input and dictionary, minimising ***** tokenization ***** mismatch errors; and the types and features of UIMA annotations used as input and generated as output can also be controlled | ||
| text simplification | 100 | |
| 2021.wnut-1.1 In this work, we investigate the effect of ***** text simplification ***** in the task of question-answering using a comprehension context. | ||
| N18-1063 Current measures for evaluating ***** text simplification ***** systems focus on evaluating lexical text aspects, neglecting its structural aspects. | ||
| P19-1331 Our model outperforms previous state-of-the-art neural sentence simplification models (without external knowledge) by large margins on three benchmark ***** text simplification ***** corpora in terms of SARI (+0.95 WikiLarge, +1.89 WikiSmall, +1.41 Newsela), and is judged by humans to produce overall better and simpler output sentences. | ||
| 2020.readi-1.7 Automatic ***** text simplification ***** is an active research area, and there are first systems for English, Spanish, Portuguese, and Italian. | ||
| L16-1045 This paper presents an approach for automatic evaluation of the readability of ***** text simplification ***** output for readers with cognitive disabilities. | ||
| RE | 99 | |
| D19-6203 Unfortunately, the models in the literature tend to employ different strategies to perform pooling for ***** RE *****, leading to the challenge to determine the best pooling mechanism for this problem, especially in the biomedical domain. | ||
| 2021.emnlp-main.366 Existing relation extraction (***** RE *****) methods typically focus on extracting relational facts between entity pairs within single sentences or documents. | ||
| 2020.acl-main.142 But, even with recent advances in unsupervised pre-training and knowledge enhanced neural ***** RE *****, models still show a high error rate. | ||
| N19-1323 One way to improve ***** RE ***** is to use KB Embeddings (KBE) for link prediction. | ||
| 2020.acl-main.140 Despite the recent progress, little is known about the features captured by state-of-the-art neural relation extraction (***** RE *****) models. | ||
| Argument | 99 | |
| C16-1158 ***** Argument ***** mining aims to determine the argumentative structure of texts. | ||
| P19-1463 We show that feature-rich SVM learners and Neural Network architectures outperform standard baselines in ***** Argument ***** Mining over such complex data. | ||
| D17-1218 ***** Argument ***** mining has become a popular research area in NLP. | ||
| 2021.bea-1.22 ***** Argument ***** mining is often addressed by a pipeline method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task. | ||
| 2020.argmining-1.5 Computational Argumentation in general and ***** Argument ***** Mining in particular are important research fields. | ||
| metrics | 99 | |
| 2020.lrec-1.317 However, some difficulties still remain: the time required for manual narrative transcription and the decision on how transcripts should be divided into sentences for successful application of parsers used in ***** metrics *****, such as Idea Density, to analyze the transcripts. | ||
| 2021.crac-1.5 The classifiers substantially increase coreference performance in our experiments with Dutch literature across all ***** metrics ***** on the development set: mention detection, LEA, CoNLL, and especially pronoun accuracy. | ||
| 2021.acl-long.533 Statistically, humans are unbiased, high variance estimators, while ***** metrics ***** are biased, low variance estimators. | ||
| S18-1042 According to the ***** metrics ***** of SemEval 2018, our system gets the final scores of 0.636, 0.531, 0.731, 0.708, and 0.408 on 5 subtasks, respectively. | ||
| D19-5817 Our study suggests that while current ***** metrics ***** may be suitable for existing QA datasets, they limit the complexity of QA datasets that can be created. | ||
| knowledge graphs | 99 | |
| 2021.emnlp-main.712 Such relation embeddings are appealing because they can, in principle, encode relational knowledge in a more fine-grained way than is possible with ***** knowledge graphs *****. | ||
| 2020.coling-main.369 Furthermore, when working on a specific domain, ***** knowledge graphs ***** in its entirety contribute towards extraneous information and noise. | ||
| 2020.emnlp-main.99 It performs multi-hop, multi-relational reasoning over subgraphs extracted from external ***** knowledge graphs *****. | ||
| 2021.acl-long.147 We study the problem of generating data poisoning attacks against Knowledge Graph Embedding (KGE) models for the task of link prediction in ***** knowledge graphs *****. | ||
| 2021.naacl-main.45 The problem of answering questions using knowledge from pre-trained language models (LMs) and ***** knowledge graphs ***** (KGs) presents two challenges: given a QA context (question and answer choice), methods need to (i) identify relevant knowledge from large KGs, and (ii) perform joint reasoning over the QA context and KG. | ||
| anaphora resolution | 99 | |
| 2020.coling-main.435 One critical issue of zero ***** anaphora resolution ***** (ZAR) is the scarcity of labeled data. | ||
| N18-2082 In applying this process to a representative NMT system, we find its encoder appears most suited to supporting inferences at the syntax-semantics interface, as compared to ***** anaphora resolution ***** requiring world knowledge. | ||
| 2020.lrec-1.2 Anaphora resolution (coreference) systems designed for the CONLL 2012 dataset typically cannot handle key aspects of the full ***** anaphora resolution ***** task such as the identification of singletons and of certain types of non-referring expressions (e.g., expletives), as these aspects are not annotated in that corpus. | ||
| W18-0702 In this paper, we discuss three datasets extracted from the ARRAU corpus to support the three subtasks of the CRAC 2018 Shared Task–identity ***** anaphora resolution ***** over ARRAU-style markables, bridging references resolution, and discourse deixis; the evaluation scripts assessing system performance on those datasets; and preliminary results on these three tasks that may serve as baseline for subsequent research in these phenomena. | ||
| 2020.coling-main.331 Bridging reference resolution is an ***** anaphora resolution ***** task that is arguably more challenging and less studied than entity coreference resolution. | ||
| reading | 99 | |
| P17-1172 In this paper, we present an approach of ***** reading ***** text while skipping irrelevant information if needed. | ||
| W19-5932 We formulate dialog state tracking as a ***** reading ***** comprehension task to answer the question what is the state of the current dialog? | ||
| D18-1238 This paper presents a new compositional encoder for ***** reading ***** comprehension (RC). | ||
| W19-8606 The writing process consists of several stages such as drafting, revising, editing, and proof***** reading *****. | ||
| 2020.coling-main.235 The novel framework shows an interesting perspective on machine ***** reading ***** comprehension and cognitive science. | ||
| nlg system | 99 | |
| W18-6537 Our study suggests that, given the rapidly increasing level of research in the area, a common framework is urgently needed to compare the performance of AQG systems and ***** NLG system *****s more generally. | ||
| 2021.eacl-main.25 Despite growing interest in natural language generation (NLG) models that produce diverse outputs, there is currently no principled method for evaluating the diversity of an ***** NLG system *****. | ||
| W18-6556 While our submission does not rank highly using automated metrics, qualitative investigation of generated utterances suggests the use of additional information in neural network ***** NLG system *****s to be a promising research direction. | ||
| W17-3510 This talk will present a few ***** NLG system *****s developed within Thomson Reuters providing information to professionals such as lawyers, accountants or traders. | ||
| D18-1429 There has always been criticism for using n-gram based similarity metrics, such as BLEU, NIST, etc., for evaluating the performance of ***** NLG system *****s. | ||
| finetuning | 98 | |
| 2021.eacl-main.11 Finally, we conduct experiments showing that better response quality can be achieved in zero-shot and ***** finetuning ***** settings by training on our data than on the larger but much noisier Opensubtitles dataset. | ||
| 2021.eacl-main.95 In this work, we study how the ***** finetuning ***** stage in the pretrain-finetune framework changes the behavior of a pretrained neural language generator. | ||
| 2021.acl-short.117 Firstly, We evaluate our pre-trained model on various pronoun resolution datasets without any ***** finetuning *****. | ||
| 2020.emnlp-main.174 This confirms that masking can be utilized as an efficient alternative to ***** finetuning *****. | ||
| 2021.emnlp-main.737 Further ***** finetuning ***** a trained model with target speaker data is the most natural approach for adaptation, but it takes a lot of compute and may cause catastrophic forgetting to the existing speakers | ||
| valence | 98 | |
| 2021.emnlp-main.574 Our results indicate that ***** valence ***** associations of non-discriminatory, non-social group words represent widely-shared associations, in seven languages and over 200 years. | ||
| L14-1124 Aside from the creation of such a multilingual ***** valence ***** resource through converging or converting existing resources, the paper also addresses a tool for the creation of such a resource as part of corpus annotation for less resourced languages. | ||
| L08-1616 An experiment based on 702 sentences evaluated by judges shows that automatic techniques developed for estimating the ***** valence ***** from relatively small corpora are more efficient if the corpora used contain texts similar to the one that must be evaluated. | ||
| I17-4016 For word level task our best run achieved MAE 0.545 (ranked 2nd), PCC 0.892 (ranked 2nd) in ***** valence ***** prediction and MAE 0.857 (ranked 1st), PCC 0.678 (ranked 2nd) in arousal prediction. | ||
| L16-1215 This article presents Walenty - a new ***** valence ***** dictionary of Polish predicates, concentrating on its creation process and access via Internet browser | ||
| input | 98 | |
| R17-1048 Similarity between sentences is calculated from graph, and the similarity values are ***** input ***** to classifiers trained by Logistic Model Tree. | ||
| D18-1210 Despite their success, most existing CNN models employed in NLP share the same learned (and static) set of filters for all ***** input ***** sentences. | ||
| 2020.nlptea-1.13 Secondly, word vectors are ***** input ***** into BiLSTM layer to learn context features. | ||
| 2021.econlp-1.11 Next, tweets are ***** input ***** to an unsupervised deep clustering approach to automatically detect trading framing patterns. | ||
| W19-8619 In our investigation, although the recurrent encoder generally outperforms the pooling based encoder by learning the sequential dependencies, it is sensitive to the order of the ***** input ***** records (i.e., performance decreases when injecting the random shuffling noise over ***** input ***** data) | ||
| Arabic | 98 | |
| W19-4615 We present a collection of morphologically annotated corpora for seven ***** Arabic ***** dialects: Taizi Yemeni, Sanaani Yemeni, Najdi, Jordanian, Syrian, Iraqi and Moroccan ***** Arabic *****. | ||
| C18-1113 We also report on additional insights from a data analysis of similarity and difference across ***** Arabic ***** dialects. | ||
| 2020.lrec-1.165 The corpus comprises more than 30,000 ***** Arabic ***** song lyrics in 6 ***** Arabic ***** dialects for singers from 18 different ***** Arabic ***** countries. | ||
| 2014.amta-researchers.23 In this work we describe an extensive exploration of data selection techniques over ***** Arabic ***** to French datasets, and propose methods to address both similarity and coverage considerations while maintaining a limited model size. | ||
| 2021.acl-short.68 *****Arabic***** diacritization is a fundamental task for Arabic language processing. | ||
| abstract meaning | 98 | |
| 2020.emnlp-main.196 In the literature, the research on ***** abstract meaning ***** representation (AMR) parsing is much restricted by the size of human-curated dataset which is critical to build an AMR parser with good performance. | ||
| 2021.semeval-1.106 It shows that the pre-trained BERT token embeddings can be used as additional knowledge for understanding ***** abstract meaning *****s in question answering. | ||
| 2020.conll-shared.8 Among the five frameworks, we address only the ***** abstract meaning ***** representation framework and propose a joint state model for the graph-sequence iterative inference of (Cai and Lam, 2020) for a simplified graph-sequence inference. | ||
| W19-3320 Existing approaches such as semantic role labeling (SRL) and ***** abstract meaning ***** representation (AMR) still have features related to the peculiarities of the particular language. | ||
| 2021.semeval-1.20 It leverages heterogeneous knowledge to learn adequate evidences, and seeks for an effective semantic space of abstract concepts to better improve the ability of a machine in understanding the ***** abstract meaning ***** of natural language. | ||
| research | 98 | |
| 2006.bcs-1.1 Processing of Colloquial Arabic is a relatively new area of ***** research *****, and a number of interesting challenges pertaining to spoken Arabic dialects arise. | ||
| 2020.sdp-1.20 We were able to obtain ~267,000 unique ***** research ***** papers through our fully-automated framework using ~76,000 queries, resulting in almost 200,000 more papers than the number of queries. | ||
| 2020.sdp-1.2 I will discuss the status and future of arXiv, and possibilities and plans to make more effective use of the ***** research ***** database to enhance ongoing ***** research ***** efforts. | ||
| W19-6129 Finally we discuss the dataset in light of the results and point to future ***** research ***** and plans for further improving both the dataset and methods of predicting prosodic prominence from text. | ||
| L16-1072 We have seen that many resources exist which are useful for MT and similar work, but the majority are for (academic) ***** research ***** or educational use only, and as such not available for commercial use. | ||
| negative sampling | 98 | |
| 2021.teachingnlp-1.19 Students implement the core parts of the method, including text preprocessing, ***** negative sampling *****, and gradient descent. | ||
| 2021.emnlp-main.492 We study six ***** negative sampling ***** strategies and apply them to the fine-tuning stage and, as a noteworthy novelty, to the synthetic data that we use for pre-training. | ||
| 2020.emnlp-main.252 To address such an issue, we propose a new task of determining whether or not an input pair of emotion and cause has a valid causal relationship under different contexts, and construct a corresponding dataset via manual annotation and ***** negative sampling ***** based on an existing benchmark dataset. | ||
| D19-1075 We also propose a weighted ***** negative sampling ***** strategy to generate valuable negative samples during training and we regard prediction as a bidirectional problem in the end. | ||
| 2021.acl-long.300 It uses data augmentation and ***** negative sampling ***** techniques on noisy parallel sentence data to directly learn a cross-lingual embedding-based query relevance model. | ||
| phrase alignment | 98 | |
| E17-1066 We propose an architecture based on Gated Recurrent Unit that supports (i) representation learning of phrases of arbitrary granularity and (ii) task-specific attentive pooling of ***** phrase alignment *****s between two sentences. | ||
| 2020.emnlp-main.125 We address the ***** phrase alignment ***** problem by combining an unordered tree mapping algorithm and phrase representation modelling that explicitly embeds the similarity distribution in the sentences onto powerful contextualized representations. | ||
| 2020.lrec-1.847 It achieves search-efficiency by constraining the lattice so that all the paths go through a ***** phrase alignment ***** pair with the highest alignment score. | ||
| P19-1144 Our method can easily be applied to language models with different network architectures since an independent module is used for phrase induction and context-*****phrase alignment*****, and no change is required in the underlying language modeling network. | ||
| 2021.starsem-1.7 To merge symbolic and deep learning methods, we propose an inference framework called NeuralLog, which utilizes both a monotonicity-based logical inference engine and a neural network language model for *****phrase alignment*****. | ||
| low-resource | 98 | |
| W19-5433 We introduce a purely monolingual approach to filtering for parallel data from a noisy corpus in a *****low-resource***** scenario. | ||
| N19-1383 We focus on improving name tagging for *****low-resource***** languages using annotations from related languages. | ||
| 2020.sltu-1.43 Automatic Speech Recognition for *****low-resource***** languages has been an active field of research for more than a decade. | ||
| W16-3714 Acquiring labeled speech for *****low-resource***** languages is a difficult task in the absence of native speakers of the language. | ||
| 2020.coling-main.405 Text input technologies for *****low-resource***** languages support literacy, content authoring, and language learning. | ||
| HPSG | 97 | |
| L12-1023 The c-structure was based on a DCG grammar of Polish, while the f-structure level was mainly inspired by the available ***** HPSG ***** analyses of Polish. | ||
| L16-1197 We describe resources aimed at increasing the usability of the semantic representations utilized within the DELPH-IN (Deep Linguistic Processing with ***** HPSG *****) consortium. | ||
| 2019.lilt-17.3 It points out that TAG combines actual structure while ***** HPSG ***** (and Categorial Grammar and other valence-based frameworks) specify valence of lexical items and hence potential structure | ||
| 2000.iwpt-1.19 Over the past few years significant progress was accomplished in efficient processing with wide-coverage *****HPSG***** grammars. | ||
| 2020.iwpt-1.14 This paper presents the development of a deep parser for Spanish that uses an *****HPSG***** grammar and returns trees that contain both syntactic and semantic information. | ||
| crowdsourced | 97 | |
| N18-2120 Furthermore, we demonstrate through ***** crowdsourced ***** experiments that we can dramatically alter colorizations simply by manipulating descriptive color words in captions. | ||
| 2021.naacl-srw.15 Automatic evaluations and ***** crowdsourced ***** manual evaluations show that the proposed model makes generated responses more emotionally aware. | ||
| 2020.emnlp-main.23 In addition, we collect a new, ***** crowdsourced ***** evaluation benchmark. | ||
| S18-2026 We evaluate a wide range of approaches on a ***** crowdsourced ***** data set containing over 100,000 judgments on over 2,000 assertions. | ||
| 2021.repl4nlp-1.6 Learning effective language representations from ***** crowdsourced ***** labels is crucial for many real-world machine learning tasks | ||
| convolutional neural | 97 | |
| 2018.gwc-1.27 In addition, we use ***** convolutional neural ***** network and piecewise max pooling ***** convolutional neural ***** network relation extraction models that efficiently grasp key features in sentences. | ||
| C16-1289 Upon the generated source and target phrase structures, we stack a ***** convolutional neural ***** network to integrate vector representations of linguistic units on the structures into bilingual phrase embeddings. | ||
| C18-1156 When used along with content-based feature extractors such as ***** convolutional neural ***** networks, we see a significant boost in the classification performance on a large Reddit corpus. | ||
| D17-1191 In this paper, we design a novel ***** convolutional neural ***** network (CNN) with residual learning, and investigate its impacts on the task of distantly supervised noisy relation extraction. | ||
| W18-4930 For TRAPACC, the classifier consists of a data-independent dimension reduction and a ***** convolutional neural ***** network (CNN) for learning and labelling transitions. | ||
| deep reinforcement learning | 97 | |
| D19-1042 To solve the mismatch between training and inference as well as modeling label dependencies in a more principled way, we formulate HTC as a Markov decision process and propose to learn a Label Assignment Policy via ***** deep reinforcement learning ***** to determine where to place an object and when to stop the assignment process. | ||
| 2021.naacl-main.316 Building on these shortcomings, we propose a ***** deep reinforcement learning ***** approach that makes time-aware decisions to trade stocks while optimizing profit using textual data. | ||
| D17-1237 This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical ***** deep reinforcement learning ***** approach to learning a dialogue manager that operates at different temporal scales. | ||
| P18-1053 In this paper, we show how to integrate these goals, applying ***** deep reinforcement learning ***** to deal with the task. | ||
| C18-1107 The proposed structured ***** deep reinforcement learning ***** is based on graph neural networks (GNN), which consists of some sub-networks, each one for a node on a directed graph. | ||
| methods | 96 | |
| 2021.naacl-main.353 We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional ***** methods ***** applied to this task so far. | ||
| P17-1078 Results show that such pretraining significantly improves the model, leading to accuracies competitive to the best ***** methods ***** on six benchmarks. | ||
| 2021.ccl-1.82 Existing ***** methods ***** only consider the features of the microblog itself without combining the semantics of emotion categories for modeling. | ||
| 2020.lrec-1.855 The corpus database is distributed to permit fast indexing, and provides a simple web front-end with corpus linguistics ***** methods ***** for sub-corpus comparison and retrieval. | ||
| P17-1028 We evaluate a suite of ***** methods ***** across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. | ||
| amr graph | 96 | |
| 2021.iwpt-1.5 This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention, and use only attention matrices from the transformer to predict all elements in *****AMR graphs***** (concepts, arcs, labels). | ||
| W17-2315 Our key contributions are: (1) an empirical validation of our hypothesis that an event is a subgraph of the *****AMR graph*****, (2) a neural network-based model that identifies such an event subgraph given an AMR, and (3) a distant supervision based approach to gather additional training data. | ||
| D18-1198 We propose to conduct the search in a refined search space based on a new compact *****AMR graph***** and an improved oracle. | ||
| 2021.law-1.6 By comparing parallel *****AMR graphs*****, we can identify specific points of divergence. | ||
| S17-2096 We cast language generation from AMR as a sequence of actions (e.g., insert/remove/rename edges and nodes) that progressively transform the *****AMR graph***** into a dependency parse tree. | ||
| natural language processing (NLP) | 96 | |
| 2020.acl-main.312 Attention has been proven successful in many *****natural language processing (NLP)***** tasks. | ||
| 2021.ranlp-1.107 Pretraining-based neural network models have demonstrated state-of-the-art (SOTA) performances on *****natural language processing (NLP)***** tasks. | ||
| 2021.acl-long.72 Sentence embeddings are an important component of many *****natural language processing (NLP)***** systems. | ||
| 2020.lrec-1.332 The current situation regarding the existence of *****natural language processing (NLP)***** resources and tools for Corsican reveals their virtual non-existence. | ||
| 2020.blackboxnlp-1.30 We study the behavior of several black-box search algorithms used for generating adversarial examples for *****natural language processing (NLP)***** tasks. | ||
| ELMo | 95 | |
| R19-1070 Moreover, we analyze the impact of preprocessing steps (lowercasing, suppression of punctuation and stop words removal) and word meaning similarity based on different distributions (word translation probability, Word2Vec, fastText and ***** ELMo *****) on the performance of the task. | ||
| 2021.emnlp-main.745 Based on treebank size and available ***** ELMo ***** models, we select Hungarian, Uyghur (a zero-shot language for mBERT) and Vietnamese. | ||
| 2020.emnlp-main.436 Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrained representations (i.e. RoBERTa, BERT or ***** ELMo *****), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. | ||
| D19-1275 Finally, we find that while probes on the first layer of ***** ELMo ***** yield slightly better part-of-speech tagging accuracy than the second, probes on the second layer are substantially more selective, which raises the question of which layer better represents parts-of-speech. | ||
| 2020.coling-main.450 Classifiers trained on auxiliary probing tasks are a popular tool to analyze the representations learned by neural sentence encoders such as BERT and ***** ELMo ***** | ||
| overfitting | 95 | |
| 2021.acl-long.191 ProtAugment is a novel extension of Prototypical Networks, that limits ***** overfitting ***** on the bias introduced by the few-shots classification objective at each episode. | ||
| W19-3505 In so doing, we employ several well-established automated text analysis tools and build on the common practices for handling highly imbalanced datasets and reducing the sensitivity to ***** overfitting *****. | ||
| K19-1031 Further experiments on length-controlled training data reveal that absolute position actually causes ***** overfitting ***** to the sentence length. | ||
| 2020.acl-main.615 Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i.e. over-confident) predictions, a common sign of ***** overfitting *****. | ||
| 2009.iwslt-papers.3 We extend MERT optimization by maximizing the margin between the reference and incorrect translations under the L2-norm prior to avoid ***** overfitting ***** problem | ||
| bootstrapping | 95 | |
| 2021.acl-long.13 We demonstrate that the proposed framework is highly effective in ***** bootstrapping ***** the performance of the two agents in transfer learning. | ||
| C16-1147 These lexicons are typically mined using ***** bootstrapping *****, starting from very few seed words whose polarity is given, e.g., 50-60 words, and sometimes even just 5-6. | ||
| W19-6102 We show here that a ***** bootstrapping ***** approach to treebanking via interlingual grammars is plausible and useful in a process where grammar engineering and treebanking are jointly pursued when creating resources for the target language. | ||
| 2020.findings-emnlp.45 We query if machine translation is an adequate substitute for training data, and extend this to investigate ***** bootstrapping ***** using joint training with English, paraphrasing, and multilingual pre-trained models. | ||
| L14-1197 The relation extraction stage is a combination of two systems: SProUT, a shallow processor which uses hand-written rules to discover relation instances from local text units and DARE which extracts relation instances from complete sentences, using rules that are learned in a ***** bootstrapping ***** process, starting with semantic seeds | ||
| Annotated | 95 | |
| L06-1473 ***** Annotated ***** parallel texts are an important resource for quantitative and qualitative linguistic research | ||
| L14-1453 *****Annotated***** corpora are essential resources for many applications in Natural Language Processing. | ||
| 2020.computerm-1.12 The TermEval 2020 shared task provided a platform for researchers to work on automatic term extraction (ATE) with the same dataset: the *****Annotated***** Corpora for Term Extraction Research (ACTER). | ||
| W16-5208 *****Annotated***** corpora are crucial language resources, and pre-annotation is a usual way to reduce the cost of corpus construction. | ||
| L12-1242 *****Annotated***** corpora such as treebanks are important for the development of parsers, language applications as well as understanding of the language itself. | ||
| inductive | 95 | |
| 2021.acl-demo.8 Examples include assessing language similarity for effective transfer learning, injecting ***** inductive ***** biases into machine learning models or creating resources such as dictionaries and inflection tables. | ||
| D18-1056 One plausible suggestion based on our initial experiments is that the differences in the ***** inductive ***** biases of the embedding algorithms lead to an optimization landscape that is riddled with local optima, leading to a very small basin of convergence, but we present this more as a challenge paper than a technical contribution. | ||
| 2020.findings-emnlp.252 To explore the ***** inductive ***** biases that cause these compositional representations to arise during training, we conduct simple experiments on synthetic data. | ||
| W17-2710 In the last few years, more ***** inductive ***** approaches have emerged, seeking to discover unknown event types and roles in raw text | ||
| 2021.repl4nlp-1.5 It has been long known that sparsity is an effective *****inductive***** bias for learning efficient representation of data in vectors with fixed dimensionality, and it has been explored in many areas of representation learning. | ||
| autoregressive | 95 | |
| 2020.emnlp-main.541 This paper proposes Recurrent Event Network (RE-Net), a novel ***** autoregressive ***** architecture for predicting future interactions. | ||
| 2021.eacl-main.18 Experimental results show that our model significantly outperforms existing non-***** autoregressive ***** baselines and achieves competitive performance with many strong ***** autoregressive ***** models. | ||
| P19-1125 In this paper, we propose an imitation learning framework for non-***** autoregressive ***** machine translation, which still enjoys the fast translation speed but gives comparable translation performance compared to its auto-regressive counterpart. | ||
| 2021.emnlp-main.734 Hence, these ***** autoregressive ***** models constitute ideal agents to operate in text-based environments where language understanding and generative capabilities are essential. | ||
| 2020.conll-1.40 Our approach 1) speeds up decoding by 3x while outperforming the ***** autoregressive ***** model and 2) significantly improves cross-lingual transfer in the low-resource setting by 37% compared to ***** autoregressive ***** baseline | ||
| emoji | 95 | |
| 2020.lincr-1.7 In sentiment analysis, several researchers have used ***** emoji ***** and hashtags as specific forms of training and supervision. | ||
| N18-2107 Our main finding is that incorporating the two synergistic modalities, in a combined model, improves accuracy in an ***** emoji ***** prediction task. | ||
| S18-1063 Our model achieved 30.25% macro-averaged F-score in the first subtask (i.e., ***** emoji ***** prediction in English), ranking 7th out of 48 participants. | ||
| 2021.wanlp-1.7 To conduct this investigation, we, first, created the Arabic ***** emoji ***** sentiment lexicon (Arab-ESL). | ||
| 2020.nlpcss-1.22 We explore the ways in which users include ***** emoji ***** in these self-descriptions, finding different patterns than those observed around ***** emoji ***** usage in tweets | ||
| Answer | 95 | |
| N18-1011 Complementing the research in linguistics on discourse and information structure, in computational linguistics identifying discourse concepts was also shown to improve the performance of certain applications, for example, Short *****Answer***** Assessment systems (Ziai and Meurers, 2014). | ||
| P18-1162 *****Answer***** selection is an important subtask of community question answering (CQA). | ||
| K19-1085 *****Answer***** selection aims at identifying the correct answer for a given question from a set of potentially correct answers. | ||
| 2020.acl-main.498 *****Answer***** retrieval is to find the most aligned answer from a large set of candidates given a question. | ||
| C16-1224 *****Answer***** selection is a core component in any question-answering system. | ||
| inflectional | 95 | |
| D18-1029 Moreover, fine-grained typological features such as exponence, flexivity, fusion, and ***** inflectional ***** synthesis are borne out to be responsible for the proliferation of low-frequency phenomena which are organically difficult to model by statistical architectures, or for the meaning ambiguity of character n-grams. | ||
| L06-1475 Arabic has a rich morphological system combining templatic and affixational paradigms for both ***** inflectional ***** and derivational morphology. | ||
| L16-1408 The dictionary is enriched with phonetic, morphological, semantic and other annotations, as well as augmented by various language processing tools allowing for the generation of ***** inflectional ***** forms and pronunciation, for on-the-fly selection of corpus examples, for suggesting synonyms, etc. | ||
| W19-4207 Our models combine sparse sequence-to-sequence models with a two-headed attention mechanism that learns separate attention distributions for the lemma and ***** inflectional ***** tags | ||
| 2020.sigmorphon-1.25 We investigate the problem of searching for a lexeme-set in speech by searching for its *****inflectional***** variants. | ||
| disfluency detection | 95 | |
| 2020.acl-main.346 ELMo or BERT) currently produce state-of-the-art results in joint parsing and ***** disfluency detection ***** in speech transcripts. | ||
| N19-1282 However, we show here that neural parsers can find EDITED disfluency nodes, and the best neural parsers find them with an accuracy surpassing that of specialized ***** disfluency detection ***** systems, thus making these specialized mechanisms unnecessary. | ||
| 2020.emnlp-main.113 We further utilize this augmented data for pretraining and leverage it for the task of ***** disfluency detection *****. | ||
| C18-1299 While the ***** disfluency detection ***** has achieved notable success in the past years, it still severely suffers from the data scarcity. | ||
| 2020.findings-emnlp.186 We specifically explore whether it is possible to train an ASR model to directly map disfluent speech into fluent transcripts, without relying on a separate ***** disfluency detection ***** model. | ||
| empirical | 94 | |
| 2020.lrec-1.538 The dataset can serve as a reliable ***** empirical ***** basis for comparing different theoretical frameworks concerned with collocations or as material for data-driven approaches to the studies of collocations including different machine learning experiments. | ||
| 2020.findings-emnlp.182 It motivates a theoretical analysis and controlled ***** empirical ***** study on German-English and Turkish-English tasks, which both suggest that Iterative Back-Translation is more effective than Dual Learning despite its relative simplicity. | ||
| 2021.acl-long.480 We report ***** empirical ***** findings that highlight the importance of MMT models' interpretability, and discuss how our findings will benefit future research. | ||
| 2020.nlpcss-1.10 Our main contribution are ***** empirical ***** findings on the benefits of contextualized embeddings and the potential of multi-task models for this purpose. | ||
| 2020.repl4nlp-1.20 In this paper, we conduct a comprehensive ***** empirical ***** evaluation of six span representation methods using eight pretrained language representation models across six tasks, including two tasks that we introduce | ||
| SQuAD | 94 | |
| 2021.acl-long.239 The resulting model obtains surprisingly good results on multiple benchmarks (e.g., 72.7 F1 on ***** SQuAD ***** with only 128 training examples), while maintaining competitive performance in the high-resource setting. | ||
| N19-1362 Experiments show a gain of 2.7% on the recently released ***** SQuAD ***** 2.0 and 1.3% on MultiNLI. | ||
| 2021.ranlp-1.51 Experiments performed on the ***** SQuAD ***** benchmark and more complex question answering datasets have shown that linguistic enhancing boosts the performance of the standard BERT model significantly | ||
| W18-5436 In this paper we present the results of an investigation of the importance of verbs in a deep learning QA system trained on the *****SQuAD***** dataset. | ||
| 2020.lrec-1.667 Existing machine reading comprehension models are reported to be brittle for adversarially perturbed questions when optimizing only for accuracy, which led to the creation of new reading comprehension benchmarks, such as *****SQuAD***** 2.0 which contains such types of questions. | ||
| emojis | 94 | |
| S18-1036 For the five tasks, several preprocessing steps were evaluated and eventually the best system included diacritics removal, elongation adjustment, replacement of ***** emojis ***** by the corresponding Arabic word, character normalization and light stemming. | ||
| S18-2011 At the same time, the vector corresponding to the male modifier tends to be semantically close to ***** emojis ***** related to business or technology, whereas their female counterparts appear closer to ***** emojis ***** about love or makeup. | ||
| S18-1080 We investigated different methods of text preprocessing including replacing text ***** emojis ***** with respective tokens and splitting hashtags to capture more meaning. | ||
| L16-1626 We retrieve 10 millions tweets posted by USA users, and we build several skip gram word embedding models by mapping in the same vectorial space both words and ***** emojis *****. | ||
| 2020.emnlp-main.542 Predicting the proper ***** emojis ***** associated with text provides a way to summarize the text accurately, and it has been proven to be a good auxiliary task to many Natural Language Understanding (NLU) tasks. | ||
| automatically | 94 | |
| 2021.splurobonlp-1.3 We manually annotate the ***** automatically ***** extracted trigger and zoomer pairs to verify which zoomers require their trigger. | ||
| 2021.naacl-main.274 However, as the input for coreference resolution typically comes from upstream components in the information extraction pipeline, the ***** automatically ***** extracted symbolic features can be noisy and contain errors. | ||
| N19-1091 Our system aims at ***** automatically ***** transforming neutral customer care responses into courteous replies. | ||
| L16-1383 We investigate the quality of the ***** automatically ***** constructed links and identify two main classes of errors. | ||
| 2013.iwslt-papers.5 Given a set of features, we aim at ***** automatically ***** extracting the variables that better explain translation quality, and use them to predict the quality score | ||
| entities | 94 | |
| 2020.acl-main.3 Our model first learns the general pattern of slot ***** entities ***** by detecting whether the tokens are slot ***** entities ***** or not. | ||
| 2021.naacl-industry.7 This can be particularly challenging when ***** entities ***** are in the tenth of millions, as is the case of e.g. music catalogs. | ||
| 2020.emnlp-main.577 Therefore supervised approaches to both graph-to-text generation and text-to-graph knowledge extraction (semantic parsing) will always suffer from a shortage of domain-specific parallel graph-text data; at the same time, adapting a model trained on a different domain is often impossible due to little or no overlap in ***** entities ***** and relations. | ||
| C18-1224 Here, we focus on automatically extracting information about (1) the events that typically bring about certain ***** entities ***** (origins), (2) the events that are the typical functions of ***** entities *****, and (3) part-whole relationships in ***** entities *****. | ||
| S18-1007 For this task, the first two seasons of the popular TV show Friends are annotated, comprising a total of 448 dialogues, 15,709 mentions, and 401 ***** entities ***** | ||
| documents | 94 | |
| L10-1362 Question answering (QA) systems aim at retrieving precise information from a large collection of ***** documents *****. | ||
| L06-1015 That is, the retrieved ***** documents ***** from both systems are shown to the judges without any information about the search techniques. | ||
| L12-1249 This paper addresses the described challenge of phrase extraction from ***** documents ***** in different domains and languages and proposes an approach, which does not use comprehensive lexica and therefore can be easily transferred to new domains and languages. | ||
| 2020.coling-main.16 Experiments show that our framework using sentiment-related discourse augmentations for sentiment prediction enhances the overall performance for long ***** documents *****, even beyond previous approaches using well-established discourse parsers trained on human annotated data. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph neural networks to capture the inter-sentential relationship (e.g., the discourse graph) within the ***** documents ***** to learn contextual sentence embedding. | ||
| approach | 94 | |
| 2021.naacl-main.15 Over the years, many different filtering ***** approach *****es have been proposed. | ||
| 2020.coling-main.278 Our proposed LaAP-Net outperforms existing ***** approach *****es on three benchmark datasets for the text VQA task by a noticeable margin. | ||
| S18-1073 This paper describes our ***** approach ***** to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. | ||
| 2021.naacl-main.269 We integrate our ***** approach ***** into a self-training framework for boosting performance. | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual com- position), and a refinement of the verb classifications in (Levin, 1993); we also expect our ***** approach ***** to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. | ||
| artificial intelligence | 94 | |
| D19-1215 A long-term goal of ***** artificial intelligence ***** is to have an agent execute commands communicated through natural language. | ||
| W19-0419 Learning to follow human instructions is a long-pursued goal in ***** artificial intelligence *****. | ||
| P19-1159 While the study of bias in ***** artificial intelligence ***** is not new, methods to mitigate gender bias in NLP are relatively nascent. | ||
| 2020.nlposs-1.11 Despite the recent advances in applying language-independent approaches to various natural language processing tasks thanks to ***** artificial intelligence *****, some language-specific tools are still essential to process a language in a viable manner. | ||
| W16-4404 There are some open domain question answering systems, such as IBM Waston, which take the unstructured text data as input, in some ways of humanlike thinking process and a mode of ***** artificial intelligence *****. | ||
| visual dialog | 94 | |
| 2020.emnlp-main.269 Without the need of pretraining on external vision-language data, our model yields new state of the art, achieving the top position in both single-model and ensemble settings (74.54 and 75.35 NDCG scores) on the ***** visual dialog ***** leaderboard. | ||
| N19-1266 In this paper, we propose a novel Adversarial Multi-modal Feature Encoding (AMFE) framework for effective and robust auxiliary training of ***** visual dialog ***** systems. | ||
| P18-3005 More recent work has further extended the scope of this area to combine videos and language, learning to solve non-visual tasks using visual cues, visual question answering, and ***** visual dialog *****. | ||
| N19-1058 This is the first analysis of its kind for ***** visual dialog ***** models that was not possible without this dataset. | ||
| 2020.winlp-1.9 An interesting challenge for situated dialogue systems is referential *****visual dialog*****: by asking questions, the system has to identify the referent to which the user refers. | ||
| generative adversarial network | 94 | |
| 2020.findings-emnlp.218 We build our model based on the conditional ***** generative adversarial network *****, and propose to incorporate a simple yet effective diversity loss term into the model in order to improve the diversity of outputs. | ||
| D18-1387 In particular, we investigate context-aware and context-agnostic models for predicting vague words, and explore auxiliary-classifier ***** generative adversarial network *****s for characterizing sentence vagueness. | ||
| 2021.naacl-industry.30 We propose OodGAN, a sequential ***** generative adversarial network ***** (SeqGAN) based model for OOD data generation. | ||
| N18-1133 Inspired by ***** generative adversarial network *****s (GANs), we use one knowledge graph embedding model as a negative sample generator to assist the training of our desired model, which acts as the discriminator in GANs. | ||
| 2021.eacl-srw.23 Although several *****generative adversarial networks***** (GANs) have been proposed thus far, these models still suffer from mode-collapsing if the models are not pre-trained. | ||
| continual learning | 94 | |
| P19-1350 We test what impact task difficulty has on ***** continual learning *****, and whether the order in which a child acquires question types facilitates computational models. | ||
| 2021.emnlp-main.590 We also suggest that the upper bound performance of ***** continual learning ***** should be equivalent to multitask learning when data from all domains is available at once. | ||
| 2020.emnlp-main.237 The experimental results show that DiCGRL could effectively alleviate the catastrophic forgetting problem and outperform state-of-the-art ***** continual learning ***** models. | ||
| 2020.findings-emnlp.310 To better fit real-life applications where new data come in a stream, we study NLG in a “***** continual learning *****” setting to expand its knowledge to new domains or functionalities incrementally. | ||
| 2021.bionlp-1.3 We validate our proposed few-shot learning approach on multiple biomedical relatedness benchmarks, and show that it allows for ***** continual learning *****, where we accumulate information from various conceptual hierarchies to consistently improve encoder performance. | ||
| multilingual neural machine translation | 94 | |
| 2021.acl-short.103 While adapter tuning was investigated for *****multilingual neural machine translation*****, this paper proposes a comprehensive analysis of adapters for multilingual speech translation (ST). | ||
| 2018.iwslt-1.24 *****Multilingual neural machine translation***** (M-NMT) has recently been shown to improve performance of machine translation of low-resource languages. | ||
| 2021.acl-long.25 *****Multilingual neural machine translation***** aims at learning a single translation model for multiple languages. | ||
| W18-6327 In *****multilingual neural machine translation*****, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained models. | ||
| 2020.acl-main.150 *****Multilingual neural machine translation***** (NMT) has led to impressive accuracy improvements in low-resource scenarios by sharing common linguistic information across languages. | ||
| word2vec | 93 | |
| D17-1198 A compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in ***** word2vec ***** to learn word embeddings. | ||
| 2020.coling-main.608 The popular continuous bag-of-words (CBOW) model of ***** word2vec ***** learns a vector embedding by masking a given word in a sentence and then using the other words as a context to predict it. | ||
| N18-1042 The key approach of popular models such as ***** word2vec ***** and GloVe is to learn dense vector representations from the context of words. | ||
| 2021.latechclfl-1.13 We examined ***** word2vec ***** models generated from two historical Portuguese corpora in these four test sets. | ||
| W18-3001 In the present article, we investigate whether LSA and ***** word2vec ***** capacity to identify relevant semantic relations increases with corpus size | ||
| media | 93 | |
| L14-1283 The creation of large-scale multi***** media ***** datasets has become a scientific matter in itself. | ||
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, unprocessed, social ***** media ***** text. | ||
| N18-4018 While some work has been done on code-mixed social ***** media ***** text and in emotion prediction separately, our work is the first attempt which aims at identifying the emotion associated with Hindi-English code-mixed social ***** media ***** text. | ||
| 2020.lrec-1.159 Our goal is to compare domain experts to crowd workers and also to prove that ***** media ***** bias can be detected automatically. | ||
| C16-1177 According to the hearer's common sense knowledge and his comprehension of the preceding text, a discourse entity could be old, ***** media *****ted or new. | ||
| headline generation | 93 | |
| N19-1262 Using this setup, we can successfully adapt a model trained on small data of 40k samples for a ***** headline generation ***** task to a general text compression dataset at an acceptable compression quality with just 500 sampled instances annotated by a human. | ||
| C18-1148 In such cases, we cannot use paired supervised data, e.g., pairs of articles and headlines, to learn a ***** headline generation ***** model. | ||
| 2020.acl-main.123 Experimental results demonstrate that the ***** headline generation ***** model trained on filtered supervision data shows no clear difference in ROUGE scores but remarkable improvements in automatic and manual evaluations of the generated headlines. | ||
| 2021.emnlp-main.32 On several summarization and ***** headline generation ***** datasets, GenPET gives consistent improvements over strong baselines in few-shot settings. | ||
| 2021.emnlp-main.335 This paper explores a variant of automatic ***** headline generation ***** methods, where a generated headline is required to include a given phrase such as a company or a product name. | ||
| conversational question | 93 | |
| W19-5914 We present a spoken ***** conversational question ***** answering proof of concept that is able to answer questions about general knowledge from Wikidata. | ||
| 2020.scai-1.2 We introduce a simple framework that enables an automated analysis of the ***** conversational question ***** answering (QA) performance using question rewrites, and present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets. | ||
| 2021.eacl-main.72 This paper addresses the task of (complex) ***** conversational question ***** answering over a knowledge graph. | ||
| Q19-1016 We analyze CoQA in depth and show that ***** conversational question *****s have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning). | ||
| D19-5809 This paper is the first to present a framework for ***** conversational question ***** generation that is unaware of the corresponding answers. | ||
| dialogue policy | 93 | |
| 2020.emnlp-main.278 We introduce a framework of Monte Carlo Tree Search with Double-q Dueling network (MCTS-DDU) for task-completion *****dialogue policy***** learning. | ||
| P19-3013 To this end, we use a generic and simple frame-slots data-structure with pre-defined *****dialogue policies***** that allows for fast design and implementation at the price of some flexibility reduction. | ||
| 2021.sigdial-1.42 Recently, principal reward components for *****dialogue policy***** reinforcement learning have used task success and user satisfaction independently, yet neither has the resulting learned behaviour been analysed nor has a suitable analysis method existed. | ||
| 2021.sigdial-1.47 *****Dialogue policy***** optimisation via reinforcement learning requires a large number of training interactions, which makes learning with real users time consuming and expensive. | ||
| 2021.emnlp-main.354 Deep reinforcement learning has shown great potential in training *****dialogue policies*****. | ||
| SOTA | 92 | |
| 2020.acl-main.45 Notably, we are able to achieve ***** SOTA ***** results on CTB5, CTB6 and UD1.4 for the part of speech tagging task; ***** SOTA ***** results on CoNLL03, OntoNotes5.0, MSRA and OntoNotes4.0 for the named entity recognition task; along with competitive results on the tasks of machine reading comprehension and paraphrase identification. | ||
| 2021.emnlp-main.192 Compared with existing ***** SOTA ***** SSL methods on TextCNN, FLiText improves the accuracy of lightweight model TextCNN from 51.00% to 90.49% on IMDb, 39.8% to 58.06% on Yelp-5, and from 55.3% to 65.08% on Yahoo! | ||
| 2021.emnlp-main.348 Experiments on an educational gold-standard set and a large-scale generated MWP set show that our approach is superior on the MWP generation task, and it outperforms the ***** SOTA ***** models in terms of both automatic evaluation metrics, i.e., BLEU-4, ROUGE-L, Self-BLEU, and human evaluation metrics, i.e., equation relevance, topic relevance, and language coherence. | ||
| 2021.naacl-main.130 ***** SOTA ***** models do not make use of hierarchical representations of discourse structure. | ||
| 2021.acl-demo.12 The joint-model is trained and evaluated on 13 corpora of four tasks, yielding near state-of-the-art (***** SOTA *****) performance in dependency parsing and NER, achieving ***** SOTA ***** performance in CWS and POS | ||
| opinion mining | 92 | |
| 2016.lilt-14.7 Moreover, it can be a disruptive factor in sentiment analysis and ***** opinion mining *****, because it changes the polarity of a message implicitly. | ||
| S18-1075 ***** opinion mining *****, sentiment detection) and theoretical purposes (e.g. | ||
| 2018.jeptalnrecital-recital.2 In this paper, we propose a classification of inferences used in Chinese in tourist comments, for an ***** opinion mining ***** task, based on three levels of analysis (semantic realization, modality of realization and production mode). | ||
| 2020.sigdial-1.23 Emotion recognition in conversation (ERC) is an important topic for developing empathetic machines in a variety of areas including social ***** opinion mining *****, health-care and so on. | ||
| L10-1531 In this work we present SENTIWORDNET 3.0, a lexical resource explicitly devised for supporting sentiment classification and ***** opinion mining ***** applications. | ||
| recognition | 92 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity ***** recognition *****. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern ***** recognition ***** algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. | ||
| 2020.emnlp-main.355 We applied ALICE in two visual ***** recognition ***** tasks, bird species classification and social relationship classification. | ||
| 2021.ranlp-1.110 Potential predictors of speech intelligibility are validated with human performance in spoken cognate ***** recognition ***** experiments for Bulgarian and Russian. | ||
| 2020.coling-main.314 We introduce dual-decoder Transformer, a new model architecture that jointly performs automatic speech ***** recognition ***** (ASR) and multilingual speech translation (ST). | ||
| BiLSTM-CRF | 92 | |
| 2020.wnut-1.35 In the first phase, we experiment with various contextualised word embeddings (like Flair, BERT-based) and a *****BiLSTM-CRF***** model to arrive at the best-performing architecture. | ||
| 2021.semeval-1.113 We use the *****BiLSTM-CRF***** model combining with ToxicBERT Classification to train the detection model for identifying toxic words in posts. | ||
| 2020.lrec-1.568 Our approach relies on *****BiLSTM-CRF***** neural networks (a widely used type of network for this area of research) that use vector and tensor embedding representations. | ||
| 2021.eacl-main.325 We show that *****BiLSTM-CRF***** models with syllable embeddings outperform a CRF baseline and different BERT-based approaches. | ||
| 2020.textgraphs-1.3 Most classical approaches use a sequence-based model (typically *****BiLSTM-CRF***** framework) without considering document structure. | ||
| Malayalam | 91 | |
| 2021.ltedi-1.8 This paper reports on the shared task of hope speech detection for Tamil, English, and ***** Malayalam ***** languages. | ||
| 2021.dravidianlangtech-1.15 As a part of this shared task, we organized four sub-tasks corresponding to machine translation of the following language pairs: English to Tamil, English to ***** Malayalam *****, English to Telugu and Tamil to Telugu which are available at https://competitions.codalab.org/competitions/27650. | ||
| 2021.ltedi-1.25 Results indicate that XLM-R outdoes all other techniques by gaining a weighted f_1-score of 0.93, 0.60 and 0.85 respectively for English, Tamil and ***** Malayalam ***** language. | ||
| R19-1072 Evaluation on ***** Malayalam ***** Wikipedia data shows that our approach is correct and the results, though not as good as for Tamil, are comparable | ||
| 2021.vardial-1.14 The DLI training set includes 16,674 YouTube comments written in Roman script containing code-mixed text with English and one of the three South Dravidian languages: Kannada, *****Malayalam*****, and Tamil. | ||
| corresponding | 91 | |
| 2020.pam-1.13 Each frame provides a set of roles ***** corresponding ***** to the situation participants, e.g. Buyer and Goods, and lexical units (LUs) – words and phrases that can evoke this particular frame in texts, e.g. Sell. | ||
| 2016.amta-researchers.3 Proposals from TMs could be made more useful by using techniques such as fuzzy-match repair (FMR) which modify words in the target segment ***** corresponding ***** to mismatches identified in the source segment. | ||
| S17-2024 Given a cross-lingual task, we trained models ***** corresponding ***** to its two languages and combined the models by averaging the similarity scores. | ||
| 2021.emnlp-main.335 Previous methods using Transformer-based models generate a headline including a given phrase by providing the encoder with additional information ***** corresponding ***** to the given phrase. | ||
| L06-1105 BACO uses a generic relational database engine to store 1.5 million web documents in raw text (more than 6GB of plain text), ***** corresponding ***** to 35 million sentences, consisting of more than 1000 million words | ||
| analyzer | 91 | |
| 2020.emnlp-main.306 Moreover, on observing that the best OpenIE systems falter at handling coordination structures, our OpenIE system also incorporates a new coordination ***** analyzer ***** built with the same IGL architecture. | ||
| N18-1130 Furthermore, we show that our model learns to exploit morphological knowledge encoded in the ***** analyzer *****, and, as a byproduct, it can perform effective unsupervised morphological disambiguation. | ||
| L10-1231 For morphological annotation we used the already existing ***** analyzer ***** and manually disambiguated the results. | ||
| L10-1068 Along with the description of how the ***** analyzer ***** is implemented, this paper provides an evaluation of the ***** analyzer ***** on two large corpora | ||
| R19-1156 In this paper, we present a two-level morphological ***** analyzer ***** for Turkish. | ||
| schemas | 91 | |
| 2021.naacl-main.441 By ignoring names of semantic items in databases, abstract ***** schemas ***** are exploited in a well-designed graph projection neural network to obtain delexicalized representation of question and schema. | ||
| L06-1302 In this paper, we describe the role and the use of WORDNET as an external lexical resource in a methodology for matching hierarchical classification ***** schemas *****. | ||
| C18-1311 We also develop a method for evaluating the similarity between sets of narrative ***** schemas *****, and thus the stability of the schema induction algorithms. | ||
| 2021.emnlp-main.422 Extrinsic evaluation on schema-guided future event prediction further demonstrates the predictive power of our event graph model, significantly outperforming human ***** schemas ***** and baselines by more than 17.8% on HITS@1. | ||
| L14-1524 The approach we present here is easily scalable to any number of sources and ***** schemas ***** | ||
| psycholinguistic | 91 | |
| E17-4006 The importance the model gives to the preferences is in line with ***** psycholinguistic ***** studies. | ||
| E17-1069 To find out how users' social media behaviour and language are related to their ethical practices, the paper investigates applying Schwartz' ***** psycholinguistic ***** model of societal sentiment to social media text. | ||
| 2021.semeval-1.90 This paper presents the system created to assess single words lexical complexity, combining linguistic and ***** psycholinguistic ***** variables in a set of experiments involving random forest and XGboost regressors. | ||
| 2020.acl-demos.10 However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled ***** psycholinguistic ***** experiments, and the technical proficiency needed to train and deploy large-scale language models. | ||
| 2020.coling-main.508 Besides their role for ***** psycholinguistic ***** investigation (why do we employ different coreference strategies when we write or speak) and for the placement of Twitter in the spoken–written continuum, we see our results as a contribution to approaching genre-/media-specific coreference resolution | ||
| AMR | 91 | |
| 2020.findings-emnlp.199 Following previous findings on the importance of reentrancies for ***** AMR *****, we empirically find and discuss several linguistic phenomena responsible for reentrancies in ***** AMR *****, some of which have not received attention before. | ||
| W19-4028 In this context, this paper presents an effort to build a general purpose ***** AMR *****-annotated corpus for Brazilian Portuguese by translating and adapting ***** AMR ***** English guidelines. | ||
| 2020.dmr-1.2 By extending ***** AMR ***** with indices for contexts and formulating constraints on these contexts, a formalism is derived that makes correct predictions for inferences involving negation and bound variables. | ||
| D17-1129 This paper proposes to tackle the AMR parsing bottleneck by improving two components of an *****AMR***** parser: concept identification and alignment. | ||
| D17-1130 We present a transition-based AMR parser that directly generates *****AMR***** parses from plain text. | ||
| sarcasm | 91 | |
| 2021.rocling-1.35 The results show that local features that affect the overall sentential sentiment confuse the model: multiple target entities, transitional words, ***** sarcasm *****, and rhetorical questions. | ||
| 2020.acl-main.349 In multimodal context, ***** sarcasm ***** is no longer a pure linguistic phenomenon, and due to the nature of social media short text, the opposite is more often manifested via cross-modality expressions. | ||
| W19-1309 The statistical ML classifier uncovers the indicators i.e., features of such ***** sarcasm *****. | ||
| 2020.figlang-1.10 With the immense growth of social media, ***** sarcasm ***** analysis helps to avoid insults, hurt and humour affecting someone. | ||
| 2020.acl-main.96 In particular, we encode various switching features to improve humour, ***** sarcasm ***** and hate speech detection tasks | ||
| dialog systems | 91 | |
| L12-1156 In this paper, we present the acquisition and labeling processes of the EDECAN-SPORTS corpus, which is a corpus that is oriented to the development of multimodal ***** dialog systems ***** acquired in Spanish and Catalan. | ||
| D18-1077 The main goal of this paper is to develop out-of-domain (OOD) detection for ***** dialog systems *****. | ||
| 2021.nlp4convai-1.26 This increase in usage of code-mixed language has prompted ***** dialog systems ***** in a similar language. | ||
| P17-1062 HCNs attain state-of-the-art performance on the bAbI dialog dataset (Bordes and Weston, 2016), and outperform two commercially deployed customer-facing ***** dialog systems ***** at our company. | ||
| D19-1162 Dependency parsing of conversational input can play an important role in language understanding for ***** dialog systems ***** by identifying the relationships between entities extracted from user utterances. | ||
| content | 91 | |
| 2021.emnlp-main.200 These directed subgraphs are considered to well preserve extra but relevant ***** content ***** to the short input text, and then they are decoded by the employed pre-trained model to generate coherent long text. | ||
| N18-1137 Experimental results demonstrate that models trained with ***** content *****-specific objectives improve upon a vanilla encoder-decoder which solely relies on soft attention. | ||
| 2020.coling-main.197 This assumes that it is possible to separate style from ***** content *****. | ||
| W17-5230 The first method uses ***** content *****-based features (hashtags, emoticons, elongated words, etc.). | ||
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and ***** content ***** in one go without explicit feature extraction. | ||
| graph attention network | 91 | |
| 2020.acl-main.295 Extensive experiments are conducted on the SemEval 2014 and Twitter datasets, and the experimental results confirm that the connections between aspects and opinion words can be better established with our approach, and the performance of the ***** graph attention network ***** (GAT) is significantly improved as a consequence. | ||
| 2020.emnlp-main.597 In particular, the state-of-the-art method considers self- and inter-speaker dependencies in conversations by using relational ***** graph attention network *****s (RGAT). | ||
| 2021.wnut-1.12 We use ***** graph attention network *****s (GAT) over users and tweets in a conversation thread, combined with dense user history representations. | ||
| 2021.naacl-main.6 We then propose a ***** graph attention network *****-based approach to propagate temporal information over document-level event graphs constructed by shared entity arguments and temporal relations. | ||
| 2021.acl-long.326 To reduce the computational cost in graph learning, we further propose a novel flow ***** graph attention network ***** (GAT) that only transmits messages between neighboring parties in the tripartite graph. | ||
| transfer learning approach | 91 | |
| 2020.calcs-1.6 In a second step, two systems using cross-lingual embeddings were researched, being (1) a supervised classifier and (2) a ***** transfer learning approach ***** trained on English sentiment data and evaluated on code-mixed data. | ||
| W19-5406 Our submissions build upon the recent OpenKiwi framework: We combine linear, neural, and predictor-estimator systems with new ***** transfer learning approach *****es using BERT and XLM pre-trained models. | ||
| 2021.naacl-main.319 In addition, we propose a novel few-shot ***** transfer learning approach ***** that ensures better transferability for very small sample sizes. | ||
| 2020.socialnlp-1.8 Two prevalent ***** transfer learning approach *****es are used in recent works to improve neural networks performance for domains with small amounts of annotated data: Multi-task learning which involves training the task of interest with related auxiliary tasks to exploit their underlying similarities, and Mono-task fine-tuning, where the weights of the model are initialized with the pretrained weights of a large-scale labeled source domain and then fine-tuned with labeled data of the target domain (domain of interest). | ||
| D19-1153 A typical cross-lingual ***** transfer learning approach ***** boosting model performance on a language is to pre-train the model on all available supervised data from another language. | ||
| chatbot | 90 | |
| 2020.lrec-1.90 The study about the realization of PuffBot, an intelligent ***** chatbot ***** to support and monitor people suffering from asthma, shows how this type of technique could be an important piece in the development of future ***** chatbot *****s. | ||
| P18-1137 For example, customer service requires the generated responses to be specific and accurate, while ***** chatbot ***** prefers diverse responses so as to attract different users. | ||
| 2020.law-1.14 The dataset annotated according to this scheme is currently used to develop the prototype of a rule-based Natural Language Generation system aimed at improving the ***** chatbot ***** responses and the customer experience overall. | ||
| 2021.naacl-main.123 Our framework includes a guiding ***** chatbot ***** and an interlocutor model that plays the role of humans. | ||
| 2021.acl-short.130 With the explosion of *****chatbot***** applications, Conversational Question Answering (CQA) has generated a lot of interest in recent years. | ||
| Opinion | 90 | |
| 2018.jeptalnrecital-recital.2 Analysis of Inferences in Chinese for ***** Opinion ***** Mining. ***** Opinion ***** mining is an essential activity for economic watch, made easier by social networks and ad hoc forums. | ||
| 2020.emnlp-main.337 *****Opinion***** summarization is the automatic creation of text reflecting subjective information expressed in multiple documents, such as user reviews of a product. | ||
| 2021.eacl-main.229 *****Opinion***** summarization is the task of automatically generating summaries for a set of reviews about a specific target (e.g., a movie or a product). | ||
| L16-1041 *****Opinion***** Mining is a topic which attracted a lot of interest in the last years. | ||
| 2021.emnlp-main.743 *****Opinion***** summarization has been traditionally approached with unsupervised, weakly-supervised and few-shot learning techniques. | ||
| features | 90 | |
| S19-2038 We utilize different word embeddings to empirically select the most suited embedding to represent our ***** features *****. | ||
| Q18-1007 Experiments show that our ***** features ***** are more effective in scoring specific aspects of narrative quality than a state-of-the-art feature set. | ||
| L16-1312 We use supervised machine learning techniques over all our ***** features ***** and compare our recommendation results to those produced by a popular similar artist recommendation website. | ||
| W18-2304 We employ deep learning architecture, i.e. Long Short-Term Memory, and leverage word embeddings, medical concepts from a knowledge base, and linguistic components as our ***** features *****. | ||
| W19-8706 Our results indicate lack of direct association between translationese and quality in our data: while our ***** features ***** distinguish translations and non-translations with the near perfect accuracy, the performance of the same algorithm on the quality classes barely exceeds the chance level | ||
| gender | 90 | |
| 2021.emnlp-main.123 Furthermore, these incorrectly ***** gender *****ed translations have the potential to reflect or amplify social biases. | ||
| W17-1606 Speakers' dialect and ***** gender ***** was controlled for by using videos uploaded as part of the “accent tag challenge”, where speakers explicitly identify their language background. | ||
| N19-1061 Several recent works tackle this problem, and propose methods for significantly reducing this ***** gender ***** bias in word embeddings, demonstrating convincing results. | ||
| W19-3812 In this work, contribution of transfer learning technique to pronoun resolution systems is investigated and the ***** gender ***** bias contained in classification models is evaluated. | ||
| 2020.findings-emnlp.280 Our structure enables the separation of the semantic latent information and ***** gender ***** latent information of given word into the disjoint latent dimensions. | ||
| ner dataset | 90 | |
| 2021.eacl-demos.7 We show the potential of the library by compiling nine public *****NER datasets***** into a unified format and evaluating the cross-domain and cross- lingual performance across the datasets. | ||
| 2021.emnlp-demo.32 Besides some basic features for crowd annotation like fast tagging and data management, CroAno provides a systematic solution for improving label consistency of Chinese *****NER dataset*****. | ||
| 2020.ccl-1.86 Experimental results on three widely used Chinese *****NER datasets***** demonstrate that our proposed model significantly outperforms other state-of-the-art methods. | ||
| 2021.emnlp-main.424 We also provide an expert-labeled, chemistry *****NER dataset***** with 62 fine-grained chemistry types (e.g., chemical compounds and chemical reactions). | ||
| 2021.naacl-main.118 To comprehensively evaluate our approaches, we create 3 large *****NER datasets***** (24M tokens) reflecting current challenges. | ||
| experimentally | 89 | |
| N18-1158 We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate ***** experimentally ***** that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans. | ||
| 2021.acl-long.36 We jointly train the probes for multiple tasks and ***** experimentally ***** show that lexical and syntactic information is separated in the representations. | ||
| 2010.iwslt-papers.12 Using statistical MT systems for the 11 different languages of Europarl, we show ***** experimentally ***** that a direct translation system can be replaced by this pivot approach without a loss in translation quality if about six pivot languages are available. | ||
| 2021.naacl-main.87 We not only provide a rigorous analytic derivation of the certified condition but also ***** experimentally ***** compare the utility of WordDP with existing defense algorithms. | ||
| 2021.acl-long.558 We ***** experimentally ***** implement 154 systems on 11 datasets, covering three languages, comprehensive results show the effectiveness of span prediction models that both serve as base NER systems and system combiners | ||
| metaphor | 89 | |
| W18-0910 Despite the variety of approaches that are trying to process ***** metaphor *****, there is still a need for better models that mimic the human cognition while exploiting fewer resources. | ||
| 2020.figlang-1.32 In one case, out-of-domain data manually annotated for ***** metaphor ***** is used for the auxiliary task; in the other case, in-domain data automatically annotated for idioms is used for the auxiliary task. | ||
| P16-5008 The tutorial is geared to researchers and practitioners of language technology, not necessarily experts in ***** metaphor ***** analysis or knowledgeable about either FrameNet or MetaNet, but who are interested in natural language processing tasks that involve automatic ***** metaphor ***** processing, or could benefit from exposure to tools and resources that support frame-based deep semantic, analyses of language, including ***** metaphor ***** as a widespread phenomenon in human language. | ||
| 2020.figlang-1.27 In this paper we present a novel resource-inexpensive architecture for ***** metaphor ***** detection based on a residual bidirectional long short-term memory and conditional random fields. | ||
| P18-2024 We therefore construct a significant new corpus on ***** metaphor *****, with 5,605 manually annotated sentences in Chinese | ||
| referring expressions | 89 | |
| W17-3522 Using the furniture stimuli set developed for the TUNA and D-TUNA corpora, our corpus extends on these corpora by providing data collected in a simulated driving dual-task setting, and additionally provides exact duration annotations for the spoken ***** referring expressions *****. | ||
| K19-1040 As a result, past ***** referring expressions ***** for objects can provide strong signals for grounding subsequent ***** referring expressions *****. | ||
| L12-1025 As an alternative, and perhaps less traditional approach, we also use surface information to build statistical language models of the ***** referring expressions ***** that are most likely to occur in the corpus, and let the model probabilities guide attribute selection. | ||
| 2021.codi-main.5 The diversity of coreference chains is usually tackled by means of global features (length, types and number of ***** referring expressions *****, distance between them, etc.). | ||
| L14-1404 Third, every text in the corpus has been annotated for 14 layers of syntax and semantics, including: ***** referring expressions ***** and co-reference; events, time expressions, and temporal relationships; semantic roles; and word senses. | ||
| textual similarity | 89 | |
| P17-2099 Such an unsupervised representation is empirically validated via semantic ***** textual similarity ***** tasks on 19 different datasets, where it outperforms the sophisticated neural network models, including skip-thought vectors, by 15% on average. | ||
| W18-3012 This simple method even outperforms far more complex approaches such as LSTMs on ***** textual similarity ***** tasks. | ||
| 2020.findings-emnlp.39 Natural language inference (NLI) and semantic ***** textual similarity ***** (STS) are key tasks in natural language understanding (NLU). | ||
| 2020.inlg-1.45 Earlier research has shown that evaluation metrics based on ***** textual similarity ***** (e.g., BLEU, CIDEr, Meteor) do not correlate well with human evaluation scores for automatically generated text. | ||
| I17-4034 In this paper we present MappSent, a *****textual similarity***** approach that we applied to the multi-choice question answering in exams shared task. | ||
| multimodal machine translation | 89 | |
| W19-1808 Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and ***** multimodal machine translation *****. | ||
| 2019.iwslt-1.6 The architecture consists of an automatic speech recognition (ASR) system followed by a Transformer-based ***** multimodal machine translation ***** (MMT) system. | ||
| D19-6406 It is assumed that ***** multimodal machine translation ***** systems are better than text-only systems at translating phrases that have a direct correspondence in the image. | ||
| W18-6402 We present the results from the third shared task on ***** multimodal machine translation *****. | ||
| 2021.acl-long.503 BERTGen is auto-regressively trained for language generation tasks, namely image captioning, machine translation and ***** multimodal machine translation *****, under a multi-task setting. | ||
| abstractive text summarization | 89 | |
| R19-1146 In this paper we describe how an ***** abstractive text summarization ***** method improved the informativeness of automatic summaries by integrating syntactic text simplification, subject-verb-object concept frequency scoring and a set of rules that transform text into its semantic representation. | ||
| 2020.lrec-1.222 Recently, generative language models have shown promise in ***** abstractive text summarization ***** tasks. | ||
| 2020.nlpbt-1.7 Prior work on multimodal ***** abstractive text summarization ***** only utilized information from the text and video modalities. | ||
| N19-4012 Neural ***** abstractive text summarization ***** (NATS) has received a lot of attention in the past few years from both industry and academia. | ||
| P17-1099 Neural sequence-to-sequence models have provided a viable new approach for ***** abstractive text summarization ***** (meaning they are not restricted to simply selecting and rearranging passages from the original text). | ||
| suggestion mining | 89 | |
| S19-2211 The task consists of two subtasks: ***** suggestion mining ***** under single-domain (Subtask A) and cross-domain (Subtask B) settings. | ||
| S19-2225 Tri-training proved to be an effective technique to accommodate domain shift for cross-domain ***** suggestion mining ***** (Subtask B) where there is no hand labelled training data. | ||
| S19-2221 We present a system for cross-domain ***** suggestion mining *****, prepared for the SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums (Subtask B). | ||
| S19-2214 We participated in both subtasks for domain specific and also cross-domain ***** suggestion mining *****. | ||
| S19-2212 In this paper, we describe a ***** suggestion mining ***** system that participated in SemEval 2019 Task 9, SubTask A - Suggestion Mining from Online Reviews and Forums. | ||
| Consequently | 88 | |
| P18-2005 ***** Consequently *****, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model performance for different user groups, as well as privacy implications. | ||
| 2020.lrec-1.117 ***** Consequently *****, besides significant phenomena from the perspective of diachronic linguistics, this treebank also poses several challenging technical issues for the current and future syntactic annotation of Latin in the UD framework. | ||
| 2020.emnlp-main.678 ***** Consequently *****, while the methods proposed in literature perform well for generic date-time extraction from texts, they don't fare as well on task specific date-time entity extraction where only a subset of the date-time entities present in the text are pertinent to solving the task. | ||
| C16-1008 ***** Consequently *****, past word occurrence can contribute to estimation of the number of current patients. | ||
| 2021.acl-long.466 ***** Consequently *****, we propose to utilize two auxiliary tasks, Number Ranking (NR) and Importance Ranking (IR), to supervise the encoder to capture the different relations | ||
| intrinsic | 88 | |
| S19-1006 Our results support representation transfer as a scalable approach for modular cross-lingual alignment of neural sentence embeddings, where we observe better performance compared to joint models in ***** intrinsic ***** and extrinsic evaluations, particularly with smaller sets of parallel data. | ||
| D19-1462 On this dataset, ***** intrinsic ***** evaluations on the resolution of ellipsis and co-reference show that the GECOR model significantly outperforms the sequence-to-sequence (seq2seq) baseline model in terms of EM, BLEU and F1 while extrinsic evaluations on the downstream dialogue task demonstrate that our multi-task learning framework with GECOR achieves a higher success rate of task completion than TSCP, a state-of-the-art end-to-end task-oriented dialogue model. | ||
| 2021.insights-1.1 We show that after correcting a bug in the CBOW gradient update, one can learn CBOW word embeddings that are fully competitive with SG on various ***** intrinsic ***** and extrinsic tasks, while being many times faster to train. | ||
| 2020.lrec-1.31 Our results suggest that the optimal number of repetitions in crowdsourcing setups, in which any additional repetitions do no longer cause an adequate increase of overall correlation coefficients, lies between seven and nine for ***** intrinsic ***** and extrinsic quality factors. | ||
| N18-1148 In this paper, we identify and differentiate between two relevant data generating scenarios (***** intrinsic ***** vs. extrinsic labels), introduce a simple but novel method which emphasizes the importance of calibration, and then analyze and experimentally validate the appropriateness of various methods for each of the two scenarios | ||
| collocation | 88 | |
| L10-1520 In this paper, we present a fine-grained three-dimensional typology of ***** collocation ***** errors that has been derived in an empirical study from the learner corpus CEDEL2 compiled by a team at the Autonomous University of Madrid. | ||
| L12-1470 We approach ***** collocation ***** extraction as a classification problem where the task is to classify a given n-gram as either a ***** collocation ***** (positive) or a non-***** collocation ***** (negative). | ||
| 2020.mwe-1.1 Each ***** collocation ***** is enriched with information that facilitates its downstream exploitation in NLP tasks such as machine translation, word sense disambiguation, natural language generation, relation classification, and so forth. | ||
| L10-1612 Besides classical two-word ***** collocation *****s, we will focus on the case of complex ***** collocation *****s (3 words or more) for which a recursive design is presented in the form of ***** collocation ***** of ***** collocation *****s. | ||
| L08-1260 We present methodology and resources obtained in three main project phases which are: dictionary-based acquisition of ***** collocation ***** lexicon, feasibility study for corpus-based lexicon enlargement phase, corpus-based lexicon enlargement and ***** collocation ***** description | ||
| predictive | 88 | |
| P19-1625 However, most computational studies have examined only one or a handful of contextual factors ***** predictive ***** of switching. | ||
| N18-2096 We rely on this corpus to build ***** predictive ***** models to infer non-English languages that users speak exclusively from their English tweets. | ||
| 2020.coling-industry.15 Our approach outperforms a range of baselines and achieves a compression rate of 97.4% with less than 3.7% degradation in ***** predictive ***** performance. | ||
| P17-2102 In this work we build ***** predictive ***** models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news – satire, hoaxes, clickbait and propaganda. | ||
| W18-1209 The positive effect of adding subword information to word embeddings has been demonstrated for ***** predictive ***** models | ||
| concepts | 88 | |
| D19-1548 However, previous approaches only consider the relations between directly connected ***** concepts ***** while ignoring the rich structure in AMR graphs. | ||
| L12-1426 We describe then the process of term validation and construction of a glossary of terms of Italian Linguistics; afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: top classes of ontology are Concept (containing taxonomy of ***** concepts *****) and Terms (containing terms of the glossary as instances), while ***** concepts ***** are linked through part-whole and involved-role relation, both borrowed from Wordnet. | ||
| L06-1199 Most approaches are based on the assumption that verbs typically indicate semantic relations between ***** concepts *****. | ||
| C16-1313 In this paper, we propose a new approach to obtain the relationship between ***** concepts ***** by exploiting the syntactic dependencies between words in the image captions. | ||
| L16-1586 A typology of semantic relations between ***** concepts ***** is also proposed | ||
| combinatory categorial grammar | 88 | |
| W19-4005 We present the first open-source graphical annotation tool for ***** combinatory categorial grammar ***** (CCG), and the first set of detailed guidelines for syntactic annotation with CCG, for four languages: English, German, Italian, and Dutch. | ||
| 2020.emnlp-main.487 Supertagging is conventionally regarded as an important task for ***** combinatory categorial grammar ***** (CCG) parsing, where effective modeling of contextual information is highly important to this task. | ||
| 2010.amta-papers.23 In our experiments two kinds of supertags are employed: those from lexicalized tree-adjoining grammar (LTAG) and ***** combinatory categorial grammar ***** (CCG). | ||
| 2021.cl-1.2 This class of permutations is exactly the class that can be expressed in ***** combinatory categorial grammar *****s (CCGs). | ||
| D17-1071 For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts *****Combinatory Categorial Grammar***** (CCG) derivation trees into semantic representations and conducts natural deduction proofs. | ||
| supervised relation extraction | 88 | |
| 2021.naacl-main.2 We propose a multi-task, probabilistic approach to facilitate distantly ***** supervised relation extraction ***** by bringing closer the representations of sentences that contain the same Knowledge Base pairs. | ||
| 2020.coling-main.566 In recent years, distantly-***** supervised relation extraction ***** has achieved a certain success by using deep neural networks. | ||
| E17-2087 While it is natural to use both positive and negative training examples in ***** supervised relation extraction *****, the impact of positive examples on hypernym prediction was not studied so far. | ||
| 2020.findings-emnlp.113 We perform an extensive experimental study over multiple relation extraction benchmarks and demonstrate that RE-Flex outperforms competing un***** supervised relation extraction ***** methods based on pretrained language models by up to 27.8 F1 points compared to the next-best method. | ||
| 2021.emnlp-main.761 Distantly *****supervised relation extraction***** (RE) automatically aligns unstructured text with relation instances in a knowledge base (KB). | ||
| constraint | 87 | |
| 2021.acl-long.9 Pre-trained S2S models or a Copy Mechanism are trained to copy the surface tokens from encoders to decoders, but they cannot guarantee ***** constraint ***** satisfaction. | ||
| U18-1004 In this paper we propose an extensible and efficient framework for inducing relations via the use of ***** constraint ***** satisfaction. | ||
| P19-1172 Using an iterative discovery, ***** constraint *****, and training process, we build inflectional lexica in the target languages. | ||
| I17-2037 The major drawback of current approaches is that they look only at the similarity (***** constraint *****) between a question and a head, relation pair | ||
| 2011.freeopmt-1.3 Extensible Dependency Grammar (XDG; Debusmann, 2007) is a flexible, modular dependency grammar framework in which sentence analyses consist of multigraphs and processing takes the form of *****constraint***** satisfaction. | ||
| wide range | 87 | |
| S18-1105 The system takes as starting point emotIDM, an irony detection model that explores the use of affective features based on a ***** wide range ***** of lexical resources available for English, reflecting different facets of affect. | ||
| Q18-1030 In this paper, we present a sequence tagging framework and apply it to word segmentation for a ***** wide range ***** of languages with different writing systems and typological characteristics. | ||
| 2020.isa-1.3 The code and its analytic expansions represent a cross-linguistically ***** wide range ***** of phenomena of languages and language structures. | ||
| 2020.emnlp-tutorials.4 After establishing these foundations, we will cover a ***** wide range ***** of techniques for improving efficiency, including knowledge distillation, quantization, pruning, more efficient architectures, along with case studies and practical implementation tricks. | ||
| W16-4016 The corpus serves as a general resource for a ***** wide range ***** of research addressing natural conversation between humans in their full complexity. | ||
| event detection | 87 | |
| 2021.wnut-1.28 Furthermore, we show that our approach significantly outperforms ***** event detection ***** baselines, highlighting the importance of aggregating information across tweets for our task. | ||
| W17-5803 To date, various Twitter-based ***** event detection ***** systems have been proposed. | ||
| L06-1471 The development of this corpus was motivated by the need to have both metadata and syntactic structure annotated in order to support synergistic work on speech parsing and structural ***** event detection *****. | ||
| P19-1429 Current neural ***** event detection ***** approaches focus on trigger-centric representations, which work well on distilling discrimination knowledge, but poorly on learning generalization knowledge. | ||
| 2021.emnlp-main.26 Over the past decade, the field of natural language processing has developed a wide array of computational methods for reasoning about narrative, including summarization, commonsense inference, and ***** event detection *****. | ||
| attribution | 86 | |
| L16-1287 Previous studies have applied manual content analysis to this problem but in this paper we present novel work to automate the analysis of ***** attribution ***** bias through using machine learning algorithms. | ||
| 2021.acl-long.71 In this paper, we formally define the feature group ***** attribution ***** problem and outline a set of axioms that any intuitive feature group ***** attribution ***** method should satisfy. | ||
| 2021.blackboxnlp-1.39 On BERT-based models for passage reranking, we quantitatively demonstrate the framework's veracity in extracting ***** attribution ***** maps, from which we perform detailed, token-wise analysis about how predictions are made. | ||
| 2021.emnlp-main.645 Experiments for explanation faithfulness across five datasets, show that models trained with SaLoss consistently provide more faithful explanations across four different feature ***** attribution ***** methods compared to vanilla BERT | ||
| 2021.eval4nlp-1.18 Authorship ***** attribution ***** is the task of assigning an unknown document to an author from a set of candidates. | ||
| propose | 86 | |
| 2021.nlp4posimpact-1.3 In order to obtain a clearer view in this respect, we first ***** propose ***** a working definition of NLP4SG and identify some primary aspects that are crucial for NLP4SG, including, e.g., areas, ethics, privacy and bias. | ||
| N19-1292 Unlike previous works which are purely extractive or generative, we first ***** propose ***** a new multi-task learning framework that jointly learns an extractive model and a generative model. | ||
| 2020.coling-main.593 The transition-based systems in the past studies ***** propose ***** a series of actions, to build a right-heavy binarized tree for the RST parsing. | ||
| 2020.emnlp-main.524 To this end, we first ***** propose ***** a method to automatically construct a parallel corpus by transforming a large number of similes collected from Reddit to their literal counterpart using structured common sense knowledge. | ||
| 2021.emnlp-main.728 To resolve this unsatisfactory state of affairs we here ***** propose ***** a training scheme that learns a shared latent representation of emotion independent from different label formats, natural languages, and even disparate model architectures | ||
| policy gradient | 86 | |
| 2020.findings-emnlp.98 For unlabeled data, we leverage a self-critical *****policy gradient***** method with the difference between the scores obtained by Monte-Carlo sampling and greedy decoding as the reward function, while the scores are the negative K-L divergence between output distributions of original video data and augmented video data. | ||
| 2020.wnut-1.32 Unlike previous methods, we employ the discriminator output as penalization instead of using *****policy gradients*****, and we propose a global discriminator to avoid the Monte-Carlo search. | ||
| P19-1535 These assessments are integrated as a compound reward to guide the evolution of dialogue strategy via *****policy gradient*****. | ||
| W17-2603 After teacher forcing for standard maximum likelihood training, we fine-tune the model using *****policy gradient***** techniques to maximize several rewards that measure question quality. | ||
| N19-1360 Training the architecture using *****policy gradient***** leads to further improvements in performance, reaching a sequence-level accuracy of 88.7% on artificial data and 74.8% on real data. | ||
| distributions | 85 | |
| 2021.naacl-main.142 This issue is more pronounced as the imbalance occurs in both word and sense ***** distributions *****. | ||
| P19-2005 We propose to model reviewer biases from their review texts and rating ***** distributions *****, and learn a bias-aware opinion representation. | ||
| 2021.ranlp-1.79 We propose a novel WSD dataset and show that personalizing a WSD system with knowledge of an author's sense ***** distributions ***** or predominant senses can greatly increase its performance. | ||
| 2021.acl-long.391 Unfortunately, they often suffer from low accuracy because of the margin bias problem caused by the large difference between representation ***** distributions ***** of labels in SSTC. | ||
| 2021.eacl-main.9 In light of recent work discouraging the use of attention ***** distributions ***** for explaining a model's behaviour, we show that attention ***** distributions ***** can nevertheless provide insights into the local behaviour of attention heads | ||
| inferences | 85 | |
| 2020.lrec-1.671 Standardized science questions require combining an average of 6 facts, and as many as 16 facts, in order to answer and explain, but most existing datasets for multi-hop reasoning focus on combining only two facts, significantly limiting the ability of multi-hop inference algorithms to learn to generate large ***** inferences *****. | ||
| N18-1140 We find that ***** inferences ***** using domain knowledge and object tracking are the most frequently required skills, and that recognizing omitted information and spatio-temporal reasoning are the most difficult for the machines. | ||
| 2021.conll-1.28 In this work, we introduce the Naturally-Occurring Presuppositions in English (NOPE) Corpus to investigate the context-sensitivity of 10 different types of presupposition triggers and to evaluate machine learning models' ability to predict human ***** inferences *****. | ||
| 2021.emnlp-main.776 We further probe into the adversarial robustness and qualitative ***** inferences ***** we draw from HypMix that elucidate the efficacy of the Riemannian hyperbolic manifolds for interpolation-based data augmentation. | ||
| Q18-1001 In this paper we argue that crime drama exemplified in television programs such as CSI: Crime Scene Investigation is an ideal testbed for approximating real-world natural language understanding and the complex ***** inferences ***** associated with it | ||
| modeling | 85 | |
| 2020.nlp4convai-1.7 In this paper, we present DLGNet, a transformer-based model for dialogue ***** modeling *****. | ||
| 2020.lrec-1.52 Given enough annotated data, such a resource would support multiple ***** modeling ***** methods including information extraction with template language generation, information retrieval type language generation, or sequence to sequence ***** modeling *****. | ||
| L08-1133 A typical phrase-based SMT system makes use of more and longer phrases with context ***** modeling *****, including phrases that were not seen very frequently in training. | ||
| 2020.acl-main.225 We show that this approach, which we call infilling by language ***** modeling *****, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics. | ||
| D19-1205 Dialogue Acts play an important role in conversation ***** modeling ***** | ||
| session | 85 | |
| 2020.emnlp-main.408 In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a ***** session *****. | ||
| 2020.lrec-1.77 The provided dataset – Voice Assistant Conversations in the wild (VACW) – includes the transcripts of both visitors requests and Alexa answers, identified topics and ***** session *****s as well as acoustic characteristics automatically extractable from the visitors' audio files. | ||
| L16-1592 If until now researchers have primarily focused on leveraging personalized content to identify latent information such as gender, nationality, location, or age of the author, this study seeks to establish a structured way of extracting pos***** session *****s, or items that people own or are entitled to, as a way to ultimately provide insights into people's behaviors and characteristics. | ||
| P18-2081 However, previous work on roll-call prediction has been limited to single ***** session ***** settings, thus not allowing for generalization across ***** session *****s. | ||
| L08-1013 In this paper we present the characteristics of the data which was recorded in three ***** session *****s resulting in a total of 75 dialogues and about 14 hours of audio and video data. | ||
| claim verification | 85 | |
| W18-5516 The shared task organizers provide a large-scale dataset for the consecutive steps involved in ***** claim verification *****, in particular, document retrieval, fact extraction, and claim classification. | ||
| 2021.nlp4if-1.4 Fact Extraction and VERification (FEVER) is a recently introduced task that consists of the following subtasks (i) document retrieval, (ii) sentence retrieval, and (iii) ***** claim verification *****. | ||
| 2020.emnlp-main.627 Existing models either (i) concatenate all the evidence sentences, leading to the inclusion of redundant and noisy information; or (ii) process each claim-evidence sentence pair separately and aggregate all of them later, missing the early combination of related sentences for more accurate ***** claim verification *****. | ||
| 2021.ranlp-1.56 This article describes research on ***** claim verification ***** carried out using a multiple GAN-based model. | ||
| 2020.fever-1.1 In this paper, we explore the potential of simplifying the system design and reducing training computation by proposing a joint training setup in which a single sequence matching model is trained with compounded labels that give supervision for both sentence selection and ***** claim verification ***** subtasks, eliminating the duplicate computation that occurs when models are designed and trained separately. | ||
| frameworks | 84 | |
| L08-1554 First, we group the events that constitute an event structure into event clusters and then, we use supervised learning ***** frameworks ***** to classify the relations that exist between events from the same cluster | ||
| 2020.conll-shared.8 Among the five ***** frameworks *****, we address only the abstract meaning representation framework and propose a joint state model for the graph-sequence iterative inference of (Cai and Lam, 2020) for a simplified graph-sequence inference. | ||
| W17-0814 We further explore the behavior of the ***** frameworks ***** with automatic training data generation. | ||
| J18-2001 We can also compare parsers' predictions to each other across ***** frameworks *****. | ||
| 2020.emnlp-main.290 To tackle these shortcomings, we propose two joint ***** frameworks ***** for ECPE: 1) multi-label learning for the extraction of the cause clauses corresponding to the specified emotion clause (CMLL) and 2) multi-label learning for the extraction of the emotion clauses corresponding to the specified cause clause (EMLL) | ||
| informativeness | 84 | |
| C18-1077 Our approach exploits the compositional capabilities of corpus-based and lexical resource-based word embeddings to develop the features reflecting coverage, diversity, ***** informativeness *****, and coherence of summaries. | ||
| 2020.emnlp-main.337 In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text with all expected properties, such as writing style, ***** informativeness *****, fluency, and sentiment preservation. | ||
| 2020.lrec-1.31 Correlating results of crowd and laboratory ratings reveals high applicability of crowdsourcing for the factors overall quality, grammaticality, non-redundancy, referential clarity, focus, structure & coherence, summary usefulness, and summary ***** informativeness *****. | ||
| 2020.coling-main.497 We conduct experiments in two widely-used sentence summarization datasets and experimental results show that our model outperforms the state-of-the-art methods in both automatic evaluation scores and ***** informativeness ***** metrics. | ||
| D18-1448 However, we have found through experiments that multimodal output can significantly improve user satisfaction for ***** informativeness ***** of summaries | ||
| linguists | 84 | |
| 2020.coling-tutorials.7 First, we will acquaint the attendees with the process and the challenges of language documentation, showing how the needs of the language communities and the documentary ***** linguists ***** map to specific NLP tasks. | ||
| 2010.amta-government.11 The Language Learning Broker (LLB) tool from the Technology Development Group (TDG) is a distributed system that supports dictionary/terminology management, personalized dictionaries, and a workflow between ***** linguists ***** and linguist management. | ||
| L12-1108 They tend to find their readers among language learners, language teachers, ***** linguists ***** and lexicographers. | ||
| L06-1458 Computational ***** linguists ***** get more accurate parses; the knowledge extracted from these parses becomes more reliable; theoretical ***** linguists ***** are presented with new data in a field that has been intensely discussed and yet remains in a state that is not satisfactory from a practical point of view. | ||
| L14-1397 The question is, in fact remarkably hard to answer, and many ***** linguists ***** consider it unanswerable | ||
| nested | 84 | |
| 2020.findings-emnlp.114 One of the main challenges is to identify ***** nested ***** structured events that are associated with non-indicative trigger words. | ||
| W19-1904 Our proposed model achieves state-of-the-art results in medical entity recognition datasets, using both ***** nested ***** and hierarchical mentions. | ||
| 2018.lilt-16.1 To do so, we present the RNNs with a set of random strings having a given maximum nesting depth and test its ability to predict the kind of closing parenthesis when facing deeper ***** nested ***** strings. | ||
| N18-1079 We propose a novel recurrent neural network-based approach to simultaneously handle ***** nested ***** named entity recognition and ***** nested ***** entity mention detection. | ||
| P18-3006 However, in practice, there are many domains, such as the biomedical domain, in which there are ***** nested *****, overlapping, and discontinuous entity mentions | ||
| modelling | 84 | |
| L12-1263 Data used for acoustic and language ***** modelling ***** are also described here. | ||
| L16-1042 In this paper we evaluate a number of measures of corpus similarity, including a method based on topic ***** modelling ***** which has not been previously evaluated for this task. | ||
| 2021.nlp4convai-1.19 Attention-based pre-trained language models such as GPT-2 brought considerable progress to end-to-end dialogue ***** modelling *****. | ||
| W18-3911 In this paper, we examine two transfer learning techniques of fine-tuning and layer substitution for language ***** modelling ***** of British Sign Language. | ||
| K19-1084 Our experiments focus on language ***** modelling ***** under synthetic conditions and show a strong perplexity reduction of using the second autoregressive model over the standard one | ||
| Sarcasm | 84 | |
| 2020.acl-main.349 ***** Sarcasm ***** is a sophisticated linguistic phenomenon to express the opposite of what one really means. | ||
| 2020.emnlp-main.201 ***** Sarcasm ***** detection is an important task in affective computing, requiring large amounts of labeled data. | ||
| 2020.nlpbt-1.3 ***** Sarcasm ***** detection in social media with text and image is becoming more challenging. | ||
| 2020.figlang-1.12 In this paper, we present the results obtained by BERT, BiLSTM and SVM classifiers on the shared task on ***** Sarcasm ***** Detection held as part of The Second Workshop on Figurative Language Processing. | ||
| C16-1151 ***** Sarcasm ***** detection is a key task for many natural language processing tasks. | ||
| dialect | 84 | |
| 2021.naacl-main.184 We also demonstrate the downstream applicability of ***** dialect ***** feature detection both as a measure of ***** dialect ***** density and as a ***** dialect ***** classifier. | ||
| L10-1244 Corpus annotation encompasses manual segmentation, an orthographic transcription, and labelling with speech mode, ***** dialect *****, and noise type. | ||
| L16-1321 Data acquisition in *****dialect*****ology is typically a tedious task, as ***** dialect ***** samples of spoken language have to be collected via questionnaires or interviews. | ||
| 2020.vardial-1.3 We also test the use of this dataset for ***** dialect ***** classification by training a few baseline models comparing statistical and neural approaches. | ||
| 2020.lrec-1.174 The dataset contains tweets written in both Modern Standard Arabic and Saudi ***** dialect ***** | ||
| linear | 84 | |
| W16-3927 The models described in this paper are based on ***** linear ***** chain conditional random fields (CRFs), use the BIEOU encoding scheme, and leverage random feature dropout for up-sampling the training data. | ||
| 2020.coling-main.301 Our second method refines word representations by aligning original and refined embedding spaces based on local tangent space instead of performing weighted locally ***** linear ***** combination twice. | ||
| S17-1024 We provide a top-down parsing algorithm for RGL that runs in time ***** linear ***** in the size of the input graph. | ||
| W17-5234 The result of stage1 serves as the input of stage2, so the two different type models (***** linear ***** and non-*****linear*****) in stage2 can describe the input in two opposite aspects. | ||
| 2018.gwc-1.29 We showed that a hyponymy extraction method based on ***** linear ***** regression classifiers trained on clusters of vectors can be successfully applied on large scale | ||
| distributional semantic | 84 | |
| L14-1590 More recently, the topic of compositionality in the framework of ***** distributional semantic ***** representations has come to the surface and was investigated for building the semantic representation of phrases or even sentences from the representation of their words. | ||
| 2021.naacl-main.199 We investigate the possibilities and limitations of using ***** distributional semantic ***** models for analyzing philosophical data by means of a realistic use-case | ||
| C18-2003 This tool uniquely combines state-of-the-art *****distributional semantic*****s with a nuanced model of human emotions, two information streams we deem beneficial for a data-driven interpretation of texts in the humanities. | ||
| D18-1023 We construct a multilingual common semantic space based on *****distributional semantic*****s, where words from multiple languages are projected into a shared space via which all available resources and knowledge can be shared across multiple languages. | ||
| 2020.coling-main.173 Semantic models derived from visual information have helped to overcome some of the limitations of solely text-based ***** distributional semantic ***** models. | ||
| resources | 84 | |
| 2019.gwc-1.16 We conclude that the two ***** resources ***** actually differ from each other quite more than expected, both vocabulary and structure-wise. | ||
| L14-1638 Such knowledge ***** resources ***** can be derived from automatic parses of raw corpora, but unfortunately parsing still has not achieved a high enough performance for precise knowledge acquisition. | ||
| 2019.iwslt-1.26 We study here a related setting, multi-domain adaptation, where the number of domains is potentially large and adapting separately to each domain would waste training ***** resources *****. | ||
| L16-1072 We have seen that many ***** resources ***** exist which are useful for MT and similar work, but the majority are for (academic) research or educational use only, and as such not available for commercial use. | ||
| 2020.lrec-1.338 These ***** resources ***** are still in the early days of their exploration. | ||
| propaganda detection | 84 | |
| D19-5012 This paper describes our system (MIC-CIS) details and results of participation in the fine grained ***** propaganda detection ***** shared task 2019. | ||
| 2020.semeval-1.245 The article describes a fast solution to ***** propaganda detection ***** at SemEval-2020 Task 11, based on feature adjustment. | ||
| 2021.semeval-1.141 Among the tasks motivated by the proliferation of misinformation, ***** propaganda detection ***** is particularly challenging due to the deficit of fine-grained manual annotations required to train machine learning models. | ||
| 2020.semeval-1.197 This paper summarizes our studies on ***** propaganda detection ***** techniques for news articles in the SemEval-2020 task 11. | ||
| D19-5015 To this goal, we explore the task of sentence-level ***** propaganda detection *****, and experiment with both handcrafted features and learned dense semantic representations. | ||
| rumor detection | 84 | |
| 2020.coling-main.476 Experimental results on the TWITTER and PHEME datasets show that the proposed approach consistently improves ***** rumor detection ***** performance. | ||
| 2020.emnlp-main.727 This framework can more accurately learn the representation of an event in the initial stage and enable early ***** rumor detection *****. | ||
| P19-1113 The experiments on two datasets show that our proposed model outperforms the state-of-the-art ***** rumor detection ***** approaches. | ||
| P18-1184 Automatic ***** rumor detection ***** is technically very challenging. | ||
| 2021.nlp4if-1.6 This study presents a new dataset on ***** rumor detection ***** in Finnish language news headlines. | ||
| latent topic | 84 | |
| Q15-1010 Our model uses topic models to identify ***** latent topics ***** and their key linguistic features in input documents, induces constraints from this information and maps sentences to their dominant information structure categories through a constrained unsupervised model. | ||
| 2021.socialnlp-1.7 Using annotations of 5188 tweets from 291 annotators, we investigate how annotator perceptions of racism in tweets vary by annotator racial identity and two text features of the tweets: relevant keywords and ***** latent topics ***** identified through structural topic modeling. | ||
| W18-5113 Experimental results show that bidirectional GRU networks trained on word-level features, with ***** Latent Topic ***** Clustering modules, is the most accurate model scoring 0.805 F1. | ||
| P17-2084 Topical PageRank (TPR) uses ***** latent topic ***** distribution inferred by Latent Dirichlet Allocation (LDA) to perform ranking of noun phrases extracted from documents. | ||
| J18-4008 To address this issue, we organize microblog messages as conversation trees based on their reposting and replying relations, and propose an unsupervised model that jointly learns word distributions to represent: (1) different roles of conversational discourse, and (2) various ***** latent topics ***** in reflecting content information. | ||
| weighting | 83 | |
| W18-6412 We introduce new RNN-variant, mixed RNN/Transformer ensembles, data selection and ***** weighting *****, and extensions to back-translation. | ||
| E17-1007 We investigate an extensive number of such unsupervised measures, using several distributional semantic models that differ by context type and feature ***** weighting *****. | ||
| P19-1140 Each channel encodes KGs via different relation ***** weighting ***** schemes with respect to self-attention towards KG completion and cross-KG attention for pruning exclusive entities respectively, which are further combined via pooling techniques. | ||
| N19-1288 Furthermore, the representation of a group of bags in the training set which share the same relation label is calculated by ***** weighting ***** bag representations using a similarity-based inter-bag attention module. | ||
| N18-1146 In each domain our algorithms pick words that are associated with narrative persuasion; more predictive and less confound-related than those of standard feature ***** weighting ***** and lexicon induction techniques like regression and log odds | ||
| adequacy | 83 | |
| 2020.findings-emnlp.82 Surprisingly, we also find that ***** adequacy ***** appears to be less important, as shown by the high results of a strong sampling approach, which even beats human paraphrases when used with sentence-level BLEU. | ||
| 2020.coling-main.522 Human-generated non-literal translations reflect the richness of human languages and are sometimes indispensable to ensure ***** adequacy ***** and fluency. | ||
| 2014.amta-wptp.3 We evaluate the post-edited sentences according to a bilingual ***** adequacy ***** metric, and find that 96.5% of those sentences post-edited by only a monolingual post-editor are judged to be completely correct. | ||
| 2020.lrec-1.179 This ***** adequacy ***** is critical in the case of a child since her/his cognitive and linguistic skills are still under development. | ||
| W19-4019 The annotation of both treebanks, the Turkish PUD Treebank and TNC-UD, was carried out based on the decisions concerning linguistic ***** adequacy ***** of re-annotation of the Turkish IMST-UD Treebank (Türk et | ||
| HLT | 83 | |
| L14-1154 Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the language resources that exist are a small fraction of those required to meet the goals of Human Language Technologies (***** HLT *****) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community. | ||
| L10-1087 The JOS language resources are meant to facilitate developments of ***** HLT ***** and corpus linguistics for the Slovene language and consist of the morphosyntactic specifications, defining the Slovene morphosyntactic features and tagset; two annotated corpora (jos100k and jos1M); and two web services (a concordancer and text annotation tool). | ||
| L10-1642 The number of catalogs the ***** HLT ***** researcher must search, with their different formats, make it possible to overlook an existing resource. | ||
| 2010.amta-government.4 This approach will accelerate ***** HLT ***** development, contain sustainment cost, minimize training, and bring the MT, OCR, ASR, audio/video, entity extraction, analytic tools and database under one umbrella, thus reducing the total cost of ownership. | ||
| 2010.amta-government.11 To provide the US Government analyst with dynamic tools that adapt to these changing domains, these ***** HLT ***** systems must support customizable lexicons | ||
| referential | 83 | |
| 2020.sigdial-1.22 We analyze a corpus of ***** referential ***** communication through the lens of quantitative models of speaker reasoning. | ||
| 2020.lrec-1.387 The ***** referential ***** grounding allows us to analyze the framing of these incidents in different languages and across different texts. | ||
| P17-1023 Therefore, we investigate models of ***** referential ***** word meaning that link visual to lexical information which we assume to be given through distributional word embeddings. | ||
| 2020.lrec-1.292 Eye4Ref is a rich multimodal dataset of eye-movement recordings collected from *****referential*****ly complex situated settings where the linguistic utterances and their visual ***** referential ***** world were available to the listener. | ||
| 2021.wmt-1.91 We obtain new results using ***** referential ***** translation machines (RTMs) with predictions mixed to obtain a better mixture of experts prediction | ||
| CCG | 83 | |
| N19-1020 In ***** CCG *****, many right-branching derivations can be replaced by semantically equivalent left-branching incremental derivations. | ||
| P19-1013 We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of ***** CCG ***** corpora exploiting cheaper resources of dependency trees. | ||
| 2021.iwcs-1.4 We present a method for computing all quantifier scopes that can be extracted from a single ***** CCG ***** derivation. | ||
| 2012.amta-caas14.2 A preliminary implementation of AraMWE, a hybrid project that includes a statistical component and a CCG symbolic component to extract and treat MWEs and idioms in Arabic and English parallel texts is presented, together with a general sketch of the system, a thorough description of the statistical component and a proof of concept of the ***** CCG ***** component. | ||
| 2021.emnlp-main.826 This paper proposes a new representation for ***** CCG ***** derivations. | ||
| fluency | 83 | |
| 2020.acl-main.333 Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based ***** fluency ***** in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. | ||
| C18-1082 In this study it was found that compared to human-written texts, computer-generated texts were rated slightly lower on style-related text components (***** fluency ***** and clarity) and slightly higher in terms of the correctness of given information. | ||
| K18-1031 Even though word-overlap metrics like ROUGE are computed with the help of hand-written references, our referenceless methods obtain a significantly higher correlation with human ***** fluency ***** scores on a benchmark dataset of compressed sentences. | ||
| P19-1503 We show that by using a product-of-experts criteria these are enough for maintaining continuous contextual matching while maintaining output ***** fluency *****. | ||
| 2020.conll-1.19 The main finding is that good comprehensibility, similarly to good ***** fluency *****, can mask a number of adequacy errors | ||
| rationales | 83 | |
| 2021.emnlp-main.807 On a new dataset of annotated sequential ***** rationales *****, greedy ***** rationales ***** are most similar to human ***** rationales *****. | ||
| 2021.emnlp-main.645 An open problem is how to improve the faithfulness of explanations (***** rationales *****) for the predictions of these models. | ||
| W19-4807 In this work, we show that learning with ***** rationales ***** can also improve the quality of the machine's explanations as evaluated by human judges. | ||
| C18-2032 Naturally, the extracted ***** rationales ***** serve as the introspection explanation for the prediction result of the model, enhancing the transparency of the model. | ||
| C18-1098 However, existing generative networks used to extract ***** rationales ***** come with a trade-off between extracting diversified ***** rationales ***** and achieving good classification results. | ||
| adjectives | 83 | |
| 2020.findings-emnlp.242 In experiments on the Flickr8K Audio Captions Corpus, we find that our model improves over approaches that use global visual features, that the proposals enable the model to recover entities and other related words, such as ***** adjectives *****, and that improvements are due to the model's ability to localize the correct proposals. | ||
| L14-1450 There are 51 target nouns, 51 ***** adjectives *****, and 51 verbs randomly selected from 3 frequency groups based on the lemma frequency list of the German WaCKy corpus. | ||
| L08-1534 Moreover, in a contrastive perspective, the possibilities of creating ***** adjectives ***** out of nouns are not the same in every language. | ||
| Q15-1014 First, objects can be seen as bundles of attributes, typically expressed as adjectival modifiers (a dog is something furry, brown, etc.), and thus a function trained to map visual representations of objects to nominal labels can implicitly learn to map attributes to ***** adjectives ***** | ||
| 2020.sltu-1.22 We also considered the interaction of ***** adjectives ***** with other grammatical means, especially other parts of speech, e.g. | ||
| structure | 83 | |
| K18-1001 However, the simple graphical model ***** structure ***** belies the often complex non-local constraints between output labels. | ||
| 2019.gwc-1.16 We conclude that the two resources actually differ from each other quite more than expected, both vocabulary and ***** structure *****-wise. | ||
| P17-1105 The outputs are represented as abstract syntax trees (ASTs) and constructed by a decoder with a dynamically-determined modular ***** structure ***** paralleling the ***** structure ***** of the output tree. | ||
| L14-1662 Extracting Linked Data following the Semantic Web principle from un*****structure*****d sources has become a key challenge for scientific research. | ||
| 2019.icon-1.3 The resultant discourse ***** structure ***** of Thirukkural can be indexed and further be used by Summary Generation Systems, IR Systems and QA Systems. | ||
| written | 83 | |
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally ***** written *****, the translation proficiency of the evaluators, and the provision of inter-sentential context. | ||
| 2020.lrec-1.33 For instance, in “The FBI alleged in court documents that Zazi had admitted having a hand*****written***** recipe for explosives on his computer”, do people believe that Zazi had a hand*****written***** recipe for explosives? | ||
| 2016.lilt-14.5 We present two schemas for Portuguese, one for spoken Brazilian Portuguese and one for ***** written ***** European Portuguese. | ||
| L16-1513 the clinical subcorpus, consisting of ***** written ***** texts produced by speakers with various types of language disorders, and the healthy speakers subcorpus, as well as by the levels of its annotation, it offers an opportunity for different lines of research. | ||
| W16-4910 This paper discusses how to adapt two new word embedding features to build a more efficient Chinese Grammatical Error Diagnosis (CGED) system to assist Chinese foreign learners (CFLs) in improving their ***** written ***** essays. | ||
| ML | 82 | |
| 2021.wnut-1.18 We present a large-scale human evaluation of two popular grammatical theories, Matrix-Embedded Language (***** ML *****) and Equivalence Constraint (EC). | ||
| 2020.iwltp-1.9 We present a software platform and API that combines various ***** ML ***** and NLP approaches for the analysis and enrichment of textual content. | ||
| 2020.semeval-1.284 This paper discusses how ***** ML ***** based classifiers can be enhanced disproportionately by adding small amounts of qualitative linguistic knowledge. | ||
| C18-1001 The supervised ***** ML ***** component leverages features such as word embeddings over referring expressions, parts of speech, and grammatical and semantic roles. | ||
| W19-3714 We propose a hybrid system combining a rule-based approach and light ***** ML ***** techniques | ||
| target language | 82 | |
| P19-2006 However, several studies strived to overcome divergences in the annotations between English AMRs and those of their *****target language*****s by refining the annotation specification. | ||
| D18-1270 This enables our approach to: (a) augment the limited supervision in the ***** target language ***** with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. | ||
| 1999.mtsummit-1.47 The system is meant for a source language (SL) speaker who does not know the ***** target language ***** (TL). | ||
| W16-3714 To this end, we explore the use of distinctive feature weights, lexical tone confusions, and a two-step clustering algorithm to learn projections of phoneme segments from mismatched multilingual transcriber languages to the ***** target language *****. | ||
| P18-1075 Bilingual tasks, such as bilingual lexicon induction and cross-lingual classification, are crucial for overcoming data sparsity in the ***** target language *****. | ||
| automatic evaluation | 82 | |
| 2020.acl-main.450 QAGS has substantially higher correlations with these judgments than other ***** automatic evaluation ***** metrics. | ||
| 2021.acl-short.112 We perform both human evaluation and ***** automatic evaluation ***** of dialogs generated by our method. | ||
| 2021.eval4nlp-1.12 Reference-based ***** automatic evaluation ***** metrics are notoriously limited for NLG due to their inability to fully capture the range of possible outputs. | ||
| D18-1429 In this work, we show that current ***** automatic evaluation ***** metrics based on n-gram similarity do not always correlate well with human judgments about answerability of a question. | ||
| 2021.emnlp-main.589 Though researchers have attempted to use metrics for language generation tasks (e.g., perplexity, BLEU) or some model-based reinforcement learning methods (e.g., self-play evaluation) for ***** automatic evaluation *****, these methods only show very weak correlation with the actual human evaluation in practice. | ||
| bidirectional encoder representation | 82 | |
| 2020.lrec-1.157 In the former approach, we used Japanese-specific linguistic features, including character-type features such as “kanji” and “hiragana.” In the latter approach, we used two models: a long short-term memory (LSTM) model (Hochreiter and Schmidhuber, 1997) and a *****bidirectional encoder representation*****s from transformers (BERT) model (Devlin et al., 2019), which achieved the highest accuracy in various natural language processing tasks in 2018. | ||
| W19-3206 The systems for the two subtasks are based on *****bidirectional encoder representation*****s from transformers (BERT), and achieve promising results. | ||
| 2021.rocling-1.22 Due to the development of deep learning, natural language processing tasks have made great progress by leveraging *****bidirectional encoder representation*****s from Transformers (BERT). | ||
| 2021.smm4h-1.10 For both tasks we used models based on *****bidirectional encoder representation*****s from transformers (BERT). | ||
| S19-2142 Finally, the fourth subsystem is a *****bidirectional encoder representation*****s from transformers (BERT) model. | ||
| Firstly | 81 | |
| 2020.nlp4convai-1.13 ***** Firstly *****, we identify and fix dialogue state annotation errors across 17.3% of the utterances on top of MultiWOZ 2.1. | ||
| 2021.naacl-main.92 ***** Firstly *****, we connect the phenomenon of hallucinations under source perturbation to the Long-Tail theory of Feldman, and present an empirically validated hypothesis that explains hallucinations under source perturbation. | ||
| L14-1372 ***** Firstly *****, a generic standard Arabic AM is used along with the biased LM and the graphemic PM in a fast speech recognition pass. | ||
| 2021.acl-short.117 ***** Firstly *****, we evaluate our pre-trained model on various pronoun resolution datasets without any finetuning. | ||
| 2021.emnlp-tutorials.4 ***** Firstly *****, we introduce standard benchmarks in multi-domain and multilingual QA | ||
| Linked | 81 | |
| L14-1628 In this paper we present the publication of BabelNet 2.0, a wide-coverage multilingual encyclopedic dictionary and ontology, as ***** Linked ***** Data. | ||
| L14-1235 Intranet documents from five universities formed our organization specific corpora and we used open domain knowledge bases like Wikipedia, ***** Linked ***** Open Data, and web pages from the Internet as the organization independent data sources. | ||
| L14-1691 This paper presents ***** Linked ***** Health Answers, a natural language question answering system that utilizes health data drawn from the ***** Linked ***** Data Cloud | ||
| 2020.ldl-1.3 The increasing recognition of the utility of ***** Linked ***** Data as a means of publishing lexical resources has helped to underline the need for RDF based data models which have the flexibility and expressivity to be able to represent the most salient kinds of information contained in such resources as structured data, including, notably, information relating to time and the temporal dimension. | ||
| 2020.globalex-1.1 The OntoLex vocabulary enjoys increasing popularity as a means of publishing lexical resources with RDF and as ***** Linked ***** Data. | ||
| iterative | 81 | |
| L12-1441 Based on our experience with ***** iterative ***** guideline refinement we propose to carefully characterize the thematic scope of the annotation by positive and negative coding lists and allow for alternative, short vs. long mention span annotations. | ||
| D18-2009 Par4Sem is a tool, which supports an adaptive, ***** iterative *****, and interactive process where the underlying machine learning models are updated for each iteration using new training examples from usage data. | ||
| C16-1267 The clustering method combines a Vector Space Models (VSM) and the results of a Latent Dirichlet Allocation (LDA), whose results are merged in each ***** iterative ***** step. | ||
| 2020.coling-main.204 In order to combine the two generation tasks, we propose a multi-agent communication framework that regards the topic description generator and the story generator as two agents and learn them simultaneously via ***** iterative ***** updating mechanism. | ||
| 2020.emnlp-main.87 In this paper we investigate the ***** iterative ***** generation of synthetic QA pairs as a way to realize unsupervised self adaptation | ||
| valency | 81 | |
| L10-1598 This work presents a method of linking verbs and their ***** valency ***** frames in VerbaLex database developed at the Centre for NLP at the Faculty of Informatics Masaryk University to the frames in Berkeley FrameNet. | ||
| D18-1159 Finally, we explore the potential of extending ***** valency ***** patterns beyond their traditional domain by confirming their helpfulness in improving PP attachment decisions. | ||
| L16-1082 Although Czech ― as an inflectional language encoding syntactic relations via morphological cases ― provides an excellent opportunity to study the distribution of ***** valency ***** complements in the syntactic structure with complex predicates, this distribution has not been described so far. | ||
| L14-1305 On the material of the Prague Czech-English Dependency Treebank we observe sentences in which an Addressee argument in one language is linked translationally to a Patient argument in the other one, and make generalizations about the theoretical grounds of the argument non-correspondences and its relations to the ***** valency ***** theory beyond the annotation practice | ||
| L10-1490 The paper briefly presents the model underlying the Bulgarian FrameNet (BulFrameNet): each lexical entry consists of a lexical unit; a semantic frame from the English FrameNet, expressing abstract semantic structure; a grammatical class, defining the inflexional paradigm; a ***** valency ***** frame describing (some of) the syntactic and lexical-semantic combinatory properties (an optional component); and (semantically and syntactically) annotated examples. | ||
| sense | 81 | |
| L16-1524 The former involves four closely related language pairs with different language pair similarities, and the latter focuses on ***** sense ***** connectivity between non-pivot words and pivot words. | ||
| D17-1034 We leverage reinforcement learning to enable joint training on the proposed modules, and introduce various exploration techniques on ***** sense ***** selection for better robustness. | ||
| C18-2031 We derive a topic model based on nnDDC, which generates probability distributions over semantic units for any input on ***** sense *****-, word- and text-level. | ||
| 2020.emnlp-main.584 The Word-in-Context dataset (WiC) addresses the dependence on ***** sense ***** inventories by reformulating the standard disambiguation task as a binary classification problem; but, it is limited to the English language | ||
| D19-6008 This paper explores the use of Bidirectional Encoder Representations from Transformers (BERT) along with external relational knowledge from ConceptNet to tackle the problem of common***** sense ***** inference. | ||
| aggregation | 80 | |
| W19-4722 To this end, we combine diachronic word embeddings with appropriate visualization and exploratory techniques such as clustering and relative entropy for meaningful ***** aggregation ***** of data and diachronic comparison. | ||
| 2020.emnlp-main.646 Human-written texts contain frequent generalizations and semantic ***** aggregation ***** of content. | ||
| 2021.emnlp-main.223 More specifically, the clients request the user model and news representations from the server, and send their locally computed gradients to the server for ***** aggregation *****. | ||
| 2020.lrec-1.29 The result shows that the proposed method estimates the quality of speech more effectively than a vote ***** aggregation *****, measured by correlation with a fine-grained classification by experts | ||
| D19-1626 In this work, we propose an ***** aggregation ***** method to combine the Bidirectional Encoder Representations from Transformer (BERT) with a MatchLSTM layer for Sequence Matching. | ||
| entropy | 80 | |
| 2021.ranlp-1.18 We propose to help the summarizer to learn from a limited amount of data by limiting the ***** entropy ***** of the input texts. | ||
| L14-1507 We evaluate the accuracy of the proposed mapping using cluster similarity metrics based on ***** entropy *****. | ||
| N18-2010 Many simple NLG models are based on recurrent neural networks (RNN) and sequence-to-sequence (seq2seq) model, which basically contains an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization using a simple cross ***** entropy ***** loss training criterion. | ||
| 2019.jeptalnrecital-tia.5 In this paper, we consolidate the status of termino-conceptual sphere and propose a way to characterise the structure of termino-conceptual system by using ***** entropy *****. | ||
| R17-1061 Another type of conventionalized phrases can be revealed using two factors: low ***** entropy ***** of phrase associations and low intersection of component word and phrase associations | ||
| Task | 80 | |
| S19-2089 In English, we achieved an F1-Score of 0.466 for ***** Task ***** A and 0.462 for ***** Task ***** B; In Spanish, we achieved scores of 0.617 and 0.612 on ***** Task ***** A and ***** Task ***** B, respectively. | ||
| 2020.wmt-1.113 Our submitted models achieve significant improvement over the baselines for ***** Task ***** 1 (Sentence-Level Direct Assessment; EN-DE only), and ***** Task ***** 3 (Document-Level Score). | ||
| W19-4207 Among submissions to ***** Task ***** 1, our models rank second and third. | ||
| 2021.semeval-1.35 This paper describes our system, which participated in ***** Task ***** 7 of SemEval-2021: Detecting and Rating Humor and Offense | ||
| 2020.alvr-1.4 ***** Task ***** success is the standard metric used to evaluate referential visual dialogue systems. | ||
| Python | 80 | |
| 2021.emnlp-demo.5 ET is open-source, built on different ***** Python ***** Web technologies and has Web demonstrations available on-line. | ||
| D17-2001 The lexicon (structured in terms of frames) as well as annotated sentences can be processed programmatically, or browsed with human-readable displays via the interactive ***** Python ***** prompt. | ||
| 2021.sigdial-1.26 Thus, the toolkit aims to make working with authentic conversational speech data in ***** Python ***** more accessible and to provide the user with comprehensive options to work with representations of talk in appropriate detail for any downstream task. | ||
| 2021.naacl-main.211 PLBART is pre-trained on an extensive collection of Java and ***** Python ***** functions and associated NL text via denoising autoencoding | ||
| C18-2012 We present WOMBAT, a ***** Python ***** tool which supports NLP practitioners in accessing word embeddings from code. | ||
| discrete | 80 | |
| W19-1501 In this paper, we tackle this problem by utilizing state access patterns of StackLSTM to homogenize computations with regard to different ***** discrete ***** operations. | ||
| 2021.eacl-main.209 Our approach outperforms the fully ***** discrete *****, fully continuous, and static mixture model on topic coherence in low resource settings. | ||
| 2020.semeval-1.19 Here, a meaning shift is composed of two aspects, a) ***** discrete ***** changes observed between different word senses, and b) more subtle changes of meaning representation that are not captured in those ***** discrete ***** changes. | ||
| 2021.emnlp-main.115 In this work, we address this gap and leverage ***** discrete ***** attacks for online augmentation, where adversarial examples are generated at every training step, adapting to the changing nature of the model. | ||
| P18-1190 A neural variational inference framework is proposed for training, where gradients are directly backpropagated through the ***** discrete ***** latent variable to optimize the hash function | ||
| morphological inflection | 80 | |
| 2020.sigmorphon-1.22 Cross-lingual transfer between typologically related languages has been proven successful for the task of ***** morphological inflection *****. | ||
| 2020.alta-1.15 The paper investigates repetitive loops, a common problem in contemporary text generation (such as machine translation, language modelling, ***** morphological inflection *****) systems. | ||
| 2021.naacl-main.435 Sequence-to-sequence models have delivered impressive results in word formation tasks such as ***** morphological inflection *****, often learning to model subtle morphophonological details with limited training data. | ||
| 2021.insights-1.13 The method is directly applicable to ***** morphological inflection ***** generation if unlabeled word forms are available | ||
| P19-1146 Experiments on ***** morphological inflection ***** and machine translation reveal consistent gains over dense models. | ||
| corpus annotation | 80 | |
| L10-1245 Finally, in section 4, we examine technical aspects of LSF video data editing and ***** corpus annotation *****, in the perspective of setting up a corpus-based formalized description of LSF. | ||
| L08-1167 We also note implications for large scale ***** corpus annotation ***** projects that deal with similarly subjective phenomena. | ||
| L10-1388 After ***** corpus annotation ***** conclusion, we report some of the annotation results and some comments on the improvements that should be made in an annotation tool to better support this kind of annotation task. | ||
| L10-1378 The Transformer automatically outputs format-compliant FrameNet versions, including modified ***** corpus annotation ***** files that can be used for automatic processing | ||
| L10-1605 To the best of our knowledge, this is the first attempt at a large-scale ***** corpus annotation ***** of Polish named entities. | ||
| semantic similarity | 80 | |
| 2021.sigdial-1.49 But, the effects of minimizing an alternate training objective that fosters a model to generate alternate response and score it on ***** semantic similarity ***** has not been well studied. | ||
| 2021.acl-short.124 Firstly, a concept-sentence attention module is developed to select the most appropriate concept from multiple concepts of each entity by calculating the ***** semantic similarity ***** between sentences and concepts. | ||
| Q17-1022 The effectiveness of our approach is demonstrated with state-of-the-art results on ***** semantic similarity ***** datasets in six languages. | ||
| K19-1085 Contrary to previous works, which typically focus on the ***** semantic similarity ***** between a question and its answer, our hypothesis is that question-answer pairs are often in analogical relation to each other. | ||
| P18-2009 We also show that the learned embeddings perform well on the task of sentence ***** semantic similarity ***** prediction. | ||
| Social | 80 | |
| S19-2210 ***** Social ***** media has an increasing amount of information that both customers and companies can benefit from. | ||
| S19-2118 This paper describes our system submissions as part of our participation (team name: JU_ETCE_17_21) in the SemEval 2019 shared task 6: OffensEval: Identifying and Categorizing Offensive Language in ***** Social ***** Media. | ||
| 2010.amta-government.5 ***** Social ***** media and tools for communication over the Internet have expanded a great deal in recent years. | ||
| D19-5003 ***** Social ***** media has reportedly been (ab)used by Russian troll farms to promote political agendas. | ||
| 2021.wnut-1.28 ***** Social ***** media is an essential tool to share information about crisis events, such as natural disasters. | ||
| phoneme | 79 | |
| L08-1205 To conclude, an analysis of the annotation differences with respect to the *s label (i.e. a label that is used to annotate undistinguishable spelling behaviour), ***** phoneme ***** labels, reading strategy and error labels is given. | ||
| W17-5403 We show an 11% improvement in ***** phoneme ***** error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. | ||
| D19-6121 The advantages of the multimodal model generalize to wholly unseen languages, reducing ***** phoneme ***** error rate on our out-of-domain test set to 6.39% from the unimodal 8.21%, a more than 20% relative decrease. | ||
| W18-2414 These models are based on initial alignments between grapheme source and ***** phoneme ***** target sequences. | ||
| L16-1120 Starting from the recordings and transcriptions of numerous singers, diarization and ***** phoneme ***** alignment experiments have been made to extract the singing voice from the recordings and create ***** phoneme ***** alignments | ||
| Phrase | 79 | |
| 2021.emnlp-main.846 ***** Phrase ***** representations derived from BERT often do not exhibit complex phrasal compositionality, as the model relies instead on lexical similarity to determine semantic relatedness. | ||
| 2021.econlp-1.7 We present how to privately train NLP models and desirable privacy utility trade-offs and evaluate it on the Financial ***** Phrase ***** Bank dataset | ||
| 2020.emnlp-main.125 ***** Phrase ***** alignment is the basis for modelling sentence pair interactions, such as paraphrase and textual entailment recognition. | ||
| 2021.emnlp-main.513 ***** Phrase ***** grounding aims to map textual phrases to their associated image regions, which can be a prerequisite for multimodal reasoning and can benefit tasks requiring identifying objects based on language. | ||
| L16-1109 ***** Phrase ***** chunking remains an important natural language processing (NLP) technique for intermediate syntactic processing. | ||
| translators | 79 | |
| 2021.acl-long.153 Besides, WSLS exhibits strong transferability on attacking Baidu and Bing online ***** translators *****. | ||
| L06-1017 The semantics of relationships are vague because the principal users of these relationships are industrial actors (***** translators ***** of technical handbooks, terminologists, data-processing specialists, etc.). | ||
| 2021.emnlp-main.799 Quality of QE is crucial, as incorrect QE might lead to ***** translators ***** missing errors or wasting time on already correct MT output. | ||
| 2016.amta-researchers.3 A TM contains translation units (TU) which are made up of source and target language segments; ***** translators ***** use the target segments in the TU suggested by the CAT tool by converting them into the desired translation. | ||
| 2020.signlang-1.18 Though our current database is small, we hope for ***** translators ***** to invest themselves and help us to keep it expanding. | ||
| ellipsis | 79 | |
| 2001.mtsummit-papers.44 Some Japanese clauses contain more than one argument ***** ellipsis *****, and yet this fact has not adequately been accounted for in the study of ***** ellipsis ***** resolution in the current literature, which predominantly focuses on resolving one ***** ellipsis ***** per sentence. | ||
| 2021.acl-long.438 Conversational Question Simplification (CQS) aims to simplify self-contained questions into conversational ones by incorporating some conversational characteristics, e.g., anaphora and ***** ellipsis *****. | ||
| W19-3310 However, there's only a few corpora annotating the ***** ellipsis *****, which draws back the automatic detection and recovery of the ***** ellipsis *****. | ||
| 2016.lilt-13.1 The key insight guiding the work is that not all cases of ***** ellipsis ***** are equally difficult: some can be detected and resolved with high confidence even before we are able to build systems with human-level semantic and pragmatic understanding of text | ||
| 2020.acl-main.5 However, in multi-domain scenarios, ***** ellipsis ***** and reference are frequently adopted by users to express values that have been mentioned by slots from other domains. | ||
| multilingual neural machine | 79 | |
| 2021.wmt-1.86 We prepared state-of-the-art ***** multilingual neural machine ***** translation systems for three languages (i.e. | ||
| 2020.emnlp-main.476 In this study, we revisit the ***** multilingual neural machine ***** translation model that only share modules among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. | ||
| 2017.iwslt-1.15 In this paper, we proposed two strategies which can be applied to a ***** multilingual neural machine ***** translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus. | ||
| 2020.findings-emnlp.283 We present a probabilistic framework for ***** multilingual neural machine ***** translation that encompasses supervised and unsupervised setups, focusing on unsupervised translation. | ||
| 2020.wmt-1.98 We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatically generated paraphrases produced by PRISM (Thompson and Post, 2020a), a ***** multilingual neural machine ***** translation system. | ||
| deep reinforcement | 79 | |
| D19-1042 To solve the mismatch between training and inference as well as modeling label dependencies in a more principled way, we formulate HTC as a Markov decision process and propose to learn a Label Assignment Policy via ***** deep reinforcement ***** learning to determine where to place an object and when to stop the assignment process. | ||
| 2021.naacl-main.316 Building on these shortcomings, we propose a ***** deep reinforcement ***** learning approach that makes time-aware decisions to trade stocks while optimizing profit using textual data. | ||
| D17-1237 This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical ***** deep reinforcement ***** learning approach to learning a dialogue manager that operates at different temporal scales. | ||
| P18-1053 In this paper, we show how to integrate these goals, applying ***** deep reinforcement ***** learning to deal with the task. | ||
| C18-1107 The proposed structured ***** deep reinforcement ***** learning is based on graph neural networks (GNN), which consists of some sub-networks, each one for a node on a directed graph. | ||
| User | 78 | |
| 2021.hackashop-1.13 ***** User ***** commenting is a valuable feature of many news outlets, enabling them contact with readers and enabling readers to express their opinion, provide different viewpoints, and even complementary information. | ||
| D19-5510 ***** User ***** reviews provide a significant source of information for companies to understand their market and audience. | ||
| I17-1072 ***** User ***** experience is essential for human-computer dialogue systems. | ||
| 2020.findings-emnlp.174 ***** User ***** modeling is critical for many personalized web services. | ||
| W18-5007 ***** User ***** Simulators are one of the major tools that enable offline training of task-oriented dialogue systems. | ||
| detecting | 78 | |
| P19-4004 In political science, NLP methods have been extensively used for a number of analyses types and tasks, including inferring policy position of actors from textual evidence, ***** detecting ***** topics in political texts, and analyzing stylistic aspects of political texts (e.g., assessing the role of language ambiguity in framing the political agenda). | ||
| 2020.nlpcovid19-acl.16 Focussing on outcomes that we believe will be useful for Public Health Organizations, we analyse them in three different ways: identifying the topics discussed during the period, ***** detecting ***** rumours, and predicting the source of the tweets. | ||
| 2021.acl-long.528 Inter-GPS first parses the problem text and diagram into formal language automatically via rule-based text parsing and neural object ***** detecting *****, respectively. | ||
| 2021.latechclfl-1.14 Our findings on two major scientific discoveries in chemistry and astronomy of the 18th century reveal that modelling both the introduction and diffusion of scientific terms in a historical corpus as Hawkes Processes allows ***** detecting ***** patterns of influence between authors on a long-term scale. | ||
| N18-1110 When the vulnerable asset is the user, ***** detecting ***** these potential attacks before they cause serious damages is extremely important | ||
| dialect identification | 78 | |
| 2020.wanlp-1.28 This paper presents the ArabicProcessors team's deep learning system designed for the NADI 2020 Subtask 1 (country-level ***** dialect identification *****) and Subtask 2 (province-level ***** dialect identification *****). | ||
| 2020.vardial-1.1 The campaign included three shared tasks each focusing on a different challenge of language and ***** dialect identification *****: Romanian Dialect Identification (RDI), Social Media Variety Geolocation (SMG), and Uralic Language Identification (ULI). | ||
| W19-1406 We participated in all language/***** dialect identification ***** tasks, as well as the Moldavian vs. Romanian cross-dialect topic identification (MRC) task. | ||
| W18-3909 We therefore conclude that our multiple kernel learning method is the best approach to date for Arabic ***** dialect identification *****. | ||
| W19-4629 Arabic ***** dialect identification ***** is an inherently complex problem, as Arabic dialect taxonomy is convoluted and aims to dissect a continuous space rather than a discrete one. | ||
| random fields | 78 | |
| L06-1069 This paper presents a framework for Thai morphological analysis based on the theoretical background of conditional ***** random fields *****. | ||
| 2020.figlang-1.27 In this paper we present a novel resource-inexpensive architecture for metaphor detection based on a residual bidirectional long short-term memory and conditional ***** random fields *****. | ||
| L16-1684 We investigate a variety of algorithms including neural nets, conditional ***** random fields ***** and self-learning techniques in order to find the best-fitted approach to tackle data sparsity. | ||
| W18-2402 This paper presents a method of designing specific high-order dependency factor on the linear chain conditional ***** random fields ***** (CRFs) for named entity recognition (NER). | ||
| W17-2345 A comparative analysis is also done which reveals the complementary behavior of neural networks and conditional ***** random fields ***** in clinical entity detection. | ||
| open-domain QA | 78 | |
| P19-1436 In this paper, we introduce query-agnostic indexable representations of document phrases that can drastically speed up ***** open-domain QA *****. | ||
| 2021.emnlp-main.698 We implement eight representative control methods and ***** open-domain QA ***** methods as baselines. | ||
| N19-1030 We evaluate our model on multiple ***** open-domain QA ***** datasets, notably achieving the level of the state-of-the-art on the AI2 Reasoning Challenge (ARC) dataset. | ||
| 2021.acl-long.164 Furthermore, on ***** open-domain QA ***** (Quasar-T and SearchQA), the combination of the CNN with ALBERT or RoBERTa achieved stronger performance than SOTA and the original TLMs. | ||
| P19-1414 We show that they also improve a state-of-the-art distantly supervised ***** open-domain QA ***** (DS-QA) method on publicly available English datasets, even though the target task is not a why-QA. | ||
| machine translation (MT | 78 | |
| W16-3717 Neural machine translation (NMT) models have recently been shown to be very successful in ***** machine translation (MT *****). | ||
| D19-1353 When performing cross-language information retrieval (CLIR) for lower-resourced languages, a common approach is to retrieve over the output of ***** machine translation (MT *****). | ||
| 2020.eamt-1.18 The improvement in the quality of ***** machine translation (MT *****) for both majority and minority languages in recent years is resulting in its steady adoption. | ||
| 2020.eamt-1.49 Document-level (doc-level) human evaluation of ***** machine translation (MT *****) has raised interest in the community after a few attempts have disproved claims of human parity (Toral et al., 2018; Läubli et al., 2018). | ||
| L12-1592 We describe the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT) which aims to foster research on improved system combination approaches for ***** machine translation (MT *****). | ||
| templates | 77 | |
| 2020.acl-main.531 In order to learn the syntactic structure of the target sentences, we adopt constituency-based parse tree to generate candidate ***** templates *****. | ||
| 1998.amta-papers.4 Specifically we describe a two-step process for creating candidate thematic grids for Mandarin Chinese verbs, using the English verb heading the VP in the subde_nitions to separate senses, and roughly parsing the verb complement structure to match thematic structure ***** templates *****. | ||
| W18-6505 We further propose a training method based on diverse ensembling to encourage models to learn distinct sentence ***** templates ***** during training. | ||
| L16-1307 We describe the limits of using this corpus (size, non-representativeness, similarity of roles across ***** templates *****) and propose a new, partially-annotated corpus in English which remedies some of these shortcomings. | ||
| N18-1003 This is achieved by using entity and template seeds jointly (as opposed to just one as in previous work), by expanding entities and ***** templates ***** in parallel and in a mutually constraining fashion in each iteration and by introducing higher-quality similarity measures for ***** templates ***** | ||
| morphemes | 77 | |
| P19-2060 The proposed scheme focuses on segmenting inflections as single words instead of separating the auxiliary verbs and other ***** morphemes ***** from the stems. | ||
| 2008.amta-papers.19 To reduce the morpheme-level translation ambiguity, we group the ***** morphemes ***** into morpheme phrases and propose the use of domain information for translation candidate selection. | ||
| W19-4610 A challenge in applying natural language processing techniques to these languages is the data sparsity problem that arises from their rich internal morphology, where the substructure is inherently non-concatenative and ***** morphemes ***** are interdigitated in word formation. | ||
| D18-1530 Sandhi splitting is the process of splitting a given compound word into its constituent ***** morphemes *****. | ||
| C16-1033 MA&D is particularly challenging in morphologically rich languages (MRLs), where the ambiguous space-delimited tokens ought to be disambiguated with respect to their constituent ***** morphemes *****, each morpheme carrying its own tag and a rich set of features | ||
| VarDial | 77 | |
| W19-1421 We describe our approaches for the German Dialect Identification (GDI) and the Cuneiform Language Identification (CLI) tasks at the ***** VarDial ***** Evaluation Campaign 2019. | ||
| W17-1201 We present the results of the ***** VarDial ***** Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the ***** VarDial ***** workshop at EACL'2017. | ||
| W18-3901 A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the ***** VarDial ***** workshop proceedings and are referred to in this report. | ||
| W17-1217 This paper presents the cic_ualg's system that took part in the Discriminating between Similar Languages (DSL) shared task, held at the ***** VarDial ***** 2017 Workshop | ||
| 2021.vardial-1.11 This paper describes the system developed by the Laboratoire d'analyse statistique des textes for the Dravidian Language Identification (DLI) shared task of ***** VarDial ***** 2021. | ||
| Seq2Seq | 77 | |
| 2020.findings-emnlp.23 However, ***** Seq2Seq ***** enforces an unnecessary order on the unordered triplets and involves a large decoding length associated with error accumulation. | ||
| 2020.acl-main.705 The finetuned BERT (teacher) is exploited as extra supervision to improve conventional ***** Seq2Seq ***** models (student) for better text generation performance. | ||
| 2021.naacl-main.449 Although the task of dialog response generation is generally seen as a sequence to sequence (***** Seq2Seq *****) problem, researchers in the past have found it challenging to train dialog systems using the standard ***** Seq2Seq ***** models. | ||
| W18-1004 ***** Seq2Seq ***** based neural architectures have become the go-to architecture to apply to sequence to sequence language tasks. | ||
| 2021.acl-long.472 Abstractive summarization for long-document or multi-document remains challenging for the ***** Seq2Seq ***** architecture, as Seq2Seq is not good at analyzing long-distance relations in text. | ||
| classifying | 77 | |
| 2021.wnut-1.34 Stance detection (SD) entails ***** classifying ***** the sentiment of a text towards a given target, and is a relevant sub-task for opinion mining and social media analysis. | ||
| 2020.coling-main.510 On the other hand, a Confusion-Aware Training (CAT) method is proposed to explicitly learn to distinguish relations by playing a pushing-away game between ***** classifying ***** a sentence into a true relation and its confusing relation. | ||
| 2021.smm4h-1.9 We achieved the first rank in both ***** classifying ***** adverse drug effects and COVID-19 self-report tasks. | ||
| 2020.cllrd-1.1 Citizen linguists contribute language data and judgments by participating in research tasks such as ***** classifying ***** regional accents from audio clips, recording audio of picture descriptions and answering personality questionnaires to create baseline data for NLP research into autism and neurodegenerative conditions. | ||
| 2021.sigdial-1.17 We propose a preliminary study, ***** classifying ***** utterances into major, minor and off-topics, which further extends into a system initiative for diversion rectification | ||
| auxiliary | 77 | |
| 2021.emnlp-main.455 This paper presents GradTS, an automatic ***** auxiliary ***** task selection method based on gradient calculation in Transformer-based models. | ||
| 2020.acl-main.495 To remedy that, we train the model with ***** auxiliary ***** supervision and propose particular choices for module architecture that yield much better faithfulness, at a minimal cost to accuracy. | ||
| 2021.naacl-main.42 In this paper, we propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from ***** auxiliary ***** languages to a target one and brings their representation spaces closer for effective transfer. | ||
| 2021.naacl-main.296 Previous works focus on detecting these biases, reducing bias in data representations, and using ***** auxiliary ***** training objectives to mitigate bias during fine-tuning. | ||
| 2020.emnlp-main.572 Based on BERT, we learn domain-invariant feature representations by using part-of-speech features and syntactic dependency relations to construct ***** auxiliary ***** tasks, and jointly perform word-level instance weighting in the framework of sequence labeling | ||
| sequences | 77 | |
| N19-1298 To exploit the representation behind the RbSP structure effectively, we develop a combined deep neural model with a LSTM network on word ***** sequences ***** and a CNN on RbSP. | ||
| 2021.internlp-1.2 The agent is equipped with a learning mechanism for mapping new commands to ***** sequences ***** of simple actions, as well as the ability to incorporate user input into written responses. | ||
| C16-1288 This article proposes a novel character-aware neural machine translation (NMT) model that views the input ***** sequences ***** as ***** sequences ***** of characters rather than words. | ||
| 2020.iwslt-1.33 We also modify the typical framing of this task by predicting punctuation for ***** sequences ***** rather than individual tokens, which makes for more efficient training and inference. | ||
| Q19-1019 We focus on graph-to-sequence learning, which can be framed as transducing graph structures to ***** sequences ***** for text generation | ||
| morphological segmentation | 77 | |
| J18-2005 This article presents a probabilistic hierarchical clustering model for ***** morphological segmentation *****. | ||
| W19-5343 We explore the potential benefits of (i) ***** morphological segmentation ***** (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English–Kazakh data and (iii) synthetic data, both for the source and for the target language. | ||
| W19-4222 We propose unsupervised approaches for ***** morphological segmentation ***** of low-resource polysynthetic languages based on Adaptor Grammars (AG) (Eskander et al., 2016). | ||
| L12-1248 We present an annotation and ***** morphological segmentation ***** scheme for Egyptian Colloquial Arabic (ECA) in which we annotate user-generated content that significantly deviates from the orthographic and grammatical rules of Modern Standard Arabic and thus cannot be processed by the commonly used MSA tools | ||
| W18-5808 However, while LIMS worked best on average and outperforms other state-of-the-art unsupervised ***** morphological segmentation ***** approaches, it did not provide the optimal AG configuration for five out of the six languages. | ||
| positional encoding | 77 | |
| 2020.findings-emnlp.59 Compared with commonly used ***** positional encoding ***** schemes, CSPAN can exploit the interaction between semantics and word positions in a more interpretable and adaptive manner, and the classification performance can be notably improved while simultaneously preserving a compact model size and high convergence rate. | ||
| 2021.wmt-1.12 In the systems submitted, we primarily considered wider networks, deeper networks, relative ***** positional encoding *****, and dynamic convolutional networks in terms of model structure, while in terms of training, we investigated contrastive learning-reinforced domain adaptation, self-supervised training, and optimization objective switching training methods. | ||
| 2019.iwslt-1.20 While absolute and relative ***** positional encoding ***** perform equally strong overall, we show that relative ***** positional encoding ***** is vastly superior (4.4% to 11.9% BLEU) when translating a sentence that is longer than any observed training sentence. | ||
| 2021.ranlp-1.176 This study proposes an utterance position-aware approach for a neural network-based dialogue act recognition (DAR) model, which incorporates ***** positional encoding ***** for utterance's absolute or relative position. | ||
| 2021.ranlp-1.172 To address these issues, we propose global ***** positional encoding ***** for dependency tree, a new scheme that facilitates syntactic relation modeling between any two words with keeping exactness and without immediate neighbor constraint | ||
| monolingual word | 77 | |
| D17-1264 Since our approach relies on the quality of *****monolingual word***** embeddings, we also propose to enhance vector representations of both the source and target language with linguistic information. | ||
| 2020.globalex-1.13 In this paper we describe the system submitted to the ELEXIS *****Monolingual Word***** Sense Alignment Task. | ||
| W17-4211 We leverage press agency newswire and *****monolingual word***** alignment techniques to build meaningful and linguistically varied clusters of articles from the web in the perspective of a broader event type detection task. | ||
| Q18-1014 We propose an unsupervised approach for learning a bilingual dictionary for a pair of languages given their independently-learned *****monolingual word***** embeddings. | ||
| D17-3007 With the increasing use of *****monolingual word***** vectors, there is a need for word vectors that can be used as efficiently across multiple languages as monolingually. | ||
| stochastic | 76 | |
| 2021.emnlp-main.525 Commonly, rationales are modeled as ***** stochastic ***** binary masks, requiring sampling-based gradient estimators, which complicates training and requires careful hyperparameter tuning. | ||
| 2021.eacl-main.185 Designing profitable trading strategies is complex as stock movements are highly ***** stochastic *****; the market is influenced by large volumes of noisy data across diverse information sources like news and social media. | ||
| 1997.iwpt-1.2 The exception of SCFGs seemed promising, with all the hype around Hidden Markov Models and other ***** stochastic ***** methods, but it remained to be confirmed for RNAs longer than 200 bases. | ||
| P19-1580 When pruning heads using a method based on ***** stochastic ***** gates and a differentiable relaxation of the L0 penalty, we observe that specialized heads are last to be pruned. | ||
| 2020.emnlp-main.676 Stock forecasting is complex, given the ***** stochastic ***** dynamics and non-stationary behavior of the market | ||
| disfluency | 76 | |
| Q14-1011 The joint model performed better on both tasks, with a parse accuracy of 90.5% and 84.0% accuracy at ***** disfluency ***** detection. | ||
| N18-1007 We find that different types of acoustic-prosodic features are individually helpful, and together give statistically significant improvements in parse and ***** disfluency ***** detection F1 scores over a strong text-only baseline. | ||
| L06-1340 In addition to the word spoken, the prosodic content of the speech has been proved quite valuable in a variety of spoken language processing tasks such as sentence segmentation and tagging, ***** disfluency ***** detection, dialog act segmentation and tagging, and speaker recognition. | ||
| 2020.findings-emnlp.186 By contrast, this paper aims to investigate the task of end-to-end speech recognition and ***** disfluency ***** removal | ||
| C18-1299 While the *****disfluency***** detection has achieved notable success in the past years, it still severely suffers from the data scarcity. | ||
| grounding | 76 | |
| 2020.acl-main.644 Our examination of the dataset shows that Refer360 manifests linguistically rich phenomena in a language ***** grounding ***** task that poses novel challenges for computational modeling of language, vision, and navigation. | ||
| 2021.naacl-main.105 We identify a set of novel pretraining tasks: column ***** grounding *****, value ***** grounding ***** and column-value mapping, and leverage them to pretrain a text-table encoder. | ||
| W17-5533 This hybrid DM architecture affords incremental processing of uncertain input, a flexible, mixed-initiative information ***** grounding ***** process that can be adapted to users' cognitive capacities and interactive idiosyncrasies, and generic mechanisms that foster transitions in the joint discourse state that are understandable and controllable by those users, in order to effect a robust interaction for users with varying capacities. | ||
| 2020.challengehml-1.5 In this paper, a cross-situational learning based ***** grounding ***** framework is proposed that allows ***** grounding ***** of words and phrases through corresponding percepts without human supervision and online, i.e. it does not require any explicit training phase, but instead updates the obtained mappings for every new encountered situation. | ||
| 2020.emnlp-main.655 This dataset is annotated with pre-existing user knowledge, message-level dialog acts, ***** grounding ***** to Wikipedia, and user reactions to messages | ||
| memes | 76 | |
| 2020.semeval-1.158 The data consist of text extracted from ***** memes ***** and the images of ***** memes *****. | ||
| 2021.dravidianlangtech-1.43 As ***** memes ***** are in images forms with embedded text, it can quickly spread hate, offence and violence. | ||
| 2021.woah-1.21 The task include two subtasks relating to distinct challenges in the fine-grained detection of hateful ***** memes *****: (1) the protected category attacked by the meme and (2) the attack type. | ||
| 2020.semeval-1.116 this paper proposed a parallel-channel model to process the textual and visual information in ***** memes ***** and then analyze the sentiment polarity of ***** memes ***** | ||
| S19-1018 In doing so, it provides a reformalisation (in TTR) of enthy*****memes***** and topoi as networks rather than functions, and information state update rules for conditionals. | ||
| ambiguity | 76 | |
| N18-5011 After a brief introduction to structural ***** ambiguity *****, users are challenged to complete a sentence in a way that tricks the computer into guessing an incorrect interpretation. | ||
| D18-1029 Moreover, fine-grained typological features such as exponence, flexivity, fusion, and inflectional synthesis are borne out to be responsible for the proliferation of low-frequency phenomena which are organically difficult to model by statistical architectures, or for the meaning ***** ambiguity ***** of character n-grams. | ||
| W19-3615 Prepositional Phrase (PP) attachment is a classical problem in NLP for languages like English, which suffer from structural ***** ambiguity *****. | ||
| 1993.iwpt-1.11 However, this analysis is very difficult because of word sense ***** ambiguity ***** and structural ***** ambiguity *****. | ||
| 2020.dmr-1.5 The fact that some arguments are not explicitly mentioned in a sentence gives rise to ***** ambiguity ***** in language understanding, and renders it difficult for machines to interpret text correctly | ||
| vector space | 76 | |
| 2020.semeval-1.30 It consists of preparing a semantic ***** vector space ***** for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and orthogonal transformation;and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus. | ||
| L14-1031 The second method makes use of recent advances in distributional similarity representation to transfer existing norms to their closest neighbors in a high-dimensional ***** vector space *****. | ||
| W18-3003 (iii) Projecting the ***** vector space ***** using Linear Discriminant Analysis, which eliminates the expanded dimension(s) with semantic knowledge. | ||
| P19-1162 While these approaches offer great geometric insights into unintended biases in the embedding ***** vector space *****, they fail to offer an interpretable meaning for how the embeddings could cause discrimination in downstream NLP applications. | ||
| 2021.acl-long.139 Separately embedding the individual knowledge sources into ***** vector space *****s has demonstrated tremendous successes in encoding the respective knowledge, but how to jointly embed and reason with both knowledge sources to fully leverage the complementary information is still largely an open problem. | ||
| large | 76 | |
| L10-1362 Question answering (QA) systems aim at retrieving precise information from a ***** large ***** collection of documents. | ||
| P19-1513 Experiments show that our model outperforms several baselines by a ***** large ***** margin. | ||
| 2020.sltu-1.7 Overall, we show that the proposed multilingual graphemic hybrid ASR with various data augmentation can not only recognize any within training set languages, but also provide ***** large ***** ASR performance improvements. | ||
| C16-1095 In the absence of ***** large ***** annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. | ||
| 2020.acl-main.175 The supervised training of high-capacity models on *****large***** datasets containing hundreds of thousands of document-summary pairs is critical to the recent success of deep learning techniques for abstractive summarization. | ||
| real-world | 76 | |
| 2021.mtsummit-research.9 We revisit the topic of translation direction in the data used for training neural machine translation systems, focusing on a *****real-world***** scenario with known translation direction and imbalances in translation direction: the Canadian Hansard. | ||
| 2020.emnlp-tutorials.5 Understanding spatial semantics expressed in natural language can become highly complex in *****real-world***** applications. | ||
| 2004.amta-papers.3 An adaptable statistical or hybrid MT system relies heavily on the quality of word-level alignments of *****real-world***** data. | ||
| 2020.emnlp-main.526 Existing language models excel at writing from scratch, but many *****real-world***** scenarios require rewriting an existing document to fit a set of constraints. | ||
| N19-2016 In this paper, we investigate the challenges of using reinforcement learning agents for question-answering over knowledge graphs for *****real-world***** applications. | ||
| chatbots | 75 | |
| N19-1091 Thus it is imperative for customer care agents and ***** chatbots ***** engaging with humans to be personal, cordial and emphatic to ensure customer satisfaction and retention. | ||
| 2020.coling-main.429 This hybrid method improves the response quality of ***** chatbots ***** and makes them more controllable and interpretable. | ||
| 2021.mrqa-1.5 In clinical studies, ***** chatbots ***** mimicking doctor-patient interactions are used for collecting information about the patient's health state. | ||
| 2021.acl-long.545 In datasets with intents and example utterances from 200 professional ***** chatbots *****, we saw decreases in the equal error rate (EER) in more than 40% of the ***** chatbots ***** in comparison to the baseline of the same algorithm without the meta-knowledge. | ||
| 2021.humeval-1.9 Dialogue systems like ***** chatbots *****, and tasks like question-answering (QA) have gained traction in recent years; yet evaluating such systems remains difficult | ||
| CLARIN | 75 | |
| L14-1388 All researchers are therefore invited to start using the elements in the ***** CLARIN ***** infrastructure offered by ***** CLARIN *****-NL. | ||
| L08-1045 The necessary infrastructure for managing, exploring and enriching language resources via the Web will need to be delivered by projects like ***** CLARIN ***** and DARIAH. | ||
| L10-1038 I argue that the way the ***** CLARIN *****-NL project has been set-up can serve as an excellent example for other national ***** CLARIN ***** projects, for the following reasons: (1) it is a mix between a programme and a project; (2) it offers opportunities to seriously test standards and protocols currently proposed by ***** CLARIN *****, thus providing evidence-based requirements and desiderata for the ***** CLARIN ***** infrastructure and ensuring compatibility of ***** CLARIN ***** with national data and tools; (3) it brings the intended users (humanities researchers) and the technology providers (infrastructure specialists and language and speech technology researchers) together in concrete cooperation projects, with a central role for the users' research questions, thus ensuring that the infrastructure will provide functionality that is needed by its intended users. | ||
| L08-1414 The complete architecture is designed based on a few well-known components .This is considered the basis for building a research infrastructure for Language Resources as is planned within the ***** CLARIN ***** project. | ||
| L14-1219 Within the ***** CLARIN ***** project Data Curation Service the data was made into a spoken language resource and made available to other researchers | ||
| inflection | 75 | |
| D17-1074 State-of-the-art neural models adapted from the ***** inflection ***** task are able to learn the range of derivation patterns, and outperform a non-neural baseline by 16.4%. | ||
| 2021.emnlp-main.159 Our experiments with the Paradigm Cell Filling Problem over eight typologically different languages show that in languages with relatively simple morphology, orthographic regularities on their own allow ***** inflection ***** models to achieve respectable accuracy. | ||
| W17-4120 We present a novel supervised approach to ***** inflection ***** generation for verbs in Spanish. | ||
| 2021.sigmorphon-1.18 In this framework, the ***** inflection ***** target form is specified by providing an example ***** inflection ***** of another word in the language. | ||
| W19-3714 In a second step, we combine known names with wild cards to increase recognition recall by also capturing ***** inflection ***** variants | ||
| cognate | 75 | |
| W19-4202 We propose ***** cognate ***** projection as a method of crosslingual transfer for inflection generation in the context of the SIGMORPHON 2019 Shared Task. | ||
| W19-3647 Evidence from the speech corpora supports a more complex vocalic inventory than attested in previous auditory/manual-based accounts – thus reinforcing the resourcefulness of the algorithms for the current data and ***** cognate ***** varieties. | ||
| L06-1290 In an age when demand for innovative and motivating language teaching methodologies is at a very high level, TREAT - the Trilingual REAding Tutor - combines the most advanced natural language processing (NLP) techniques with the latest second and third language acquisition (SLA/TLA) research in an intuitive and user-friendly environment that has been proven to help adult learners (native speakers of L1) acquire reading skills in an unknown L3 which is related to (***** cognate ***** with) an L2 they know to some extent. | ||
| 2019.gwc-1.51 In this paper, we detect ***** cognate ***** word pairs among ten Indian languages with Hindi and use deep learning methodologies to predict whether a word pair is ***** cognate ***** or not. | ||
| C18-1134 Bayesian linguistic phylogenies are standardly based on *****cognate***** matrices for words referring to a fixed set of meanings, typically around 100–200. | ||
| algorithms | 75 | |
| W19-4403 Also, the test result of CAT can provide valuable feedback to AIG ***** algorithms *****. | ||
| 2020.findings-emnlp.406 (2018) proposed a meta-algorithm that captures beam-aware training ***** algorithms ***** and suggests new ones, but unfortunately did not provide empirical results. | ||
| L14-1309 On the research side, there has been an increasing interest in ***** algorithms ***** and approaches that are able to capture the polarity of opinions expressed by users on products, institutions and services. | ||
| L12-1092 This research is part of a larger-scale project to produce annotation schemes, language resources, ***** algorithms *****, and applications for Classical and Modern Standard Arabic. | ||
| 2020.eamt-1.19 However this matching and retrieving process is still limited to ***** algorithms ***** based on edit distance which we have identified as a major drawback in Translation Memories systems | ||
| stylistic | 75 | |
| 2021.eacl-main.203 Written language contains ***** stylistic ***** cues that can be exploited to automatically infer a variety of potentially sensitive author information. | ||
| W17-4902 However, applying ***** stylistic ***** variations is still by and large a manual process, and there have been little efforts towards automating it. | ||
| 2021.acl-long.52 Our analysis tests popular wisdom about ***** stylistic ***** elements in high-engagement podcasts, corroborating some pieces of advice and adding new perspectives on others. | ||
| 2021.latechclfl-1.7 In this work, we design an end-to-end model for poetry generation based on conditioned recurrent neural network (RNN) language models whose goal is to learn ***** stylistic ***** features (poem length, sentiment, alliteration, and rhyming) from examples alone. | ||
| D19-1179 The dataset is collected from human annotators with solid control of input denotation: not only preserving original meaning between text, but promoting ***** stylistic ***** diversity to annotators | ||
| task | 75 | |
| 2008.amta-papers.17 This work exposes the limitations of descriptive statistics generally used in this area, mainly correlation analysis, when using automated metrics for assessments in ***** task ***** handling purposes. | ||
| S19-2106 Our best systems in each of the three OffensEval ***** task *****s placed in the middle of the comparative evaluation, ranking 57th of 103 in ***** task ***** A, 39th of 75 in ***** task ***** B, and 44th of 65 in ***** task ***** C. | ||
| 2021.nlp4if-1.12 A total of ten teams submitted systems for ***** task ***** 1, and one team participated in ***** task ***** 2; nine teams also submitted a system description paper. | ||
| 2020.blackboxnlp-1.20 Downstream ***** task *****-based comparisons are often difficult to interpret due to differences in ***** task ***** structure, while probing ***** task ***** evaluations often look at only a few attributes and models. | ||
| P19-1077 One of the key contributions of our proposal is its applicability to the case in which markables are nested, as is the case with coreference markables; but the GWAP and several of the proposed markable detectors are ***** task ***** and language-independent and are thus applicable to a variety of other annotation scenarios | ||
| conceptual | 75 | |
| 2020.coling-main.270 Metaphor as a cognitive mechanism in human's ***** conceptual ***** system manifests itself an effective way for language communication. | ||
| U18-1005 In this paper we propose a novel approach to ***** conceptual ***** modelling where the domain experts will be able to specify and construct a model using a restricted form of natural language. | ||
| 2020.coling-main.173 Researchers have demonstrated that text and image-based representations encode complementary semantic information, which when combined provide a more complete representation of word meaning, in particular when compared with data on human ***** conceptual ***** knowledge. | ||
| S17-2038 The MERALI system approaches ***** conceptual ***** similarity through a simple, cognitively inspired, heuristics; it builds on a linguistic resource, the TTCS-e, that relies on BabelNet, NASARI and ConceptNet | ||
| P19-1010 Combined with a decoder copy mechanism, this approach provides a *****conceptual*****ly simple mechanism to generate logical forms with entities. | ||
| graph convolution | 75 | |
| 2020.emnlp-main.435 Recent studies on event detection (ED) have shown that the syntactic dependency graph can be employed in ***** graph convolution ***** neural networks (GCN) to achieve state-of-the-art performance. | ||
| 2021.emnlp-main.324 Moreover, a novel multi-choice relation constructor is introduced by leveraging ***** graph convolution ***** to capture the dependencies among video moment choices for the best choice selection. | ||
| 2020.findings-emnlp.2 We first organize the original long answer text into a medical concept graph with ***** graph convolution ***** networks to better understand the internal structure of the text and the correlation between medical concepts. | ||
| D19-1582 For this reason, this paper proposes a new method for event detection, which uses a dependency tree based ***** graph convolution ***** network with aggregative attention to explicitly model and aggregate multi-order syntactic representations in sentences. | ||
| D19-5723 After that, we propose a neural network model which consists of the bidirectional long short-term memories and an attention ***** graph convolution ***** neural network to learn relation extraction features from the graph | ||
| wikipedia | 75 | |
| 2020.lrec-1.55 The goal of our research is to test the performance of CQA systems under low-resource conditions which are common for most non-English languages: small amounts of native annotations and other limitations linked to low resource languages, like lack of crowdworkers or smaller ***** wikipedia *****s. | ||
| L12-1195 Contributive resources, such as ***** wikipedia *****, have proved to be valuable in Natural Language Processing or Multilingual Information Retrieval applications. This article focusses on Wiktionary, the dictionary part of the collaborative resources sponsored by the Wikimedia foundation. | ||
| 2020.emnlp-main.674 In this paper we propose Neural ***** wikipedia ***** Quality Monitor (NwQM), a novel deep learning model which accumulates signals from several key information sources such as article text, meta data and images to obtain improved Wikipedia article representation. | ||
| 2020.emnlp-demos.4 We publicize the source code, demonstration, and the pretrained embeddings for 12 languages at https://*****wikipedia*****2vec.github.io/. | ||
| K19-1052 The source code of the proposed model is available online at https://github.com/*****wikipedia*****2vec/*****wikipedia*****2vec. | ||
| word sense induction | 75 | |
| E17-1009 On the example of ***** word sense induction ***** and disambiguation (WSID), we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy. | ||
| 2020.semeval-1.29 This paper presents an approach to lexical semantic change detection based on Bayesian ***** word sense induction ***** suitable for novel word sense identification. | ||
| 2020.coling-main.107 generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including ***** word sense induction ***** and disambiguation, lexical relation extraction, data augmentation, etc. | ||
| D18-1174 In experimental evaluation disambiguated skip-gram improves state-of-the-art results in several ***** word sense induction ***** benchmarks. | ||
| I17-1024 Based on the idea of fuzzy clustering, we introduce a random process to integrate these two types of senses and design two non-parametric methods for ***** word sense induction *****. | ||
| offensive | 75 | |
| 2021.semeval-1.34 Subtask2 is an ***** offensive ***** rating prediction task. | ||
| S19-2116 In the first method, we establish a probabilistic model to evaluate the sentence *****offensive*****ness level and target level according to different sub-tasks. | ||
| 2020.wildre-1.2 This exchange is not free from ***** offensive *****, trolling or malicious contents targeting users or communities. | ||
| 2020.osact-1.17 For that purpose, we develop an effective method for automatic data augmentation and show the utility of training both ***** offensive ***** and hate speech models off (i.e., by fine-tuning) previously trained affective models (i.e., sentiment and emotion). | ||
| S19-2110 OffensEval addresses the problem of identifying and categorizing ***** offensive ***** language in social media in three subtasks; whether or not a content is ***** offensive ***** (subtask A), whether it is targeted (subtask B) towards an individual, a group, or other entities (subtask C). | ||
| Natural Language Processing (NLP) | 75 | |
| 2021.hackashop-1.7 *****Natural Language Processing (NLP)***** is defined by specific, separate tasks, with each their own literature, benchmark datasets, and definitions. | ||
| P19-2041 Using pre-trained word embeddings in conjunction with Deep Learning models has become the de facto approach in *****Natural Language Processing (NLP)*****. | ||
| N18-2019 Hate speech detection is a critical, yet challenging problem in *****Natural Language Processing (NLP)*****. | ||
| 2020.nl4xai-1.4 In this paper we report progress on a novel explainable artificial intelligence (XAI) initiative applying *****Natural Language Processing (NLP)***** with elements of codesign to develop a text classifier for application in psychotherapy training. | ||
| 2020.vardial-1.1 This paper presents the results of the VarDial Evaluation Campaign 2020 organized as part of the seventh workshop on *****Natural Language Processing (NLP)***** for Similar Languages, Varieties and Dialects (VarDial), co-located with COLING 2020. | ||
| reproducibility | 74 | |
| 2021.argmining-1.5 al, 2019), and systematically analyze its ***** reproducibility *****. | ||
| 2020.emnlp-main.437 To facilitate ***** reproducibility ***** and future work, we release our code and trained models. | ||
| 2021.emnlp-main.531 Human evaluation for summarization tasks is reliable but brings in issues of ***** reproducibility ***** and high costs. | ||
| 2020.inlg-1.29 However, replicability of human evaluation experiments and ***** reproducibility ***** of their results is currently under-addressed, and this is of particular concern for NLG where human evaluations are the norm. | ||
| 2021.acl-long.384 For ***** reproducibility *****, we demonstrate similar benefits on the publicly available AMI dataset | ||
| cognates | 74 | |
| L06-1420 From a set of known ***** cognates *****, the method induces rules capturing regularities of orthographic mutations that a word undergoes when migrating from one language into the other. | ||
| P17-1181 Using global constraints to perform rescoring is complementary to state of the art methods for performing ***** cognates ***** detection and results in significant performance improvements beyond current state of the art performance on publicly available datasets with different language pairs and various conditions such as different levels of baseline state of the art performance and different data size conditions, including with more realistic large data size conditions than have been evaluated with in the past. | ||
| 2019.gwc-1.51 We also observe the behaviour of, to an extent, unrelated Indian language pairs and release the lists of detected ***** cognates ***** among them as well. | ||
| D18-1320 In this work, we propose a multimodal approach to predict the pronunciation of Cantonese logographic characters, using neural networks with a geometric representation of logographs and pronunciation of ***** cognates ***** in historically related languages. | ||
| W19-4720 By comparing current meanings of ***** cognates ***** in different languages, we hope to uncover information about their previous meanings, and about how they diverged in their respective languages from their common original etymon | ||
| DNN | 74 | |
| 2017.iwslt-1.9 The individual subsystems are built by using different speaker-adaptive feature combination (e.g., lMEL with i-vector or bottleneck speaker vector), acoustic models (GMM or ***** DNN *****) and speaker adaptation (MLLR or fMLLR). | ||
| P17-2008 Deep Neural Network (***** DNN *****) may also be used to provide uncertainty using Monte-Carlo Dropout (MCD). | ||
| 2021.rocling-1.30 The masking-based speech enhancement method pursues a multiplicative mask that applies to the spectrogram of input noise-corrupted utterance, and a deep neural network (***** DNN *****) is often used to learn the mask. | ||
| W17-5047 In order to incorporate the various kinds of text-based features and a speech-based i-vector feature, we design two ***** DNN ***** based ensemble classifiers for late fusion and early fusion, respectively | ||
| 2016.iwslt-1.8 The best performance in word error rate (WER) was achieved when the English language was used as the source one in the multi-task MLAN scheme, achieving a relative improvement of 9.4% in respect to the baseline *****DNN***** model. | ||
| FEVER | 74 | |
| 2021.emnlp-main.558 In several cross-domain experiments between the ***** FEVER ***** and FNC fact verification datasets, we show that our approach learns the best delexicalization strategy for the given training dataset, and outperforms state-of-the-art classifiers that rely on the original data. | ||
| 2020.emnlp-main.629 Fact-verification systems are well explored in the NLP literature with growing attention owing to shared tasks like ***** FEVER *****. | ||
| 2020.acl-main.549 We evaluate our system on ***** FEVER *****, a benchmark dataset for fact checking, and find that rich structural information is helpful and both our graph-based mechanisms improve the accuracy. | ||
| 2021.ranlp-1.56 The respective F1 scores after applying the proposed method on ***** FEVER ***** 1.0 and ***** FEVER ***** 2.0 datasets are 0.65±0.018 and 0.65±0.051. | ||
| 2021.naacl-main.121 The document recall of WikiAPI retriever (Hanselowski et al., 2018) which is 90.0% on ***** FEVER *****, drops to 72.2% on the colloquial claims | ||
| oracle | 74 | |
| L14-1205 For several tenses, such as the French “imparfait”, the tense-aware SMT system improves significantly over the baseline and is closer to the ***** oracle ***** system. | ||
| P18-2075 For parsers where a dynamic ***** oracle ***** is available (including a novel ***** oracle ***** which we define for the transition system of Dyer et al., 2016), policy gradient typically recaptures a substantial fraction of the performance gain afforded by the dynamic ***** oracle *****. | ||
| N19-1018 Moreover, we introduce a provably correct dynamic ***** oracle ***** for the new transition system, and present the first experiments in discontinuous constituency parsing using a dynamic ***** oracle *****. | ||
| D18-1264 In this paper, we propose a new rich resource enhanced AMR aligner which produces multiple alignments and a new transition system for AMR parsing along with its *****oracle***** parser. | ||
| E17-1037 To analyze the limitations and the future directions of the extractive summarization paradigm, this paper proposes an Integer Linear Programming (ILP) formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also propose an algorithm that enumerates all of the *****oracle***** summaries for a set of reference summaries to exploit F-measures that evaluate which system summaries contain how many sentences that are extracted as an oracle summary. | ||
| citation | 74 | |
| 2020.wosp-1.11 Identification of the purpose and influence of ***** citation ***** is significant in assessing the impact of a publication. | ||
| R17-1002 Current ***** citation ***** networks, which link papers by ***** citation ***** relationships (reference and citing paper), are useful to quantitatively understand the value of a piece of scientific work, however they are limited in that they do not provide information about what specific part of the reference paper the citing paper is referring to. | ||
| 2020.sdp-1.17 In this work we leverage the open access ACL Anthology collection in combination with the Semantic Scholar bibliometric database to create a large corpus of scholarly documents with associated ***** citation ***** information and we propose a new ***** citation ***** prediction model called SChuBERT | ||
| 2020.sdp-1.11 While most previous approaches represent context using solely text surrounding the ***** citation *****, we propose enhancing context representation with global information. | ||
| L10-1326 The speech analysis/synthesis algorithm is based in the Multiband Excitation technique, but uses a novel phase information representation, the Relative Phase Shift (RPSs). | ||
| social media text | 74 | |
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, unprocessed, ***** social media text *****. | ||
| N18-4018 While some work has been done on code-mixed ***** social media text ***** and in emotion prediction separately, our work is the first attempt which aims at identifying the emotion associated with Hindi-English code-mixed ***** social media text *****. | ||
| W18-4410 Using the encoding generated by Emoti-KATE, a 3-way classification is performed for every ***** social media text ***** in the dataset. | ||
| W18-5903 Our experiments show promising results for identifying depression from ***** social media text *****s. | ||
| W18-1105 While relevant research has been done independently on code-mixed ***** social media text *****s and hate speech detection, our work is the first attempt in detecting hate speech in Hindi-English code-mixed ***** social media text *****. | ||
| style | 74 | |
| L06-1036 It is also our goal to relate both prosody and pragmatics to emotion, ***** style ***** and attitude. | ||
| 2020.coling-main.197 This assumes that it is possible to separate ***** style ***** from content. | ||
| L12-1611 We focus specifically on speech and gesture interaction which can enhance the quality of lifestyle of people living in assistive environments, be they seniors or people with physical or cognitive disabilities. | ||
| W19-2309 By measuring ***** style ***** transfer quality, meaning preservation, and the fluency of generated outputs, we demonstrate that our method is able both to produce high-quality output while maintaining the flexibility to suggest syntactically rich stylistic edits. | ||
| 2012.amta-monomt.7 We aim to use statistical machine translation technology to correct grammar errors and *****style***** issues in monolingual text. | ||
| hierarchical phrase | 74 | |
| 2008.iwslt-papers.7 In this paper, we investigate several soft constraints in the extraction of *****hierarchical phrases***** and whether these help as additional scores in the decoding to prune unneeded phrases. | ||
| 2011.iwslt-evaluation.21 Our model is considerably more compact and produces slightly higher BLEU scores than the original *****hierarchical phrase*****-based model in Japanese-English translation on the parallel corpus of the NTCIR-7 patent translation task. | ||
| 2009.iwslt-evaluation.10 We developed three different systems: a statistical phrase-based system using the Moses toolkit, an Statistical Post-Editing system and a *****hierarchical phrase*****-based system based on Joshua. | ||
| 2011.iwslt-evaluation.4 Furthermore, we explore target-side syntactic augmentation for an *****Hierarchical Phrase*****-Based (HPB) SMT model. | ||
| 2016.iwslt-1.5 For ten directions we also include *****hierarchical phrase*****-based MT. | ||
| web | 74 | |
| Q15-1011 We explore whether web links can replace a curated encyclopaedia, obtaining entity prior, name, context, and coherence models from a corpus of *****web***** pages with links to Wikipedia. | ||
| 2011.freeopmt-1.8 This document describes a project aimed at building a new *****web***** interface to the Apertium machine translation platform, including pre-editing and post-editing environments. | ||
| 2009.freeopmt-1.9 Some machine translation services like Google Ajax Language API have become very popular as they make the collaboratively created contents of the *****web***** 2.0 available to speakers of many languages. | ||
| 2020.ccl-1.106 Clickbait is a form of *****web***** content designed to attract attention and entice users to click on specific hyperlinks. | ||
| 2021.acl-demo.5 ASCENT is a fully automated methodology for extracting and consolidating commonsense assertions from *****web***** contents (Nguyen et al., 2021). | ||
| concatenation | 73 | |
| L08-1023 This framework combines a metagrammar compiler and a parser based on range ***** concatenation ***** grammar (RCG) to respectively check the consistency and the correction of the grammar. | ||
| D19-6011 We have experimented both (a) improving the fine-tuning of pre-trained language models on a task with a small dataset size, by leveraging datasets of similar tasks; and (b) incorporating the distributional representations of a KG onto the representations of pre-trained language models, via simply ***** concatenation ***** or multi-head attention. | ||
| N18-1118 A simple strategy of decoding the ***** concatenation ***** of the previous and current sentence leads to good performance, and our novel strategy of multi-encoding and decoding of two sentences leads to the best performance (72.5% for coreference and 57% for coherence/cohesion), highlighting the importance of target-side context. | ||
| 2021.bucc-1.3 We measure the accuracy of our methods in low-resource settings by comparing the results against manually curated test data for English–Icelandic, and by evaluating an MT system trained on the ***** concatenation ***** of the parallel data extracted by our approach and an existing data set. | ||
| 2021.iwpt-1.23 We also adopt a finetuning strategy where we first train a language-generic parser on the ***** concatenation ***** of data from all available languages, and then, in a second step, finetune on each individual language separately | ||
| debiasing | 73 | |
| 2021.ltedi-1.5 In this paper, we recommend six measures and one augment measure based on the observations of the bias in data, annotations, text representations and ***** debiasing ***** techniques. | ||
| N19-1061 We present a series of experiments to support this claim, for two ***** debiasing ***** methods. | ||
| 2021.woah-1.12 We empirically show that our method yields lower false positive rate in both lexical and dialectal attributes than previous ***** debiasing ***** methods. | ||
| 2020.nlposs-1.5 FEE will aid practitioners in fast track analysis of existing ***** debiasing ***** methods on their embedding models. | ||
| 2021.wassa-1.6 Our findings are: (a) individual classifiers for topic and author gender are indeed biased; (b) ***** debiasing ***** with adversarial training works for topic, but breaks down for author gender; (c) gender ***** debiasing ***** results differ across languages | ||
| LR | 73 | |
| 1999.mtsummit-1.52 The paper aims at providing an overview of the situation of Language Resources (***** LR *****) in Europe, in particular as emerging from a few European projects regarding the construction of large-scale harmonised resources to be used for many applicative purpose, also of multilingual nature. | ||
| L06-1039 As yet, these are 14 ***** LR ***** in total: two training SLR for ASR (English and Spanish), three development ***** LR ***** and three evaluation ***** LR ***** for ASR (English, Spanish, Mandarin), and three development ***** LR ***** and three evaluation ***** LR ***** for SLT (English-Spanish, Spanish-English, Mandarin-English). | ||
| L10-1642 The accidental re-creation of an ***** LR ***** that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. | ||
| L12-1505 Such analysis will also help anticipate and forecast sustainability for a ***** LR ***** before taking any decisions concerning design and production | ||
| 2000.iwpt-1.21 The first published *****LR***** algorithm for Tree Adjoining Grammars (TAGs [Joshi and Schabes, 1996]) was due to Schabes and Vijay-Shanker [1990]. | ||
| OffensEval | 73 | |
| 2020.semeval-1.252 We present our submission and results for SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (***** OffensEval ***** 2020) where we participated in offensive tweet classification tasks in English, Arabic, Greek, Turkish and Danish. | ||
| 2020.semeval-1.284 As an example we present the Danish classifier Smatgrisene, our contribution to the recent ***** OffensEval ***** Challenge 2020. | ||
| S19-2010 We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (***** OffensEval *****). | ||
| S19-2107 The ***** OffensEval ***** shared task includes three sub-tasks namely Offensive language identification, Automatic categorization of offense types and Offense target identification. | ||
| 2020.semeval-1.211 This paper describes the participation of SINAI team at Task 12: ***** OffensEval ***** 2: Multilingual Offensive Language Identification in Social Media | ||
| idiomatic | 73 | |
| C16-1259 Some expressions can be ambiguous between ***** idiomatic ***** and literal interpretations depending on the context they occur in, e.g., `sales hit the roof' vs. `hit the roof of the car'. | ||
| L16-1135 Infinitive-verb compounds form a challenge for writers of German, because spelling regulations are different for literal and ***** idiomatic ***** uses. | ||
| D19-1545 Programmers typically organize executable source code using high-level coding patterns or ***** idiomatic ***** structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens. | ||
| L08-1259 The paper demonstrates this approach on different MWE types, starting from simple syntactic structures, followed by more complicated cases and including fully ***** idiomatic ***** expressions. | ||
| W18-4910 We then discuss challenges in translating ***** idiomatic ***** MD text that led to creating multi-word expression lexicon entries whose meanings could not be fully derived from the individual words | ||
| prosody | 73 | |
| L12-1600 Further, they are statistically analyzed and put into context in a multimodal feature analysis, involving measures of ***** prosody *****, voice quality and motion energy. | ||
| L16-1332 This paper describes the recording of a speech corpus focused on ***** prosody ***** of people with intellectual disabilities. | ||
| N18-1007 For this study with known sentence boundaries, error analyses show that the main benefit of acoustic-prosodic features is in sentences with disfluencies, attachment decisions are most improved, and transcription errors obscure gains from ***** prosody *****. | ||
| L10-1317 On the other hand, two systems get significantly better results than the rest: one is based on statistical parametric synthesis and the other one is a concatenative system that makes use of a sinusoidal model to modify both ***** prosody ***** and smooth spectral joints. | ||
| W16-4017 On these, we intend to build our model of free verse ***** prosody *****, which will help to understand, differentiate and relate the different styles of free verse poetry | ||
| recent advances | 73 | |
| L14-1031 The second method makes use of ***** recent advances ***** in distributional similarity representation to transfer existing norms to their closest neighbors in a high-dimensional vector space. | ||
| 2021.acl-long.2 We construct a dataset containing thousands of funny papers and use it to learn classifiers, combining findings from psychology and linguistics with ***** recent advances ***** in NLP. | ||
| 2020.nlposs-1.11 Despite the ***** recent advances ***** in applying language-independent approaches to various natural language processing tasks thanks to artificial intelligence, some language-specific tools are still essential to process a language in a viable manner. | ||
| 2020.coling-main.115 Even with the ***** recent advances ***** in NLP research space, the state-of-the-art QA systems fall short of understanding implicit intents of real-world Business Intelligence (BI) queries in enterprise systems, since Natural Language Understanding still remains an AI-hard problem. | ||
| N18-6003 In the third part, we describe ***** recent advances ***** in knowledge base reasoning. | ||
| labeled data | 73 | |
| 2020.coling-main.435 One critical issue of zero anaphora resolution (ZAR) is the scarcity of ***** labeled data *****. | ||
| C16-1249 However, the classification performance greatly suffers when the size of the ***** labeled data ***** is limited. | ||
| 2020.findings-emnlp.98 In the proposed study, we make the first attempt to train the video captioning model on ***** labeled data ***** and un***** labeled data ***** jointly, in a semi-supervised learning manner. | ||
| D19-5722 Although both approaches are unsupervised, in the sense that they do not need any ***** labeled data *****, they achieved promising results. | ||
| 2021.naacl-main.159 Active Learning (AL) strategies reduce the need for huge volumes of ***** labeled data ***** by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. | ||
| number | 73 | |
| 2006.bcs-1.1 Processing of Colloquial Arabic is a relatively new area of research, and a ***** number ***** of interesting challenges pertaining to spoken Arabic dialects arise. | ||
| 2020.sdp-1.20 We were able to obtain ~267,000 unique research papers through our fully-automated framework using ~76,000 queries, resulting in almost 200,000 more papers than the ***** number ***** of queries. | ||
| D17-1084 It first retrieves a few relevant equation system templates and aligns ***** number *****s in math word problems to those templates for candidate equation generation. | ||
| 2019.iwslt-1.26 We study here a related setting, multi-domain adaptation, where the ***** number ***** of domains is potentially large and adapting separately to each domain would waste training resources. | ||
| E17-2054 We show that a model capitalizing on a `fuzzy' measure of similarity is effective for learning quantifiers, whereas the learning of exact cardinals is better accomplished when information about ***** number ***** is provided. | ||
| search | 73 | |
| L06-1015 That is, the retrieved documents from both systems are shown to the judges without any information about the ***** search ***** techniques. | ||
| 2006.bcs-1.1 Processing of Colloquial Arabic is a relatively new area of research, and a number of interesting challenges pertaining to spoken Arabic dialects arise. | ||
| 2020.sdp-1.20 We were able to obtain ~267,000 unique research papers through our fully-automated framework using ~76,000 queries, resulting in almost 200,000 more papers than the number of queries. | ||
| 2020.sdp-1.2 I will discuss the status and future of arXiv, and possibilities and plans to make more effective use of the research database to enhance ongoing research efforts. | ||
| 2020.acl-main.51 The problem of comparing two bodies of text and ***** search *****ing for words that differ in their usage between them arises often in digital humanities and computational social science. | ||
| Natural Language Processing (NLP) | 73 | |
| 2020.emnlp-main.19 Transformer models have advanced the state of the art in many *****Natural Language Processing (NLP)***** tasks. | ||
| 2021.acl-long.150 *****Natural Language Processing (NLP)***** systems learn harmful societal biases that cause them to amplify inequality as they are deployed in more and more situations. | ||
| W18-5412 Sequential neural networks models are powerful tools in a variety of *****Natural Language Processing (NLP)***** tasks. | ||
| N19-1157 Brown and Exchange word clusters have long been successfully used as word representations in *****Natural Language Processing (NLP)***** systems. | ||
| 2020.acl-main.686 Transformers are ubiquitous in *****Natural Language Processing (NLP)***** tasks, but they are difficult to be deployed on hardware due to the intensive computation. | ||
| extractor | 72 | |
| 2020.emnlp-main.516 Our system uses a supervised NER model trained on the source domain, as a feature ***** extractor *****. | ||
| 2020.fever-1.7 We propose, instead, a model-agnostic framework that consists of two modules: (1) a span ***** extractor *****, which identifies the crucial information connecting claim and evidence; and (2) a classifier that combines claim, evidence, and the extracted spans to predict the veracity of the claim. | ||
| L12-1198 In addition to this, we assigned to the extracted entries of the lexicon a confidence score based on the relative frequency and evaluated the ***** extractor ***** on domain specific data. | ||
| D19-1030 By representing all entity mentions, event triggers, and contexts into this complex and structured multilingual common space, using graph convolutional networks, we can train a relation or event ***** extractor ***** from source language annotations and apply it to the target language. | ||
| R19-1136 Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature ***** extractor *****, (2) that the word embeddings in encoder-free models are competitive to those in conventional models, (3) that non-contextualized source representations lead to a big performance drop, and (4) that encoder-free models have different effects on alignment quality for German-English and Chinese-English | ||
| disambiguate | 72 | |
| L10-1425 One reason for this may be that few annotated resources are available that ***** disambiguate ***** expressions in context. | ||
| W16-4914 Ultimately, we aim to prove the viability of a new integrated rule-based MT approach to ***** disambiguate ***** students' intended meaning in a CALL system. | ||
| 2020.lrec-1.474 Additionally, the method developed uses information about the token irrespective of its context unlike most of the other techniques that heavily rely on the word's context to ***** disambiguate ***** its set of candidate analyses. | ||
| W17-4603 Results suggest that comma-ambiguous sentences are easier to ***** disambiguate ***** than PP-attachment-ambiguous sentences, possibly due to the presence of clear prosodic boundaries, namely silent pauses. | ||
| 2020.findings-emnlp.405 In this paper, we propose a meta-learning framework for few-shot word sense disambiguation (WSD), where the goal is to learn to ***** disambiguate ***** unseen words from only a few labeled instances | ||
| Wiktionary | 72 | |
| L14-1114 We describe a method to automatically extract a German lexicon from ***** Wiktionary ***** that is compatible with the finite-state morphological grammar SMOR. | ||
| 2020.lrec-1.572 In this paper, we present MucLex, a German lexicon for the Natural Language Generation task of surface realisation, based on the crowd-sourced online lexicon ***** Wiktionary *****. | ||
| L16-1498 ***** Wiktionary ***** is a large-scale resource for cross-lingual lexical information with great potential utility for machine translation (MT) and many other NLP tasks, especially automatic morphological analysis and generation. | ||
| L08-1139 We present two application programming interfaces for Wikipedia and ***** Wiktionary ***** which are especially designed for mining the rich lexical semantic information dispersed in the knowledge bases, and provide efficient and structured access to the available knowledge. | ||
| L14-1469 This paper introduces GLAFF, a large-scale versatile French lexicon extracted from ***** Wiktionary *****, the collaborative online dictionary | ||
| leveraging | 72 | |
| 2020.findings-emnlp.249 We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically ***** leveraging ***** relevance judgments in English to improve search in non-English languages. | ||
| 2021.ranlp-1.168 Extensive experiments conducted on two available propagandist resources (i.e., NLP4IF'19 and SemEval'20-Task 11 datasets) show that the proposed approach, ***** leveraging ***** different language models and the investigated linguistic features, achieves very promising results on propaganda classification, both at sentence- and at fragment-level. | ||
| 2020.emnlp-main.481 Our methodology enables any software developer to add a new language capability to a QA system for a new domain, ***** leveraging ***** machine translation, in less than 24 hours. | ||
| 2020.acl-main.226 In this paper, we propose a framework to decouple the challenge and address these three aspects respectively, ***** leveraging ***** the power of existing large-scale pre-trained models such as BERT and GPT-2. | ||
| 2020.acl-main.202 In particular, we study several distillation strategies and propose a stage-wise optimization scheme ***** leveraging ***** teacher internal representations, that is agnostic of teacher architecture, and show that it outperforms strategies employed in prior works | ||
| correctness | 72 | |
| 2021.sigdial-1.53 Evaluation results on factual ***** correctness ***** suggest such coreference-aware models are better at tracing the information flow among interlocutors and associating accurate status/actions with the corresponding interlocutors and person mentions. | ||
| 2004.amta-papers.19 We also describe methods that utilize the question sentence available to a question-answering system to improve translation ***** correctness *****. | ||
| 2020.acl-main.458 We further propose a training strategy which optimizes a neural summarization model with a factual ***** correctness ***** reward via reinforcement learning. | ||
| 2000.iwpt-1.13 In this paper we will present a bottom-up parsing method for Minimalist Grammars, prove its ***** correctness *****, and discuss its complexity. | ||
| 2021.sigdial-1.21 The ***** correctness ***** obtained by the three types of CRPs were consistent with the results of the subjective assessment | ||
| CAT | 72 | |
| W19-4403 The results show that the combination of AIG and ***** CAT ***** can construct test items efficiently and reduce test cost significantly. | ||
| W19-8711 This paper analyzes the current situation in the translation industry in respect to those tools and their relationship with ***** CAT ***** tools. | ||
| 2021.calcs-1.19 Even after controlling for the extra training data introduced, ***** CAT ***** improves model accuracy when the model is prevented from relying on lexical overlaps (+3.45), with a negligible drop (-0.15 points) in performance on the original XNLI test set. | ||
| 2012.amta-tutorials.4 This tutorial will present a survey of how machine translation is integrated into current *****CAT***** tools and illustrate how the technology can be used appropriately and profitably by the professional translator. | ||
| 2003.mtsummit-tttt.5 This paper describes the approach used for introducing *****CAT***** tools and MT systems into a course offered in translation curricula at the Université de Montréal (Canada). | ||
| OntoNotes | 72 | |
| L16-1145 Inspired by the ***** OntoNotes ***** approach with adaptations to the tasks to reflect the goals and scope of the BOLT project, this effort has introduced more annotation types of informal and free-style genres in English, Chinese and Egyptian Arabic. | ||
| W17-1507 The CORBON 2017 Shared Task, organised as part of the Coreference Resolution Beyond ***** OntoNotes ***** workshop at EACL 2017, presented a new challenge for multilingual coreference resolution: we offer a projection-based setting in which one is supposed to build a coreference resolver for a new language exploiting little or even no knowledge of it, with our languages of interest being German and Russian. | ||
| 2021.nodalida-main.14 We evaluate this resource and demonstrate its compatibility with the English ***** OntoNotes ***** annotations by training state-of-the-art mono-, bi- and multilingual deep learning models, finding both that the corpus allows highly accurate recognition of ***** OntoNotes ***** types at 93% F-score and that a comparable level of tagging accuracy can be achieved by a bilingual Finnish-English NER model. | ||
| D19-1588 We apply BERT to coreference resolution, achieving a new state of the art on the GAP (+11.5 F1) and ***** OntoNotes ***** (+3.9 F1) benchmarks. | ||
| D17-1282 This method surpasses current state-of-the-art on ***** OntoNotes ***** 5.0 with automatically generated parses | ||
| chunking | 72 | |
| 1995.iwpt-1.10 The ***** chunking ***** and raising actions can be done in linear time. | ||
| L08-1408 The current version of the tool suite provides functions ranging from tokenization to ***** chunking ***** and Named Entity Recognition (NER). | ||
| L08-1457 In this paper we discuss a rule-based approach to ***** chunking ***** sentences in Croatian, implemented using local regular grammars within the NooJ development environment. | ||
| Q14-1016 Our approach generalizes a standard ***** chunking ***** representation to encode MWEs containing gaps, thereby enabling efficient sequence tagging algorithms for feature-rich discriminative models. | ||
| 2003.mtsummit-papers.43 However, we will show that the sentence partitioning has little side effect, if any, in our approach, because we use only the ***** chunking ***** results for the transfer | ||
| Written | 72 | |
| L14-1360 Six domains in Balanced Corpus of Contemporary ***** Written ***** Japanese have part-of-speech and pronunciation annotation as well. | ||
| 2021.eacl-main.203 *****Written***** language contains stylistic cues that can be exploited to automatically infer a variety of potentially sensitive author information. | ||
| W18-1113 *****Written***** text transmits a good deal of nonverbal information related to the author's identity and social factors, such as age, gender and personality. | ||
| S18-1023 *****Written***** communication lacks the multimodal features such as posture, gesture and gaze that make it easy to model affective states. | ||
| 2020.semeval-1.190 This paper describes the system designed by ERNIE Team which achieved the first place in SemEval-2020 Task 10: Emphasis Selection For *****Written***** Text in Visual Media. | ||
| parallel data | 72 | |
| 2020.wmt-1.51 Finally, we make use of additional monolingual data by creating synthetic ***** parallel data ***** through back-translation. | ||
| W18-6315 While Phrase-Based MT can seamlessly integrate very large language models trained on billions of sentences, the best option for Neural MT developers seems to be the generation of artificial ***** parallel data ***** through back-translation - a technique that fails to fully take advantage of existing datasets. | ||
| 2017.iwslt-1.12 The proposed approach is based on localized embedding projection of distributed representations which utilizes monolingual embeddings and approximate nearest neighbors queries to transform ***** parallel data ***** across language variants. | ||
| 2021.bsnlp-1.8 This will be the first ***** parallel data *****set in this domain, and one of the first Simple Russian datasets in general. | ||
| 2021.acl-long.73 Upon the availability of English AMR dataset and English-to- X ***** parallel data *****sets, in this paper we propose a novel cross-lingual pre-training approach via multi-task learning (MTL) for both zeroshot AMR parsing and AMR-to-text generation. | ||
| lexical semantic | 72 | |
| D19-1357 While the meanings of defining words are important in dictionary definitions, it is crucial to capture the ***** lexical semantic ***** relations between defined words and defining words. | ||
| L14-1499 The annotation will serve as training and test data for classifiers for CMCs, and the CMC definitions developed throughout this study will be used in extending VerbNet to handle representations of sentences in which a verb is used in a syntactic context that is atypical for its ***** lexical semantic *****s. | ||
| 2003.jeptalnrecital-poster.17 However, the theory of contextual ***** lexical semantic *****s implies that larger segments of text, namely non-compositional multiwords, are more appropriate for this role. | ||
| 2019.gwc-1.32 An effective conversion method was proposed in the literature to obtain a ***** lexical semantic ***** space from a ***** lexical semantic ***** graph, thus permitting to obtain WordNet embeddings from WordNets. | ||
| W18-4903 I argue that a lexicon-free ***** lexical semantic *****s—defined in terms of units and supersense tags—is an appetizing direction for NLP, as it is robust, cost-effective, easily understood, not too language-specific, and can serve as a foundation for richer semantic structure. | ||
| multi-hop question | 72 | |
| N19-1405 Ideally, a model should not be able to perform well on a *****multi-hop question***** answering task without doing multi-hop reasoning. | ||
| D19-1455 First, the controller can softly decompose the *****multi-hop question***** into multiple single-hop sub-questions to promote compositional reasoning behavior of the main network. | ||
| 2020.emnlp-main.712 Has there been real progress in *****multi-hop question*****-answering? | ||
| 2020.findings-emnlp.416 *****Multi-hop Question***** Generation (QG) aims to generate answer-related questions by aggregating and reasoning over multiple scattered evidence from different paragraphs. | ||
| 2020.insights-1.12 Large pretrained language models (LM) have been used successfully for *****multi-hop question***** answering. | ||
| discontinuous | 71 | |
| J17-3001 This permits exploration of a large variety of parsing algorithms for ***** discontinuous ***** structures, with different properties. | ||
| 2020.acl-main.520 Through extensive experiments on three biomedical data sets, we show that our model can effectively recognize ***** discontinuous ***** mentions without sacrificing the accuracy on continuous mentions. | ||
| 2020.iwpt-1.9 We present new experiments that transfer techniques from Probabilistic Context-free Grammars with Latent Annotations (PCFG-LA) to two grammar formalisms for ***** discontinuous ***** parsing: linear context-free rewriting systems and hybrid grammars. | ||
| 2000.iwpt-1.20 We also briefly hint at future lines of research aimed at more efficient ways of probabilistic parsing with ***** discontinuous ***** constituents. | ||
| S19-2001 UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), ***** discontinuous ***** structures and non-terminal nodes corresponding to complex semantic units | ||
| punctuation | 71 | |
| C18-1293 The results support our hypothesis that ***** punctuation ***** marks are persistent and robust indicators of the native language of the author, which do not diminish in influence even when a high proficiency level in a non-native language is achieved. | ||
| Q19-1023 We show that our generative model can be used to beat baselines on ***** punctuation ***** restoration. | ||
| L14-1651 When adapting a model to another dataset the most useful feature was ***** punctuation *****. | ||
| 2011.iwslt-evaluation.9 The automatic speech transcription input usually has no or wrong ***** punctuation ***** marks, therefore these marks were especially removed from the source training data for the SLT system training. | ||
| R19-1013 For example, we achieve a BLEU score of 26.70 on the IWSLT15 English–Vietnamese translation task simply by using relative differences in ***** punctuation ***** as a regularizer | ||
| Simultaneous | 71 | |
| 2020.emnlp-main.184 ***** Simultaneous ***** machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible. | ||
| C16-2007 ***** Simultaneous ***** interpretation allows people to communicate spontaneously across language boundaries, but such services are prohibitively expensive for the general public | ||
| 2020.emnlp-tutorials.6 *****Simultaneous***** translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, legal proceedings, and medicine. | ||
| 2020.ngt-1.20 We describe our submission to the 2020 Duolingo Shared Task on *****Simultaneous***** Translation And Paraphrase for Language Education (STAPLE). | ||
| P19-1289 *****Simultaneous***** translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. | ||
| tweet | 71 | |
| W18-6124 For example, in ***** tweet ***** stance classification – where a ***** tweet ***** is categorized according to a viewpoint it espouses – the expressed viewpoint depends on latent beliefs held by the user. | ||
| N19-1140 Using our predicted severity scores, we show that it is possible to achieve a Precision@50 of 0.86 when forecasting high severity vulnerabilities, significantly outperforming a baseline that is based on ***** tweet ***** volume. | ||
| C16-1129 As time passes words can acquire meanings they did not previously have, such as the `twitter post' usage of `***** tweet *****'. | ||
| 2020.wnut-1.56 We attempted a few techniques, and we briefly explain here two models that showed promising results in ***** tweet ***** classification tasks: DistilBERT and FastText. | ||
| S19-2120 We used a pre-trained Word Embeddings in ***** tweet ***** data, including information about emojis and hashtags | ||
| semantic relatedness | 71 | |
| 2021.louhi-1.6 Moreover, we show how they can be used out-of-the-box for improved unsupervised detection of hypernyms, while retaining robust performance on various ***** semantic relatedness ***** benchmarks. | ||
| 2021.gwc-1.7 The algorithms not only ruled out most cases of homonymy but also were efficacious in distinguishing between closer and indirect ***** semantic relatedness *****. | ||
| 2020.coling-main.301 Experimental results obtained from standard ***** semantic relatedness ***** and semantic similarity tasks show that our methods outperform various state-of-the-art baselines for word representation refinement. | ||
| R19-1061 We show that the local ***** semantic relatedness ***** is mostly sufficient to successfully identify correct senses when an extensive knowledge base and a proper weighting scheme are used. | ||
| 2021.eacl-main.172 We hypothesize we can combine these datasets according to the ***** semantic relatedness ***** between the relation types to overcome the problem of lack of training data | ||
| logical | 71 | |
| 2020.nl4xai-1.9 While the problem of natural language generation from ***** logical ***** formulas has a long tradition, thus far little attention has been paid to ensuring that the generated explanations are optimally effective for the user. | ||
| 2021.deelio-1.4 Given a dialog history context, our model first builds knowledge graphs from the context as an imitation of human's ability to form ***** logical ***** relationships between known and unknown topics during a conversation. | ||
| 2020.findings-emnlp.190 In this work, we formulate high-fidelity NLG as generation from ***** logical ***** forms in order to obtain controllable and faithful generations. | ||
| 2021.eacl-main.72 LASAGNE uses a transformer model for generating the base ***** logical ***** forms, while the Graph Attention model is used to exploit correlations between (entity) types and predicates to produce node representations | ||
| W17-7902 Market pressure on translation productivity joined with techno***** logical ***** innovation is likely to fragment and decontextualise translation jobs even more than is cur-rently the case. | ||
| continuous speech recognition | 71 | |
| L06-1014 In this paper building statistical language models for Persian language using a corpus and incorporating them in Persian ***** continuous speech recognition ***** (CSR) system are described. | ||
| L08-1506 Many researches including large vocabulary ***** continuous speech recognition ***** and extraction of important sentences against lecture contents are necessary in order to realize the above system. | ||
| L12-1573 Generally the existing monolingual corpora are not suitable for large vocabulary ***** continuous speech recognition ***** (LVCSR) of code-switching speech. | ||
| 2013.iwslt-evaluation.15 In this paper, German and English large vocabulary ***** continuous speech recognition ***** (LVCSR) systems developed by the RWTH Aachen University for the IWSLT-2013 evaluation campaign are presented. | ||
| 1991.iwpt-1.16 Current *****continuous speech recognition***** systems essentially ignore unknown words. | ||
| question answering system | 71 | |
| C18-1178 As a result, it is difficult to build “real-world” ***** question answering system *****s that are operationally deployable. | ||
| D19-5309 While automated ***** question answering system *****s are increasingly able to retrieve answers to natural language questions, their ability to generate detailed human-readable explanations for their answers is still quite limited. | ||
| W16-4404 There are some open domain ***** question answering system *****s, such as IBM Waston, which take the unstructured text data as input, in some ways of humanlike thinking process and a mode of artificial intelligence. | ||
| L06-1172 Therefore, the performance evaluation of each of these components is of great importance in order to check their impact in the global performance, and to conclude whether these components are necessary, need to be improved or substituted. This paper describes some experiments performed in order to evaluate several components of the ***** question answering system ***** Esfinge. We describe the experimental set up and present the results of error analysis based on runtime logs of Esfinge. | ||
| L08-1394 This paper presents the sequential evaluation of the *****question answering system***** SQuaLIA. | ||
| understanding | 71 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual natural language ***** understanding ***** tasks. | ||
| C18-1105 In this paper, we study the problem of data augmentation for language ***** understanding ***** in task-oriented dialogue system. | ||
| 2021.eacl-main.133 Although progresses have been achieved, existing methods are heuristically motivated and theoretical ***** understanding ***** of such embeddings is comparatively underdeveloped. | ||
| 2020.nlpcss-1.14 This limits their use for ***** understanding ***** the dynamics, patterns and prevalence of online abuse. | ||
| L08-1019 Computational models can help us to shed new light on the real structure of event type classes as well as to gain a better ***** understanding ***** of context-driven semantic shifts. | ||
| variance | 70 | |
| 2020.acl-main.615 We also find that ***** variance ***** in model performance can be explained largely by the resulting entropy of the model. | ||
| 2020.emnlp-main.257 We present a series of experiments across diverse languages which show that ***** variance ***** in performance across language pairs is not only due to typological differences, but can mostly be attributed to the size of the monolingual resources available, and to the properties and duration of monolingual training (e.g. “under-training”). | ||
| L10-1302 This paper gives an overview of an interdisciplinary research project that is concerned with the application of computational linguistics methods to the analysis of the structure and ***** variance ***** of rituals, as investigated in ritual science. | ||
| 2021.adaptnlp-1.23 Domain ***** variance ***** increases from the lower to the upper layers for vanilla PLMs; ii) Models continuously pretrained on domain-specific data (DAPT)(Gururangan et al., 2020) exhibit more ***** variance ***** than their pretrained PLM counterparts; and that iii) Distilled models (e.g., DistilBERT) also show greater domain ***** variance *****. | ||
| L16-1272 An error analysis indicates that, while there is a strong relationship between lexical choices and strength labels, there can be substantial ***** variance ***** in the choices made by different authors | ||
| 2015 | 70 | |
| C18-1199 We use a multimodal version of the SNLI dataset (Bowman et al., ***** 2015 *****) and we compare “blind” and visually-augmented models of textual entailment. | ||
| C16-1212 In particular, the recently proposed attention models (Rocktäschel et al., ***** 2015 *****; Wang and Jiang, ***** 2015 *****) achieves state-of-the-art accuracy by computing soft word alignments between the premise and hypothesis sentences. | ||
| 2020.findings-emnlp.264 A common remedy to this is knowledge distillation (Hinton et al., ***** 2015 *****), leading to faster inference. | ||
| 2016.gwc-1.19 While gender identities in the Western world are typically regarded as binary, our previous work (Hicks et al., ***** 2015 *****) shows that there is more lexical variety of gender identity and the way people identify their gender. | ||
| W17-5308 Moreover, they achieve the new state-of-the-art encoding result on the original SNLI dataset (Bowman et al., ***** 2015 *****) | ||
| disfluencies | 70 | |
| L08-1014 We detail our choice of overlapping tags and our definition of ***** disfluencies *****; the observed ratios of the different overlapping tags are examined, as well as their correlation with the speaker role and propose two measures to characterise speakers interacting attitude: the attack/resist ratio and the attack density. | ||
| 2020.lrec-1.441 The audio data is read speech and thus low in ***** disfluencies *****. | ||
| L14-1372 Using confidence scores, it was possible to reject the majority of mis-aligned segments resulting in alignment accuracy of 99.0-99.8% depending on the speech domain and the amount of ***** disfluencies *****. | ||
| L14-1017 The main focus of the revision process consisted on annotating and revising structural metadata events, such as ***** disfluencies ***** and punctuation marks. | ||
| 2021.eacl-main.150 The rising popularity of voice assistants presents a growing need to handle naturally occurring ***** disfluencies ***** | ||
| incrementally | 70 | |
| P19-1191 We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) ***** incrementally ***** writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper. | ||
| P17-1086 To bridge this gap, we start with a core programming language and allow users to “naturalize” the core language ***** incrementally ***** by defining alternative, more natural syntax and increasingly complex concepts in terms of compositions of simpler ones. | ||
| N19-2001 We propose an approach that ***** incrementally ***** builds a subset vocabulary from the word lattice. | ||
| L06-1061 The engine ***** incrementally ***** extracts high co-occurring named entities with the seed by using a common search engine. | ||
| 2020.findings-emnlp.346 However, these efforts still suffer from two types of latencies: (a) the computational latency (synthesizing time), which grows linearly with the sentence length, and (b) the input latency in scenarios where the input text is *****incrementally***** available (such as in simultaneous translation, dialog generation, and assistive technologies). | ||
| paradigms | 70 | |
| Q19-1021 Our measurements are taken on large morphological ***** paradigms ***** from 36 typologically diverse languages. | ||
| L06-1140 The MCL when applied to a graph of word associations has the effect of producing concept areas in which words are grouped into the similar topics or similar meanings as ***** paradigms *****. | ||
| 2021.sigmorphon-1.11 Our system generates ***** paradigms ***** using morphological transformation rules which are discovered from raw data. | ||
| 2021.dravidianlangtech-1.3 In this paper, we explored the zero-shot learning and few-shot learning ***** paradigms ***** based on multilingual language models for offensive speech detection in code-mixed and romanized variants of three Dravidian languages - Malayalam, Tamil, and Kannada. | ||
| 2012.freeopmt-1.4 Previous work on an interactive system aimed at helping non-expert users to enlarge the monolingual dictionaries of rule-based machine translation (MT) systems worked by discarding those inflection ***** paradigms ***** that cannot generate a set of inflected word forms validated by the user | ||
| convolution | 70 | |
| C18-2023 We have leveraged the use of deep ***** convolution ***** recurrent neural network model to analyze crime articles to extract different crime related entities and events. | ||
| W19-1307 We propose ***** convolution ***** neural network (CNN) and bidirectional long-short term memory (biLSTM) (with and without Attention) models which take the generated bilingual embeddings as input. | ||
| 2021.rocling-1.16 We proposed an RCRNN-based SED system with residual connection and ***** convolution ***** block attention mechanism based on the mean-teacher framework of semi-supervised learning. | ||
| D18-1109 A ***** convolution ***** filter is typically implemented as a linear affine transformation followed by a non-linear function, which fails to account for language compositionality. | ||
| 2020.semeval-1.167 In this paper we describe our submission to the Sentimix Hindi-English task involving sentiment classification of code-mixed texts, and with an F1 score of 67.1%, we demonstrate that simple ***** convolution ***** and attention may well produce reasonable results | ||
| aligned | 70 | |
| L08-1576 In particular, synchronization is done on the basis of ***** aligned ***** anchor points. | ||
| L12-1345 RTs are classified according to whether or not there is a particular object or proposition in the speaker's turn for which the listener shows a positive or ***** aligned ***** stance. | ||
| 2020.lrec-1.338 Minimally, corpora from language documentation contain a transcription tier and an ***** aligned ***** translation tier, which means they constitute parallel corpora. | ||
| N19-1045 The problem of learning to translate between two vector spaces given a set of ***** aligned ***** points arises in several application areas of NLP. | ||
| W17-3526 We conduct our study within the framework of encoder-decoder networks, and we propose a hierarchical structure with ***** aligned ***** attention in the Long-Short Term Memory (LSTM) decoder | ||
| Humor | 70 | |
| S17-2063 This paper presents our submission to SemEval-2017 Task 6: #HashtagWars: Learning a Sense of ***** Humor *****. | ||
| P18-2093 ***** Humor ***** is one of the most attractive parts in human communication | ||
| 2021.acl-short.6 *****Humor***** recognition has been widely studied as a text classification problem using data-driven approaches. | ||
| 2020.semeval-1.106 This paper describes our system that was designed for *****Humor***** evaluation within the SemEval-2020 Task 7. | ||
| C18-1159 *****Humor***** recognition is an interesting and challenging task in natural language processing. | ||
| syntactic annotation | 70 | |
| L12-1162 As for resources, we describe the Uppsala PErsian Corpus (UPEC) which is a modified version of the Bijankhan corpus with additional sentence segmentation and consistent tokenization modified for more appropriate ***** syntactic annotation *****. | ||
| 2020.lrec-1.114 The layer of ***** syntactic annotation ***** forms the first nucleus of an Italian historical treebank complying with the Universal Dependencies standard. | ||
| L12-1397 This paper describes the ***** syntactic annotation ***** process of the DECODA corpus. | ||
| L12-1106 In the future, we are planning to carry out a ***** syntactic annotation ***** of the HunOr corpus, which will further enhance the usability of the corpus in various NLP fields such as transfer-based machine translation or cross lingual information retrieval. | ||
| L14-1596 This article presents the methods, results, and precision of the ***** syntactic annotation ***** process of the Rhapsodie Treebank of spoken French | ||
| neural semantic parsing | 70 | |
| W19-0504 We show that (i) linguistic features can be beneficial for ***** neural semantic parsing ***** and (ii) the best method of adding these features is by using multiple encoders. | ||
| 2020.emnlp-main.323 AM dependency parsing is a linguistically principled method for ***** neural semantic parsing ***** with high accuracy across multiple graphbanks. | ||
| 2020.findings-emnlp.255 In particular, we investigated our model for solving two problems, ***** neural semantic parsing ***** and math word problem. | ||
| 2020.emnlp-main.118 A thorough experimental study on Unimer reveals that ***** neural semantic parsing ***** approaches exhibit notably different performance when they are trained to generate different meaning representations. | ||
| 2020.spnlp-1.3 In this work, we explore the possibility of generating synthetic data for ***** neural semantic parsing ***** using a pretrained denoising sequence-to-sequence model (i.e., BART) | ||
| volume | 70 | |
| 2021.acl-long.245 Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large ***** volume *****s of real code-switched text. | ||
| P19-1387 Wikipedia can easily be justified as a behemoth, considering the sheer ***** volume ***** of content that is added or removed every minute to its several projects. | ||
| 2021.dash-1.2 Experimental results show that the proposed workflow boosts the performance of the NLU model while significantly reducing the annotation ***** volume *****. | ||
| 2021.naacl-main.159 Active Learning (AL) strategies reduce the need for huge ***** volume *****s of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. | ||
| C16-1330 We demonstrate that these properties, such as the ***** volume *****, provenance, and balancing, play an important role with respect to system performance. | ||
| offensive language detection | 70 | |
| 2020.semeval-1.287 As an emerging growth of social media communication, ***** offensive language detection ***** has received more attention in the last years; we focus to perform the task on English, Danish and Greek. | ||
| 2020.lrec-1.625 This way of representing words results in stable improvements in ***** offensive language detection *****, when used as the only features or in combination with words or character n-grams. | ||
| 2020.trac-1.9 This also holds for aggression identification and ***** offensive language detection *****, where deep learning approaches consistently outperform less complex models, such as decision trees. | ||
| S19-2137 We present a neural network based approach of transfer learning for ***** offensive language detection *****. | ||
| 2020.semeval-1.289 This paper deals with ***** offensive language detection ***** in five different languages; English, Arabic, Danish, Greek and Turkish. | ||
| linguistic features | 70 | |
| R19-1040 In this paper, we present new methods for language classification which put to good use both syntax and fuzzy tools, and are capable of dealing with irrelevant ***** linguistic features ***** (i.e. | ||
| 2001.mtsummit-papers.21 More importantly, the classifiers yield the basis for error-analysis by providing a ranking of the importance of ***** linguistic features *****. | ||
| D18-1125 By incorporating a set of external ***** linguistic features *****, our approach outperforms the state-of-the-art by 1.7% absolute F1 gain. | ||
| W19-4519 In a study of users of a popular debate platform, we find first that different combinations of ***** linguistic features ***** are critical for predicting persuasion outcomes for UNDECIDED versus DECIDED members of the audience. | ||
| W18-0540 We developed solutions following three approaches: (i) a feature engineering method using lexical, n-gram and psycho***** linguistic features *****, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large text corpus to produce a contextualized word vector. | ||
| gold standard | 70 | |
| L16-1580 This paper introduces a ***** gold standard ***** that can aid in this task. | ||
| L12-1492 The ***** gold standard ***** corpora are utilised to benchmark the methods used in the silver standard corpora generation process and released in a shared format. | ||
| 2021.rocling-1.51 All data sets with ***** gold standard *****s and scoring script are made publicly available to researchers. | ||
| L12-1307 In this paper we evaluate the impact of the phrase recognition step on the ability of the system to correctly reproduce the annotations of a ***** gold standard ***** in an unsupervised setting. | ||
| L04-1155 With the annotation tool presented here, a set of “***** gold standard *****s” can be collected, representing what should be extracted. | ||
| image captioning | 70 | |
| 2020.coling-main.210 While novel metrics are proposed every year, a few popular metrics remain as the de facto metrics to evaluate tasks such as ***** image captioning ***** and machine translation, despite their known limitations. | ||
| E17-1019 Moreover, we explore the utilization of the recently proposed Word Mover's Distance (WMD) document metric for the purpose of ***** image captioning *****. | ||
| D19-3043 machine translation, text summarization, ***** image captioning ***** and video description) usually relies heavily on task-specific metrics, such as BLEU and ROUGE. | ||
| N19-1315 In our experiments on six machine translation and two ***** image captioning ***** datasets, our method achieves faster reinforcement learning (~2.7x faster) with less GPU memory (~2.3x less) than the full-vocabulary counterpart. | ||
| N18-1198 We provide an in-depth analysis of end-to-end ***** image captioning ***** by exploring a variety of cues that can be derived from such object detections. | ||
| open source | 70 | |
| L16-1612 The toolkit is ***** open source *****, includes working examples and can be found on http://github.com/jorispelemans/scale. | ||
| L16-1711 This paper introduces an ***** open source *****, interoperable generic software tool set catering for the entire workflow of creation, migration, annotation, query and analysis of multi-layer linguistic corpora. | ||
| 2020.lt4gov-1.6 Legal-ES is an ***** open source ***** resource kit for legal Spanish. | ||
| L10-1592 We also report on our plans for making our custom-built software resources available to the community as ***** open source ***** software, and introduce an initiative to collaborate with software developers outside LDC. | ||
| L12-1160 We present a complex, ***** open source ***** tool for detailed machine translation error analysis providing the user with automatic error detection and classification, several monolingual alignment algorithms as well as with training and test corpus browsing. | ||
| aggression identification | 70 | |
| 2020.alw-1.10 We also develop computational models to incorporate emotions into textual cues to improve ***** aggression identification *****. | ||
| 2020.trac-1.9 This also holds for ***** aggression identification ***** and offensive language detection, where deep learning approaches consistently outperform less complex models, such as decision trees. | ||
| 2020.trac-1.8 The shared task was further divided into two sub-tasks: (a) ***** aggression identification ***** and (b) misogynistic ***** aggression identification *****. | ||
| 2020.trac-1.1 The task consisted of two sub-tasks - ***** aggression identification ***** (sub-task A) and gendered identification (sub-task B) - in three languages - Bangla, Hindi and English. | ||
| W18-4407 In this paper we label ***** aggression identification ***** into three categories: Overtly Aggressive, Covertly Aggressive and Non-aggressive. | ||
| for | 70 | |
| 2021.naacl-industry.15 Since Transformer models are huge in size, serving these models is a challenge *****for***** real industrial applications. | ||
| W17-4210 We emphasise that this phenomenon should be considered separately from recognised problematic headline types such as clickbait and sensationalism, arguing that existing natural language processing (NLP) methods applied to these related concepts are not appropriate *****for***** the automatic detection of headline incongruence, as an analysis beyond stylistic traits is necessary. | ||
| L06-1294 Our basic technique is to extract relationships between terms using the Ohsumed corpus, a large collection of abstracts from PubMed, and to compare the relationships extracted with those that would be expected *****for***** medical terms, given the structure of the WordNet ontology. | ||
| W17-5911 ESL learners are familiar with web search engines, but generic web search results may not be adequate *****for***** composing documents in a specific domain. | ||
| 2021.ranlp-srw.20 Because of their growing popularity, certain online forums have been created specifically to provide support, assistance, and opinions *****for***** people suffering from mental illness. | ||
| hashtags | 69 | |
| W17-3008 We extract a list of obscene words and ***** hashtags ***** using common patterns used in offensive and rude communications. | ||
| 2021.emnlp-main.616 Millions of ***** hashtags ***** are created on social media every day to cross-refer messages concerning similar topics. | ||
| C16-1284 Microblogging services allow users to create ***** hashtags ***** to categorize their posts. | ||
| S19-2120 We used a pre-trained Word Embeddings in tweet data, including information about emojis and ***** hashtags *****. | ||
| L14-1088 Our approach suggests that ***** hashtags ***** can be used to understand, not just the language of topics, but the deeper psychological and social meaning of a tweet | ||
| computed | 69 | |
| 2020.lrec-1.416 A frequency dictionary provides much sought after information about word frequency statistics, ***** computed ***** for each subcorpus as well as aggregate, disambiguating homographs based on their respective lemmas and morphosyntactic tags. | ||
| E17-4005 The list of parameters ***** computed ***** using the software was expanded due to the designed users' dictionaries. | ||
| 2021.alta-1.4 Then, we investigate the effects of sociopolitical variables on the ***** computed ***** bias series, such as the outgroup size in the host country and the rate of the population receiving unemployment benefits. | ||
| 2021.deelio-1.5 We first compare the sentence-level likelihood ***** computed ***** with BERT and the GPT-2's perplexity showing that the two metrics are correlated. | ||
| 2020.acl-main.360 The stabilized lottery ticket hypothesis states that networks can be pruned after none or few training iterations, using a mask ***** computed ***** based on the unpruned converged model | ||
| orthography | 69 | |
| L10-1382 This task was challenging for several reasons, which are common to a number of lesser-used languages: although Venetan is widely used as an oral language in everyday life, its written usage is very limited; efforts for defining a standard ***** orthography ***** and grammar are very recent and not well established; despite recent attempts to propose a unified ***** orthography *****, no Venetan standard is widely used. | ||
| 2020.lrec-1.752 Predictably, performance was twice as good in tweets with standard ***** orthography ***** than in tweets with spelling/casing irregularities or lack of sentence separation, the effect being more marked for morphology than for syntax. | ||
| W19-3650 Due to this, we need to face the errors when digitizing the sources and difficulties in sentence alignment, as well as the fact that does not exist a standard ***** orthography *****. | ||
| W19-3506 We used several kinds of feature extractions which are term frequency, ***** orthography *****, and lexicon features. | ||
| L06-1420 In this paper we propose a methodology for the automatic detection of cognates between two languages based solely on the ***** orthography ***** of words | ||
| parsed | 69 | |
| L10-1466 These second cascade ***** parsed ***** the named entity annotated corpus. | ||
| L08-1608 We then evaluate the results of a definition extraction system that uses patterns identified in this survey to extract from dependency ***** parsed ***** text. | ||
| 2020.lrec-1.863 The Bulgarian data is web crawled, extracted from the original HTML format, filtered by document type, tokenised, sentence split, tagged and lemmatised with a fine-grained version of the Bulgarian Language Processing Chain, dependency ***** parsed ***** with NLP- Cube, annotated with named entities (persons, locations, organisations and others), noun phrases, IATE terms and EuroVoc descriptors. | ||
| D18-1327 This is achieved by employing separate encoders for the sequential and ***** parsed ***** versions of the same source sentence; the resulting representations are then combined using a hierarchical attention mechanism. | ||
| R19-1059 Our discourse-aware summarizer can jointly learn the discourse structure and the salience score of a sentence by using novel hierarchical attention modules, which can be trained on automatically ***** parsed ***** discourse dependency trees | ||
| imbalanced | 69 | |
| 2020.emnlp-main.737 We argue that the sub-optimal text generation is mainly attributable to the ***** imbalanced ***** token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. | ||
| R19-1022 precision, recall, accuracy), dataset-size, and ***** imbalanced ***** data (in terms of the distribution of the number of class labels). | ||
| 2021.acl-long.277 However, it also incurs two major problems: noisy labels and ***** imbalanced ***** training data. | ||
| W17-0704 This suggests that ***** imbalanced ***** training data may result in automatic speech recognition errors consistent with those of speakers from populations over-represented in the training data. | ||
| 2020.emnlp-main.657 Our experiments show that GenNLI outperforms both discriminative and pretrained baselines across several challenging NLI experimental settings, including small training sets, ***** imbalanced ***** label distributions, and label noise | ||
| querying | 69 | |
| C16-2049 The system operates continuously in ambient mode, i.e. it generates speech transcriptions and identifies main keywords and keyphrases, while also ***** querying ***** its index to display relevant documents without explicit query. | ||
| C16-1196 Query terms are ranked with Word2Vec and TF-IDF and are continuously updated to allow for ongoing ***** querying ***** of a document collection | ||
| P17-1086 Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and ***** querying ***** databases. | ||
| 2020.bionlp-1.2 Novel contexts, comprising a set of terms referring to one or more concepts, may often arise in complex ***** querying ***** scenarios such as in evidence-based medicine (EBM) involving biomedical literature. | ||
| L10-1260 This paper presents a system for ***** querying ***** treebanks in a uniform way. | ||
| retrieved | 69 | |
| 2021.emnlp-main.579 However, non-parametric methods are prone to overfit the ***** retrieved ***** examples. | ||
| Q18-1029 The probability distribution over generated words is updated online depending on the translation history ***** retrieved ***** from the memory, endowing NMT models with the capability to dynamically adapt over time. | ||
| L12-1247 Individual web queries are posed for a lexicon that includes thousands of nouns and the ***** retrieved ***** data are aggregated. | ||
| 2021.ranlp-1.174 Previous works evaluate the entailment step based on the ***** retrieved ***** evidence, whereas we hypothesize that the entailment prediction can provide useful signals for evidence retrieval, in the sense that if a sentence supports or refutes a claim, the sentence must be relevant | ||
| P19-1533 Retrieve-and-edit based approaches to structured prediction, where structures associated with ***** retrieved ***** neighbors are edited to form new structures, have recently attracted increased interest. | ||
| Gaussian | 69 | |
| N19-1411 The variational autoencoder (VAE) imposes a probabilistic distribution (typically ***** Gaussian *****) on the latent space and penalizes the Kullback-Leibler (KL) divergence between the posterior and prior. | ||
| 2021.acl-srw.6 In our experimental setup, the hidden states of the LSTM-based speaker and listener were added with ***** Gaussian ***** noise, while the channel was subject to discrete random replacement. | ||
| D19-1124 To address the issues, we propose to use the Dirichlet distribution with flexible structures to characterize the latent variables in place of the traditional ***** Gaussian ***** distribution, called Dirichlet Latent Variable Hierarchical Recurrent Encoder-Decoder model (Dir-VHRED). | ||
| N19-1015 Distinct from existing variational auto-encoder (VAE) based approaches, which assume a simple ***** Gaussian ***** prior for latent code, our model specifies the prior as a ***** Gaussian ***** mixture model (GMM) parametrized by a neural topic module. | ||
| 2020.acl-main.99 In particular, we model utterance embeddings with a ***** Gaussian ***** mixture distribution and inject dynamic class semantic information into ***** Gaussian ***** means, which enables learning more class-concentrated embeddings that help to facilitate downstream outlier detection | ||
| Question Answering | 69 | |
| L06-1495 This paper describes our methodology for creating AnswerTime-Bank, a large corpus of questions and answers on which ***** Question Answering ***** systems can operate using complex temporal inference. | ||
| L14-1160 We present JUST.ASK, a publicly available ***** Question Answering ***** system, which is freely available. | ||
| L06-1191 Although significant advances have been made recently in the ***** Question Answering ***** technology, more steps have to be undertaken in order to obtain better results. | ||
| L12-1244 This paper presents CINTIL-QATreebank, a treebank composed of Portuguese sentences that can be used to support the development of ***** Question Answering ***** systems. | ||
| L06-1440 A critical step in ***** Question Answering ***** design is the definition of the models for question focus identification and answer extraction. | ||
| workflow | 69 | |
| L14-1630 The CLARIN-DK interface must guide the user to perform the necessary steps of a ***** workflow *****; even when the user is inexperienced and perhaps has an unclear conception of the requested results. | ||
| 2020.lrec-1.527 Such a ***** workflow ***** will increase its value when grounded to real-world activities, and visual grounding is a way to do so. | ||
| 2021.triton-1.18 The development of Translation Technologies, like Translation Memory and Machine Translation, has completely changed the translation industry and translator's ***** workflow ***** in the last decades. | ||
| W16-4014 In the Danish CLARIN-DK infrastructure, chaining language technology (LT) tools into a ***** workflow ***** is easy even for a non-expert user, because she only needs to specify the input and the desired output of the ***** workflow *****. | ||
| 2020.acl-demos.13 In this paper, we present BENTO, a ***** workflow ***** management platform with a graphic user interface (GUI) that is built on top of CodaLab, to facilitate the process of building clinical NLP pipelines | ||
| commonsense knowledge | 69 | |
| 2021.insights-1.2 For this, we incorporate ***** commonsense knowledge ***** into the prediction process using a graph convolution network with pre-trained language model embeddings as input. | ||
| P19-1193 However, this ***** commonsense knowledge ***** provides additional background information, which can help to generate essays that are more novel and diverse. | ||
| D17-1216 Reasoning with ***** commonsense knowledge ***** is critical for natural language understanding. | ||
| 2019.gwc-1.25 Our final objective is the extraction of some guidelines towards a better exploitation of this ***** commonsense knowledge ***** framework by the improvement of the included resources. | ||
| S18-1120 To incorporate ***** commonsense knowledge *****, we augment the input with relation embedding from the graph of general knowledge ConceptNet. | ||
| target | 69 | |
| 2020.fnp-1.1 FNS summarisation shared task is the first to ***** target ***** financial annual reports. | ||
| P19-2006 However, several studies strived to overcome divergences in the annotations between English AMRs and those of their ***** target ***** languages by refining the annotation specification. | ||
| 2020.semeval-1.30 It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and orthogonal transformation; and measuring the cosines between the transformed vector for the ***** target ***** word from the earlier corpus and the vector for the ***** target ***** word in the later corpus. | ||
| D18-1270 This enables our approach to: (a) augment the limited supervision in the ***** target ***** language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. | ||
| W19-5409 More specifically, one of the proposed approaches employs the translation knowledge between the two languages from two different translation directions; while the other one employs extra monolingual knowledge from both source and ***** target ***** sides, obtained by pre-training deep self-attention networks. | ||
| supervised learning | 69 | |
| 2020.findings-emnlp.98 In the proposed study, we make the first attempt to train the video captioning model on labeled data and unlabeled data jointly, in a semi-***** supervised learning ***** manner. | ||
| P17-1029 In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-***** supervised learning *****. | ||
| 2021.eacl-main.124 Inspired by recent progress in unpaired sequence-to-sequence tasks, a self-***** supervised learning ***** model is introduced, called CAE-T5. | ||
| 2021.dialdoc-1.10 We simulate the dialogue between an agent and a user (modelled similar to an agent with ***** supervised learning ***** objective) to interact with each other. | ||
| D19-1408 Existing methods based on ***** supervised learning ***** require a large amount of well-labelled training data, which is difficult to obtain due to inconsistent perception of fine-grained emotion intensity. | ||
| AMR parsing | 69 | |
| E17-1035 However, they have not fulfilled their promise on the ***** AMR parsing ***** task due to the data sparsity issue. | ||
| P18-1171 We evaluate our neural transition model on the ***** AMR parsing ***** task, and our parser outperforms other sequence-to-sequence approaches and achieves competitive results in comparison with the best-performing models. | ||
| R19-1014 Our system achieves a semantic triple (Smatch) precision that is competitive with other CCG-based ***** AMR parsing ***** approaches. | ||
| 2021.iwpt-1.5 This paper presents a novel approach to ***** AMR parsing ***** by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention, and use only attention matrices from the transformer to predict all elements in AMR graphs (concepts, arcs, labels). | ||
| D17-3006 We will also discuss how to make use of the framework to build other related models such as topic models and highlight its potential applications in some recent popular tasks (e.g., ***** AMR parsing ***** (Flanigan et al., 2014)). The framework has been extensively used by our research group for developing various structured prediction models, including models for information extraction (Lu and Roth, 2015; | ||
| summarizing | 68 | |
| 2020.fnp-1.23 This task focuses on ***** summarizing ***** annual financial reports which poses two main challenges as compared to typical news document summarization tasks: i) annual reports are more lengthier (average length about 80 pages) as compared to typical news documents, and ii) annual reports are more loosely structured e.g. comprising of tables, charts, textual data and images, which makes it challenging to effectively summarize. | ||
| 2020.fnp-1.21 For example, ***** summarizing ***** a news article is very different from ***** summarizing ***** a financial earnings report. | ||
| D19-1616 The synthetic data is effective for the low resource condition and is particularly helpful for our multilingual scenario where availability of ***** summarizing ***** data is still a challenging issue. | ||
| 2020.emnlp-main.336 Although much attention has been paid to ***** summarizing ***** structured text like news reports or encyclopedia articles, ***** summarizing ***** conversations—an essential part of human-human/machine interaction where most important pieces of information are scattered across various utterances of different speakers—remains relatively under-investigated. | ||
| 2021.repl4nlp-1.19 The remarkable success of word embeddings for this purpose suggests that high-quality representations can be obtained by ***** summarizing ***** the sentence contexts of word mentions | ||
| PubMed | 68 | |
| L06-1294 Our basic technique is to extract relationships between terms using the Ohsumed corpus, a large collection of abstracts from ***** PubMed *****, and to compare the relationships extracted with those that would be expected for medical terms, given the structure of the WordNet ontology. | ||
| 2021.bionlp-1.27 Using this probabilistic transformation of BM25 scores we show an improved performance on the ***** PubMed ***** Click dataset developed and presented in this study, as well as the 2007 TREC Genomics collection. | ||
| C16-1104 More than 50 thousand scientific publications in ***** PubMed ***** lack author-generated abstracts, and the relevancy judgements for these papers have to be based on their titles alone. | ||
| W18-5306 There are millions of articles in ***** PubMed ***** database. | ||
| D19-1259 We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from ***** PubMed ***** abstracts. | ||
| newswire | 68 | |
| 2020.nuse-1.8 Previous work on event extraction focused on ***** newswire *****, however we are interested in extracting events from spoken dialogue. | ||
| W16-4801 The challenge offered two subtasks: subtask 1 focused on the identification of very similar languages and language varieties in ***** newswire ***** texts, whereas subtask 2 dealt with Arabic dialect identification in speech transcripts. | ||
| L14-1382 Results on the French TimeBank are quite satisfaying as they are comparable to those obtained by HeidelTime in English and Spanish on ***** newswire ***** articles. | ||
| Q15-1011 Combining web link and Wikipedia models produces the best-known disambiguation accuracy of 88.7 on standard ***** newswire ***** test data | ||
| D18-1226 Recent research efforts have shown that neural architectures can be effective in conventional information extraction tasks such as named entity recognition, yielding state-of-the-art results on standard ***** newswire ***** datasets. | ||
| edit | 68 | |
| L16-1206 This fine-grained taxonomy of ***** edit ***** types enables us to differentiate ***** edit *****ing actions and find ***** edit *****or roles in Wikipedia based on their low-level ***** edit ***** types. | ||
| 2021.cmcl-1.27 These networks represent each word as a node and links are placed between words which are phonological neighbours, usually defined as a string ***** edit ***** distance of one. | ||
| L10-1622 In particular, we investigate the applicability of a DBN framework initially proposed by Filali and Bilmes (2005) to learn ***** edit ***** distance estimation parameters for use in pronunciation classification. | ||
| C18-1008 We present a neural transition-based model that uses a simple set of ***** edit ***** actions (copy, delete, insert) for morphological transduction tasks such as inflection generation, lemmatization, and reinflection. | ||
| P18-2062 Recent embedding-based methods in bilingual lexicon induction show good results, but do not take advantage of orthographic features, such as ***** edit ***** distance, which can be helpful for pairs of related languages. | ||
| softmax | 68 | |
| 2021.mtsummit-research.10 Neural machine translation (NMT) models are typically trained using a ***** softmax ***** cross-entropy loss where the ***** softmax ***** distribution is compared against the gold labels. | ||
| 2021.emnlp-main.246 Starting from the top ***** softmax ***** layer, layer-wise pruning proceeds in a top-down fashion until reaching the bottom word embedding layer. | ||
| W18-4930 TRAPACCS extends TRAPACC by replacing the ***** softmax ***** layer of the CNN with a support vector machine (SVM). | ||
| W18-2714 We improve the performance with an efficient mini-batching algorithm, and by fusing the ***** softmax ***** operation with the k-best extraction algorithm. | ||
| P18-2050 In this paper, we propose a simple and parameter-efficient adaptation technique that only requires adapting the bias of the output ***** softmax ***** to each particular user of the MT system, either directly or through a factored approximation | ||
| hypernym | 68 | |
| 2020.findings-emnlp.246 Although it has been shown that the Distributional Informativeness Hypothesis (DIH) holds on text, in which the DIH assumes that a context surrounding a hyponym is more informative than that of a ***** hypernym *****, it has never been tested on visual objects. | ||
| D19-5317 Hierarchy construction methods heavily rely on ***** hypernym ***** detection, however, the faceted relations are parent-to-child links but the ***** hypernym ***** relation is a multi-hop, i.e., ancestor-to-descendent link with a specific facet “type-of”. | ||
| 2020.alvr-1.1 By leveraging the parent-child structure of synsets in ImageNet, this dataset is extended to 10,462 synsets (and 7.1 million images) that have an Arabic label, which is either a match or a direct ***** hypernym *****, and to 17,438 synsets (and 11 million images) when a ***** hypernym ***** of a ***** hypernym ***** is included. | ||
| S18-1116 This report describes the system developed by the CRIM team for the ***** hypernym ***** discovery task at SemEval 2018. | ||
| P17-1128 Experiments on real-world datasets illustrate that our approach outperforms previous methods for Chinese ***** hypernym ***** prediction | ||
| formalisms | 68 | |
| L16-1565 A powerful query language, INESS Search, has been developed for search across ***** formalisms ***** in the INESS treebanks, including LFG c- and f-structures. | ||
| L10-1487 We first describe Alexina, the lexical framework in which the Lefff is developed as well as the linguistic notions and ***** formalisms ***** it is based on. | ||
| P17-1186 We then explore two multitask learning approaches—one that shares parameters across ***** formalisms *****, and one that uses higher-order structures to predict the graphs jointly. | ||
| L06-1137 In section 1, we introduce ***** formalisms ***** underlying PDT 2.0 and MultiNet, in section 2. | ||
| 2020.iwpt-1.10 To combine these two, we consider the approach of supertagging that requires lexicalized grammar ***** formalisms ***** | ||
| thesaurus | 68 | |
| L12-1633 For these seeds similar terms are extracted from the corpus using known ***** thesaurus ***** generation methods. | ||
| L12-1297 We conclude the paper by showing some experimental results to validate our method and by presenting our methodology of automatic ***** thesaurus ***** construction. | ||
| C16-1173 Following the work of Claveau and Kijak (2016), we use IR as an applicative framework to indirectly evaluate the generated ***** thesaurus *****. | ||
| 2019.gwc-1.28 This way, students can use wordnet as dictionary or ***** thesaurus ***** when writing specialised texts. | ||
| W16-4912 For the similarity of words, we use a Japanese ***** thesaurus ***** and dependency-based word embeddings | ||
| anaphoric | 68 | |
| 2017.iwslt-1.1 The Dialogue task, which calls for the integration of context information in machine translation, in order to resolve ***** anaphoric ***** references that typically occur in human-human dialogue turns. | ||
| L10-1295 The Live Memories corpus is an Italian corpus annotated for ***** anaphoric ***** relations. | ||
| 2021.codi-sharedtask.1 Using five conversational datasets, four of which have been newly annotated with a wide range of ***** anaphoric ***** relations: identity, bridging references and discourse deixis, we defined multiple subtasks focusing individually on these key relations. | ||
| L14-1245 However, erroneous ***** anaphoric ***** references (pronouns) were not always detected by the participants which poses a problem for automatic text summarizers. | ||
| 2016.lilt-14.2 One consequence is that the ***** anaphoric ***** potential of indefinites may extend beyond the standard limits of accessibility constraints | ||
| annotated dataset | 68 | |
| D19-1339 Our analysis provides a study of biases in NLG, bias metrics and correlated human judgments, and empirical evidence on the usefulness of our ***** annotated dataset *****. | ||
| 2020.emnlp-main.320 We conduct experiments on Propaganda Techniques Corpus, a large manually ***** annotated dataset ***** for fine-grained propaganda detection. | ||
| W19-4514 The ***** annotated dataset ***** that we produce provides the important knowledge needed for our ultimate goal of analyzing biochemistry articles. | ||
| 2021.emnlp-main.239 A number of GEC metrics have been used to evaluate proposed GEC systems; however, each system relies on either a comparison with one or more reference texts—in what is known as the gold standard for reference-based metrics—or a separate ***** annotated dataset ***** to fine-tune the reference-less metric. | ||
| 2021.emnlp-main.787 In doing so, we construct a human-labeled dataset of 4,721 bill-to-bill relationships at the subsection-level and release this ***** annotated dataset ***** to the research community | ||
| spans | 68 | |
| 2020.emnlp-main.396 We find, e.g., that span frequency is especially important for LSTMs, and that CRFs help when ***** spans ***** are infrequent and boundaries non-distinctive. | ||
| W17-6307 This paper applies parsing technology to the task of syntactic simplification of English sentences, focusing on the identification of text ***** spans ***** that can be removed from a complex sentence. | ||
| 2020.acl-demos.32 Prta (Propaganda Persuasion Techniques Analyzer) allows users to explore the articles crawled on a regular basis by highlighting the ***** spans ***** in which propaganda techniques occur and to compare them on the basis of their use of propaganda techniques. | ||
| 2020.codi-1.4 In addition, we find that the head-finding attention mechanism involved in creating the ***** spans ***** is crucial in encoding coreference knowledge. | ||
| E17-1087 We aim one step further and propose a method for textual language identification where languages can change arbitrarily and the goal is to identify the ***** spans ***** of each of the languages | ||
| semantic textual | 68 | |
| 2021.emnlp-main.309 As the experimental results show, the sentence representations produced by our model achieve the new state-of-the-art on several tasks, including Tatoeba en-zh similarity search (Artetxe and Schwenk, 2019b), BUCC en-zh bitext mining, and ***** semantic textual ***** similarity on 7 datasets. | ||
| 2021.gem-1.3 To achieve this, we translated the English STSb dataset into Turkish and presented the first ***** semantic textual ***** similarity dataset for Turkish as well. | ||
| D17-1303 We evaluate our models on image-description ranking for German and English, and on ***** semantic textual ***** similarity of image descriptions in English. | ||
| 2021.naacl-industry.24 Since the freely available pre-trained models are too large to be deployed in this environment, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and ***** semantic textual ***** similarity. | ||
| S17-2007 This paper presents three systems for ***** semantic textual ***** similarity (STS) evaluation at SemEval-2017 STS task | ||
| language acquisition | 68 | |
| Q13-1026 In the context of ***** language acquisition *****, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). | ||
| 2020.acl-main.684 This is an interesting example of pragmatic ***** language acquisition ***** without any linguistic annotation. | ||
| 2021.cmcl-1.24 We evaluate learning using a series of tasks inspired by methods commonly used in laboratory studies of ***** language acquisition *****. | ||
| L10-1520 Collocations play a significant role in second ***** language acquisition *****. | ||
| N18-4009 The outcome of the computational task is connected to a position in second ***** language acquisition ***** research that holds all learners acquire English grammatical morphemes in the same order, regardless of native language background. | ||
| discourse structure | 68 | |
| 2019.icon-1.3 The resultant ***** discourse structure ***** of Thirukkural can be indexed and further be used by Summary Generation Systems, IR Systems and QA Systems. | ||
| P18-2071 Different from widely-used RST-DT and PDTB, SciDTB uses dependency trees to represent ***** discourse structure *****, which is flexible and simplified to some extent but do not sacrifice structural integrity. | ||
| 2020.autosimtrans-1.5 Specifically, we first parse the input document to obtain its ***** discourse structure *****. | ||
| 2021.acl-long.499 We conjecture that this is because of the difficulty for the decoder to capture the high-level semantics and ***** discourse structure *****s in the context beyond token-level co-occurrence. | ||
| D19-1235 We propose a novel approach that uses distant supervision on an auxiliary task (sentiment classification), to generate abundant data for RST-style ***** discourse structure ***** prediction. | ||
| identification | 68 | |
| 2020.wanlp-1.32 In this paper, several techniques with multiple algorithms are applied for Arabic dialects ***** identification ***** starting from removing noise till classification task using all Arabic countries as 21 classes. | ||
| W19-2507 We develop a stylometric feature set for ancient Greek that enables ***** identification ***** of texts as prose or verse. | ||
| W19-0506 At the same time, the studies reveal empirical evidence why contextual abstractness represents a valuable indicator for automatic non-literal language ***** identification *****. | ||
| P17-1115 Accurate ***** identification ***** and interpretation of metonymy can be directly beneficial to various NLP applications, such as Named Entity Recognition and Geographical Parsing. | ||
| 2020.lrec-1.550 Our focus is directed at the de-***** identification ***** of emails where personally identifying information does not only refer to the sender but also to those people, locations, dates, and other identifiers mentioned in greetings, boilerplates and the content-carrying body of emails. | ||
| text mining | 68 | |
| 2021.acl-long.507 We show that margin-based bi***** text mining ***** in a multilingual sentence space can be successfully scaled to operate on monolingual corpora of billions of sentences. | ||
| L16-1003 Unlike previous approaches to machine translation, the output quality in TraMOOC relies on a multimodal evaluation schema that involves crowdsourcing, error type markup, an error taxonomy for translation model comparison, and implicit evaluation via ***** text mining *****, i.e. | ||
| E17-1109 Entity extraction is one of the fundamental components for biomedical ***** text mining *****. | ||
| W17-2322 Relation extraction methods are essential for creating robust ***** text mining ***** tools to help researchers find useful knowledge in the vast published literature. | ||
| R17-1002 In the current context of scientific information overload, ***** text mining ***** tools are of paramount importance for researchers who have to read scientific papers and assess their value. | ||
| problem | 68 | |
| D19-1212 Multi-view learning algorithms are powerful representation learning tools, often exploited in the context of multimodal ***** problem *****s. | ||
| 2020.wanlp-1.16 Our system is developed for the Fairseq framework, which allows for a fast and easy use for any other sequence prediction ***** problem *****. | ||
| D19-6008 This paper explores the use of Bidirectional Encoder Representations from Transformers(BERT) along with external relational knowledge from ConceptNet to tackle the ***** problem ***** of commonsense inference. | ||
| 2020.aacl-main.29 To resolve the cold start ***** problem ***** in training, we propose a method using a pseudo data generator which generates pseudo texts and KB triples for learning an initial model. | ||
| P19-1182 This important ***** problem ***** has not been explored mostly due to lack of datasets and effective models. | ||
| low resource | 68 | |
| 2020.repl4nlp-1.16 We find that better models for ***** low resource ***** languages require more efficient pretraining techniques or more data. | ||
| 2020.loresmt-1.5 Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) organized shared tasks of ***** low resource ***** language pair translation using zero-shot NMT. | ||
| P18-5008 We will also present EL methods that work for both name tagging and linking in very ***** low resource ***** languages. | ||
| 2020.wmt-1.127 We present our submission to the very ***** low resource ***** supervised machine translation task at WMT20. | ||
| N18-3018 We demonstrate that our proposed methods significantly increase accuracy in ***** low resource ***** settings and enable rapid development of accurate models with less data. | ||
| summarization evaluation | 68 | |
| 2020.acl-main.124 We study unsupervised multi-document ***** summarization evaluation ***** metrics, which require neither human-written reference summaries nor human annotations (e.g. | ||
| 2021.acl-long.34 Extensive experiments show that our methods can significantly outperform existing methods on both multi-document and single-document ***** summarization evaluation *****. | ||
| 2020.eval4nlp-1.8 Current ***** summarization evaluation ***** datasets are single-domain and focused on a few domains for which naturally occurring summaries can be easily found, such as news and scientific articles. | ||
| L16-1130 The most widely used metric in ***** summarization evaluation ***** has been the ROUGE family. | ||
| D18-1087 For that, we have analyzed a couple of datasets as a case study, using several variants of the ROUGE metric that are standard in ***** summarization evaluation *****. | ||
| byte pair encoding | 68 | |
| 2021.eacl-main.159 We introduce a data augmentation technique based on ***** byte pair encoding ***** and a BERT-like self-attention model to boost performance on spoken language understanding tasks. | ||
| 2018.iwslt-1.14 We morphologically segmented Basque text with a novel approach that only requires a dictionary such as those used by spell checkers and proved that this segmentation approach outperforms the widespread ***** byte pair encoding ***** strategy for this task. | ||
| 2020.nlp4convai-1.7 Other contributing factors include the joint modeling of dialogue context and response, and the 100% tokenization coverage from the ***** byte pair encoding ***** (BPE). | ||
| 2020.sltu-1.13 character n-grams, morphemes obtained by unsupervised morphological segmentation and ***** byte pair encoding *****. | ||
| 2021.ranlp-1.17 We hypothesize that sub-word representations based on ***** byte pair encoding ***** (Sennrich et al., 2016) can be leveraged to represent morphologically-complex Wolastoqey words and overcome the challenge of not having large corpora available for training. | ||
| caption generation | 68 | |
| 2020.coling-main.280 We propose a way to build an image-specific representation of the geographic context and adapt the ***** caption generation ***** network to produce appropriate geographic names in the image descriptions. | ||
| D19-1517 We also establish a baseline of step ***** caption generation ***** for future comparison. | ||
| N18-1114 Instantiated in a model for image-***** caption generation *****, TPGN outperforms LSTM baselines when evaluated on the COCO dataset. | ||
| 2021.emnlp-main.419 Experimental results, including detailed ablation studies, on two large-scale publicly available datasets show that JoGANIC substantially outperforms state-of-the-art methods both on ***** caption generation ***** and named entity related metrics. | ||
| P17-1007 In recent years word-embedding models have gained great popularity due to their remarkable performance on several tasks, including word analogy questions and ***** caption generation *****. | ||
| dynamic oracle | 68 | |
| Q19-1018 We present a new cubic-time algorithm to calculate the optimal next step in shift-reduce dependency parsing, relative to ground truth, commonly referred to as ***** dynamic oracle *****. | ||
| Q13-1033 Experimental evaluation on a wide range of data sets clearly shows that using ***** dynamic oracle *****s to train greedy parsers gives substantial improvements in accuracy. | ||
| D18-1161 We introduce novel ***** dynamic oracle *****s for training two of the most accurate known shift-reduce algorithms for constituent parsing: the top-down and in-order transition-based parsers. | ||
| P17-1027 As a non-monotonic system requires exploration of erroneous actions during the training process, we develop several non-monotonic variants of the recently defined ***** dynamic oracle ***** for the Covington parser, based on tight approximations of the loss. | ||
| J17-2002 Training our model with ***** dynamic oracle *****s yields a linear-time greedy parser with very competitive performance. | ||
| meta-learning | 68 | |
| 2021.acl-tutorials.3 In the tutorial, we will first introduce *****Meta-learning***** approaches and the theory behind them, and then review the works of applying this technology to NLP problems. | ||
| 2021.acl-long.409 Aiming to further close this gap, we propose a model of semantic memory for WSD in a *****meta-learning***** setting. | ||
| 2021.eacl-main.109 To tackle this problem, we proposed to i) apply a designated *****meta-learning***** method to train the model; ii) regularize attention scores with alignment statistics; iii) apply a smoothing technique in pretraining. | ||
| 2021.naacl-main.425 The proposed method outperforms transfer learning and *****meta-learning***** baselines. | ||
| 2021.emnlp-main.480 Specifically, we tackle the weakly-supervised paraphrase generation problem by: (1) obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion; and (2) developing a *****meta-learning***** framework to progressively select valuable samples for fine-tuning a pre-trained language model BART on the sentential paraphrasing task. | ||
| methodological | 67 | |
| 2021.mtsummit-at4ssl.9 We present a number of ***** methodological ***** recommendations concerning the online evaluation of avatars for text-to-sign translation, focusing on the structure, format and length of the questionnaire, as well as methods for eliciting and faithfully transcribing responses | ||
| 2011.mtsummit-tutorials.1 Over the past twenty years, we have attacked the historical ***** methodological ***** barriers between statistical machine translation and traditional models of syntax, semantics, and structure. | ||
| 2021.clpsych-1.6 We discuss the challenges in creating and validating lexicons in a new language, and highlight our ***** methodological ***** considerations in the data-driven lexicon construction process. | ||
| 2020.acl-main.744 Introducing the benefits of structure to inform neural models presents a ***** methodological ***** challenge | ||
| 2020.findings-emnlp.259 We present a *****methodological***** framework for inferring symmetry of verb predicates in natural language . | ||
| pseudo | 67 | |
| L14-1670 The insights taken from the ***** pseudo ***** data experiments can be used to predict how the method works with real data. | ||
| 2020.emnlp-main.599 Self-training is widely used for UDA, and it predicts ***** pseudo ***** labels on the target domain data for training. | ||
| 2021.emnlp-main.222 In each iteration, we first construct a keyword graph, so the task of assigning ***** pseudo ***** labels is transformed to annotating keyword subgraphs. | ||
| I17-2005 We also perform ***** pseudo ***** active learning to investigate the applicability of active learning in analyzing syllables. | ||
| I17-1066 When these document and sentence embeddings are used for sentiment classification, we find that with both ***** pseudo ***** and external sentiment lexicons, our proposed methods can perform similarly to or better than several highly competitive domain adaptation methods on a benchmark dataset of product reviews | ||
| Similar | 67 | |
| 2020.wmt-1.47 This paper describes the participation of the NLP research team of the IPN Computer Research center in the WMT 2020 ***** Similar ***** Language Translation Task. | ||
| P18-1059 ***** Similar ***** effects on Text Simplification further support our claims | ||
| 2020.vardial-1.1 This paper presents the results of the VarDial Evaluation Campaign 2020 organized as part of the seventh workshop on Natural Language Processing (NLP) for *****Similar***** Languages, Varieties and Dialects (VarDial), co-located with COLING 2020. | ||
| W17-1201 We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for *****Similar***** Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL'2017. | ||
| W19-1401 In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for *****Similar***** Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019. | ||
| lexicographic | 67 | |
| 2020.framenet-1.1 Framenets as an incarnation of frame semantics have been set up to deal with ***** lexicographic ***** issues (cf. | ||
| 2020.iwltp-1.11 Created in 2017, with the aim of building communities of voluntary contributors around African native and/or national languages, cultures, NLP technologies and artificial intelligence, the NTeALan association has set up a series of web collaborative platforms intended to allow the aforementioned communities to create and manage their own ***** lexicographic ***** and linguistic resources. | ||
| L12-1616 Given the different available resources and ***** lexicographic ***** traditions within the CPLP countries, a range of different solutions was adopted for different countries and integrated into a common development framework. | ||
| 2016.gwc-1.31 We describe an automated generator of accurate candidate adverbs, and introduce the ***** lexicographic ***** procedures which will ensure high consistency of wordnet editors' decisions about adverbs | ||
| L10-1161 This paper introduces a new *****lexicographic***** resource, the MuLeXFoR database, which aims to present word-formation processes in a multilingual environment. | ||
| queries | 67 | |
| W18-2320 We select a suitable subset of MeSH terms as ***** queries *****, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. | ||
| L08-1004 In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning ***** queries ***** before submitting them to a web search engine. | ||
| L08-1370 We ran the subset as ***** queries ***** against the complete list using several matchers, created adjudication pools, adjudicated the results, and compiled two versions of ground truth based on different sets of adjudication guidelines and methods for resolving adjudicator conflicts. | ||
| 2020.findings-emnlp.167 We propose and test two methods: (1) supervised attention; (2) adopting an auxiliary objective of disambiguating references in the input ***** queries ***** to table columns. | ||
| 2021.acl-long.442 It includes 20,604 labels for pairs of natural language ***** queries ***** and codes, each annotated by at least 3 human annotators | ||
| vocabulary | 67 | |
| 2020.lrec-1.43 Our main concerns were that ***** vocabulary ***** in language learning materials might be sparse, i.e. that not all ***** vocabulary ***** items that belong to a particular level would also occur in materials for that level, and, on the other hand, that ***** vocabulary ***** items might be used on lower-level materials if required by the topic (e.g. with a simpler paraphrasing or translation). | ||
| 2021.emnlp-main.446 We show that feed-forward layers in transformer-based language models operate as key-value memories, where each key correlates with textual patterns in the training examples, and each value induces a distribution over the output ***** vocabulary *****. | ||
| 2020.lrec-1.877 Multiple-choice cloze (fill-in-the-blank) questions are widely used in knowledge testing and are commonly used for testing ***** vocabulary ***** knowledge. | ||
| W17-5030 However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what ***** vocabulary ***** is challenging for them. | ||
| 2011.iwslt-evaluation.22 After the construction of the phrase table the actual SMT ***** vocabulary ***** can be less than the training data ***** vocabulary ***** | ||
| unsupervised parsing | 67 | |
| 2020.coling-main.227 Despite its difficulty, ***** unsupervised parsing ***** is an interesting research direction because of its capability of utilizing almost unlimited unannotated text data. | ||
| P19-1338 In our work, we propose an imitation learning approach to ***** unsupervised parsing *****, where we transfer the syntactic knowledge induced by PRPN to a Tree-LSTM model with discrete parsing actions. | ||
| 2021.eacl-tutorials.1 In this tutorial, we will introduce to the general audience what ***** unsupervised parsing ***** does and how it can be useful for and beyond syntactic parsing. | ||
| D19-6123 In addition, we show that, by sharing parameters between the related languages German and English, we can improve the model's ***** unsupervised parsing ***** F1 score by up to 4% in the low-resource setting. | ||
| 2020.aacl-main.43 Here, we propose a novel fully ***** unsupervised parsing ***** approach that extracts constituency trees from PLM attention heads | ||
| internet | 67 | |
| W19-3701 It can be applied to other low-resourced inflectional languages which have ***** internet ***** corpora and linguistic descriptions of their inflection system, following the example of inflection tables for Ukrainian. | ||
| 2021.germeval-1.11 Spreading ones opinion on the ***** internet ***** is becoming more and more important. | ||
| 2001.mtsummit-papers.14 This paper describes interNOSTRUM, a Spanish-Catalan machine translation system currently under development that achieves great speed through the use of finite-state technologies (so that it may be integrated with ***** internet ***** browsing) and a reasonable accuracy using an advanced morphological transfer strategy (to produce fast translation drafts ready for light postedition). | ||
| 2020.lrec-1.605 Today, recommender systems are an inevitable part of everyone's daily digital routine and are present on most ***** internet ***** platforms. | ||
| 2019.icon-1.11 Increased ***** internet ***** bandwidth at low cost is leading to the creation of large volumes of unstructured data. | ||
| winograd schema challenge | 67 | |
| 2020.acl-main.671 We propose a self-supervised method to solve Pronoun Disambiguation and *****Winograd Schema Challenge***** problems. | ||
| 2021.crac-1.1 Then we focus on recent progress on hard pronoun coreference resolution problems (e.g., *****Winograd Schema Challenge*****) to analyze how well current models can understand commonsense. | ||
| 2020.emnlp-main.664 Performance on the *****Winograd Schema Challenge***** (WSC), a respected English commonsense reasoning benchmark, recently rocketed from chance accuracy to 89% on the SuperGLUE leaderboard, with relatively little corroborating evidence of a correspondingly large improvement in reasoning ability. | ||
| N18-4004 We introduce an automatic system that performs well on two common-sense reasoning tasks, the *****Winograd Schema Challenge***** (WSC) and the Choice of Plausible Alternatives (COPA). | ||
| 2020.coling-main.515 The *****Winograd Schema Challenge***** (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR). | ||
| TER | 66 | |
| 2006.amta-papers.25 We show that the single-reference variant of ***** TER ***** correlates as well with human judgments of MT quality as the four-reference variant of BLEU. | ||
| D19-1309 Extensive experiments on four public datasets demonstrate the proposed method achieves state-of-the-art results, outperforming previous generative architectures on both automatic metrics (BLEU, METEOR, and ***** TER *****) and human evaluations. | ||
| 2010.amta-papers.8 Our findings show that the models are complementary, and their combination achieve an increase of 1% in BLEU and a reduction of nearly 2% in ***** TER *****. | ||
| L14-1694 Two of them were used in Statistical Machine Translation (SMT) experiments, obtaining very similar qualitative scores in terms of BLEU and ***** TER ***** and therefore a thorough evaluation of both has been carried out. | ||
| W19-5359 Metrics such as BLEU and ***** TER ***** have been used for decades | ||
| mBERT | 66 | |
| 2021.emnlp-main.745 Based on treebank size and available ELMo models, we select Hungarian, Uyghur (a zero-shot language for ***** mBERT *****) and Vietnamese. | ||
| 2021.blackboxnlp-1.15 Visualisations reveal that ***** mBERT ***** loses the ability to cluster representations by language after fine-tuning, a result that is supported by evidence from language identification experiments. | ||
| D19-1077 We compare ***** mBERT ***** with the best-published methods for zero-shot cross-lingual transfer and find ***** mBERT ***** competitive on each task. | ||
| 2021.adaptnlp-1.12 Models such as ***** mBERT ***** and XLMR have shown success in solving Code-Mixed NLP tasks even though they were not exposed to such text during pretraining. | ||
| 2020.coling-main.105 We probe the layers in multilingual BERT (***** mBERT *****) for phylogenetic and geographic language signals across 100 languages and compute language distances based on the ***** mBERT ***** representations | ||
| grammaticality | 66 | |
| L10-1292 Using the grammar checker as an evaluation tool gives a complementary picture to standard metrics such as Bleu, which do not account well for ***** grammaticality *****. | ||
| 2020.sigdial-1.3 We also report a human qualitative evaluation of the final model showing that it achieves high naturalness, semantic coherence and ***** grammaticality *****. | ||
| 2020.acl-srw.33 Besides its biological inspiration, our model also shows competitive performance relative to LSTMs on subject-verb agreement, sentence ***** grammaticality *****, and language modeling tasks. | ||
| W18-1704 This paper describes a new Integer Linear Programming method for MSC using a vertex-labeled graph to select different keywords, and novel 3-gram scores to generate more informative sentences while maintaining their ***** grammaticality *****. | ||
| P19-1161 By evaluating our approach using four different languages, we show that, on average, it reduces gender stereotyping by a factor of 2.5 without any sacrifice to ***** grammaticality ***** | ||
| Spatial | 66 | |
| Q18-1010 ***** Spatial ***** understanding is crucial in many real-world problems, yet little progress has been made towards building representations that capture spatial knowledge. | ||
| 2020.lrec-1.288 ***** Spatial ***** relations between objects can either be explicit – expressed as spatial prepositions, or implicit – expressed by spatial verbs such as moving, walking, shifting, etc. | ||
| 2020.splu-1.6 ***** Spatial ***** expressions (or triggers) are mainly used to describe the positioning of radiographic findings or medical devices with respect to some anatomical structures | ||
| P19-1025 *****Spatial***** aggregation refers to merging of documents created at the same spatial location. | ||
| 2020.lrec-1.717 *****Spatial***** Reasoning from language is essential for natural language understanding. | ||
| recursive | 66 | |
| 2020.findings-emnlp.208 We model the ***** recursive ***** production property of context-free grammars for natural and synthetic languages. | ||
| S19-2015 This paper describes our ***** recursive ***** system for SemEval-2019 Task 1: Cross-lingual Semantic Parsing with UCCA. | ||
| E17-1002 In contrast, the advantages of ***** recursive ***** networks include that they explicitly model the compositionality and the ***** recursive ***** structure of natural language. | ||
| P18-1184 Results on two public Twitter datasets demonstrate that our ***** recursive ***** neural models 1) achieve much better performance than state-of-the-art approaches; 2) demonstrate superior capacity on detecting rumors at very early stage. | ||
| D19-1545 Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and ***** recursive ***** blocks, rather than as individual code tokens | ||
| Contrastive | 66 | |
| 2021.naacl-main.427 To this end, we propose Supporting Clustering with ***** Contrastive ***** Learning (SCCL) – a novel framework to leverage contrastive learning to promote better separation. | ||
| 2021.acl-long.157 This paper aims to solve this problem by proposing a novel model called LoopCAG, which connects ***** Contrastive ***** constraints and Attention Guidance in a Loop manner, engaged explicit spatial and temporal constraints to the generating process. | ||
| 2021.emnlp-main.120 *****Contrastive***** explanations clarify why an event occurred in contrast to another. | ||
| 2021.acl-long.196 However, existing approaches in NLP mainly focus on WHY A rather than contrastive WHY A NOT B, which is shown to be able to better distinguish confusing candidates and improve data efficiency in other research fields. In this paper, we focus on generating contrastive explanations with counterfactual examples in NLI and propose a novel Knowledge-Aware *****Contrastive***** Explanation generation framework (KACE). Specifically, we first identify rationales (i.e., key phrases) from input sentences, and use them as key perturbations for generating counterfactual examples. | ||
| 2021.bionlp-1.1 *****Contrastive***** learning has been used to learn a high-quality representation of the image in computer vision. | ||
| Abusive | 66 | |
| 2021.bsnlp-1.3 ***** Abusive ***** phenomena are commonplace in language on the web. | ||
| 2020.lrec-1.191 *****Abusive***** texts are reaching the interests of the scientific and social community. | ||
| 2021.woah-1.20 *****Abusive***** language is a growing phenomenon on social media platforms. | ||
| 2021.eacl-main.179 *****Abusive***** language in online discourse negatively affects a large number of social media users. | ||
| D19-5002 *****Abusive***** text is a serious problem in social media and causes many issues among users as the number of users and the content volume increase. | ||
| posterior | 66 | |
| W19-4732 We find that GP models fare better in terms of some key ***** posterior ***** predictive checks than models that do not express covariance between sound changes, and outline directions for future work. | ||
| 2020.lrec-1.340 We project the existing annotations in rich-resource languages by means of Neural Machine Translation (NMT) and ***** posterior ***** word alignments. | ||
| 2021.eacl-main.98 It is nevertheless challenging to train and often results in a trivial local optimum where the latent variable is ignored and its ***** posterior ***** collapses into the prior, an issue known as ***** posterior ***** collapse. | ||
| P18-1069 Experimental results show that our confidence model significantly outperforms a widely used method that relies on ***** posterior ***** probability, and improves the quality of interpretation compared to simply relying on attention scores. | ||
| 2020.coling-main.216 However, an issue known as posterior collapse (or KL loss vanishing) happens when the VAE is used in text modelling, where the approximate *****posterior***** collapses to the prior, and the model will totally ignore the latent variables and be degraded to a plain language model during text generation. | ||
| neural summarization | 66 | |
| 2020.coling-main.495 Then, we iteratively train our summarization model on each single-document to alleviate the computational complexity issue that occurs while training ***** neural summarization ***** models in multiple documents (i.e., long sequences) at once. | ||
| D19-1327 Despite the recent developments on ***** neural summarization ***** systems, the underlying logic behind the improvements from the systems and its corpus-dependency remains largely unexplored. | ||
| 2020.eval4nlp-1.1 Comparative analysis revealed that two ***** neural summarization ***** systems leveraging pre-trained models have an advantage in decreasing grammaticality errors, but not necessarily factual errors. | ||
| W17-4513 Experimental results on news stories and opinion articles indicate that ***** neural summarization ***** model benefits from pre-training based on extractive summaries. | ||
| 2020.acl-main.458 On two separate datasets collected from hospitals, we show via both automatic and human evaluation that the proposed approach substantially improves the factual correctness and overall quality of outputs over a competitive ***** neural summarization ***** system, producing radiology summaries that approach the quality of human-authored ones | ||
| events | 66 | |
| L16-1590 Additionally to these core ***** events *****, symptomatic and treatment ***** events ***** have been annotated. | ||
| 2020.alta-1.9 Existing approaches in this realm are limited to the extraction of low-level relations among individual ***** events *****. | ||
| L10-1138 In this paper, we report on a study that was performed within the Semantics of History project on how descriptions of historical ***** events ***** are realized in different types of text and what the implications are for modeling the event information. | ||
| L14-1091 Here, we extend the distant supervision approach to template-based event extraction, focusing on the extraction of passenger counts, aircraft types, and other facts concerning airplane crash ***** events *****. | ||
| W19-9006 Focus being not just on foreign language tuition, but above all on people, places and ***** events ***** in the history and culture of the EU member states, the annotation modules of the e-Platform have been accordingly extended. | ||
| mental health | 66 | |
| W17-3105 Our results indicate that, overall, research participants were enthusiastic about the possibility of using social media (in conjunction with automated Natural Language Processing algorithms) for mood tracking under the supervision of a ***** mental health ***** practitioner. | ||
| 2021.clpsych-1.10 Based on research in ***** mental health ***** studies linking self-harm tendencies with suicide, in our system, we attempt to characterize self-harm aspects expressed in user tweets over a period of time. | ||
| W18-5621 Natural Language Processing (NLP) methods can be used to extract this data, in order to identify symptoms and treatments from ***** mental health ***** records, and temporally anchor the first emergence of these. | ||
| 2021.eacl-main.205 Recent psychological studies indicate that individuals exhibiting suicidal ideation increasingly turn to social media rather than ***** mental health ***** practitioners. | ||
| 2021.acl-short.133 Among social media platforms, Reddit has emerged as the most promising one due to its anonymity and its focus on topic-based communities (subreddits) that can be indicative of someone's state of mind or interest regarding ***** mental health ***** disorders such as r/SuicideWatch, r/Anxiety, r/depression. | ||
| winograd schema | 66 | |
| 2020.acl-main.671 We propose a self-supervised method to solve Pronoun Disambiguation and *****Winograd Schema***** Challenge problems. | ||
| 2021.crac-1.1 Then we focus on recent progress on hard pronoun coreference resolution problems (e.g., *****Winograd Schema***** Challenge) to analyze how well current models can understand commonsense. | ||
| 2020.emnlp-main.664 Performance on the *****Winograd Schema***** Challenge (WSC), a respected English commonsense reasoning benchmark, recently rocketed from chance accuracy to 89% on the SuperGLUE leaderboard, with relatively little corroborating evidence of a correspondingly large improvement in reasoning ability. | ||
| N18-4004 We introduce an automatic system that performs well on two common-sense reasoning tasks, the *****Winograd Schema***** Challenge (WSC) and the Choice of Plausible Alternatives (COPA). | ||
| 2020.coling-main.515 The *****Winograd Schema***** Challenge (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR). | ||
| political | 66 | |
| 2021.nlp4posimpact-1.14 We also release the first publicly available data set at the intersection of geo***** political ***** relations and a raging pandemic in the context of India and Pakistan. | ||
| 2021.case-1.3 Classification projects need to be anchored in the theoretical interests of scholars of ***** political ***** violence if the data they produce are to be put to analytical use. | ||
| 2021.eacl-main.165 Populist rhetoric has risen across the ***** political ***** sphere in recent years; however, due to its complex nature, computational approaches to it have been scarce. | ||
| 2021.acl-srw.31 The events that took place at the Unite the Right rally held in Charlottesville, Virginia on August 11-12, 2017 caused intense reaction on social media from users across the ***** political ***** spectrum. | ||
| 2021.cinlp-1.5 In this randomized experiment, Americans from the Democratic and Republican parties were either randomly paired with one-another to have an anonymous conversation about politics or alternatively not assigned to a conversation — change in ***** political ***** polarization over time was measured for all participants. | ||
| lstm-crf | 66 | |
| R19-1127 We propose a morphologically informed model for named entity recognition, which is based on *****LSTM-CRF***** architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. | ||
| 2021.semeval-1.138 And then we construct Bidirectional Long Short Term Memory-Conditional Random Field (Bi-*****LSTM-CRF*****) model by Baidu research to predict whether each word in the sentence is toxic or not. | ||
| P19-2029 We propose a novel shared Bi-*****LSTM-CRF***** model to fuse linguistic features efficiently by sharing the LSTM network during the training procedure. | ||
| D19-1399 In this work, we propose a simple yet effective dependency-guided *****LSTM-CRF***** model to encode the complete dependency trees and capture the above properties for the task of named entity recognition (NER). | ||
| 2020.ccl-1.99 Then, we present the framework of Bi-*****LSTM-CRF***** EDUs recognition model using word embedding, POS and syntactic features, which can combine the advantage of CRF and Bi-LSTM. | ||
| decomposition | 65 | |
| 2020.emnlp-main.711 State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition, graph-based reasoning, and question ***** decomposition *****. | ||
| D19-5613 Higher quality of information ***** decomposition ***** corresponds to higher performance in terms of bilingual evaluation understudy (BLEU) between output and human-written reformulations. | ||
| 2020.ngt-1.2 This is achieved by applying adversarial training with a latent ***** decomposition ***** scheme. | ||
| Q13-1018 Finally, we employ dual ***** decomposition ***** techniques to produce consistent syntactic and predicate-argument structures while searching over a large space of syntactic configurations. | ||
| J18-1004 Our system includes a cache with fixed size m, and we characterize the relationship between the parameter m and the class of graphs that can be produced through the graph-theoretic concept of tree ***** decomposition ***** | ||
| functionalities | 65 | |
| 2020.coling-main.474 However, despite these advances, there are still desirable ***** functionalities ***** missing from the fact-checking pipeline. | ||
| L12-1368 In this paper we introduce the annotation ***** functionalities ***** of ANALEC, some of the annotated data visualization ***** functionalities *****, and three statistical modules: frequency, correlation and geometrical representations. | ||
| L06-1433 We present an overview of the algorithm and its ***** functionalities *****. | ||
| 2020.lrec-1.871 When a module is put into xtsv, all ***** functionalities ***** of the system are immediately available for that module, and the module can be a part of an xtsv pipeline. | ||
| 2021.naacl-industry.2 In this paper, we address annotation conflict resolution for Natural Language Understanding (NLU), a structured prediction task, in a real-world setting of commercial voice-controlled personal assistants, where (1) regular data collections are needed to support new and existing ***** functionalities *****, (2) annotation guidelines evolve over time, and (3) the pool of annotators change across data collections | ||
| matrix | 65 | |
| L16-1536 Inspired by observations in (Mikolov et al., 2013b), which show that training their word vector model on comparable corpora yields comparable vector space representations of those corpora, reducing the problem of translating words to finding a rotation ***** matrix *****, and results in (Zou et al., 2013), which showed that bilingual word embeddings can improve Chinese Named Entity Recognition (NER) and English to Chinese phrase translation, we use the sentence-aligned English-French EuroParl corpora and show that word embeddings extracted from a merged corpus (corpus resulted from the merger of the two aligned corpora) can be used to NE translation. | ||
| 2021.acl-srw.36 The mechanism synchronizes source-side and target-side syntactic self-attentions by minimizing the difference between target-side self-attentions and the source-side self-attentions mapped by the encoder-decoder attention ***** matrix *****. | ||
| W17-2621 Beyond the popular vector space models, ***** matrix ***** representations for words have been proposed, since then, ***** matrix ***** multiplication can serve as natural composition operation. | ||
| 2020.osact-1.13 We report confusion ***** matrix *****, accuracy, precision, recall and F1 of the development set and report summarized results of the test set. | ||
| 2020.coling-main.321 A new set of word vectors is generated by a spectral decomposition of the similarity ***** matrix *****, which has a linear algebraic analytic form | ||
| Document | 65 | |
| L08-1258 In this paper we present a new ***** Document ***** Management System called DrStorage. | ||
| L08-1445 A LMF compliant schema implemented in a ***** Document ***** Type Definition (DTD) describing the lexical resources is taken by the system to automatically configure the platform. | ||
| P18-1149 *****Document***** date is essential for many important tasks, such as document retrieval, summarization, event detection, etc. | ||
| P18-1188 *****Document***** modeling is essential to a variety of natural language understanding tasks. | ||
| 2020.aacl-main.62 *****Document***** alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. | ||
| contextualized embeddings | 65 | |
| N19-1078 To address this drawback, we propose a method in which we dynamically aggregate ***** contextualized embeddings ***** of each unique string that we encounter. | ||
| W19-4509 In most cases, ***** contextualized embeddings ***** do also not show an improvement on the score achieved by pre-defined embeddings. | ||
| R19-1015 In this paper, we introduce QBERT, a Transformer-based architecture for ***** contextualized embeddings ***** which makes use of a co-attentive layer to produce more deeply bidirectional representations, better-fitting for the WSD task. | ||
| 2021.mwe-1.4 In this paper we propose a supervised model based on ***** contextualized embeddings ***** for predicting whether usages of PIEs are idiomatic or literal | ||
| 2020.gebnlp-1.6 Furthermore, we analyze the effect of the debiasing techniques on downstream tasks which show a negligible impact on traditional embeddings and a 2% decrease in performance in ***** contextualized embeddings *****. | ||
| multiple languages | 65 | |
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for ***** multiple languages *****, improving upon individually trained models for each language. | ||
| C18-1213 Thereby, we confirm that the approach works for ***** multiple languages *****. | ||
| 2021.nodalida-main.16 This article studies register classification of documents from the unrestricted web, such as news articles or opinion blogs, in a multilingual setting, exploring both the benefit of training on ***** multiple languages ***** and the capabilities for zero-shot cross-lingual transfer. | ||
| 2021.calcs-1.9 Because of globalization, it is becoming more and more common to use ***** multiple languages ***** in a single utterance, also called code-switching. | ||
| Q19-1028 We conducted an exhaustive evaluation using data sets targeting ***** multiple languages ***** and prediction task types, to compare the proposed model with traditional, state-of-the-art, and other neural network strategies. | ||
| deep neural network | 65 | |
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on ***** deep neural network *****s, which makes decisions about form and content in one go without explicit feature extraction. | ||
| 2020.emnlp-main.255 To demystify the “black box” property of ***** deep neural network *****s for natural language processing (NLP), several methods have been proposed to interpret their predictions by measuring the change in prediction probability after erasing each token of an input. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by ***** deep neural network *****s (DNN) that can utilize the contextual information to encode the input into latent variables, and a decoder which is a generative model able to reconstruct the input. | ||
| N18-1044 In this paper, we propose a ***** deep neural network ***** diachronic distributional model. | ||
| 2020.vardial-1.23 From simple models for regression, such as Support Vector Regression, to ***** deep neural network *****s, such as Long Short-Term Memory networks and character-level convolutional neural networks, and, finally, to ensemble models based on meta-learners, such as XGBoost, our interest is focused on approaching the problem from a few different perspectives, in an attempt to minimize the prediction error. | ||
| argument mining | 65 | |
| 2020.lrec-1.143 Our corpus can be used as a resource for analyzing persuasiveness and training an ***** argument mining ***** system to identify and extract argument structures. | ||
| P17-1144 Drafts are manually aligned at the sentence level, and the writer's purpose for each revision is annotated with categories analogous to those used in ***** argument mining ***** and discourse analysis. | ||
| 2021.eacl-main.55 Non-neural approaches to ***** argument mining ***** (AM) are often pipelined and require heavy feature-engineering. | ||
| C18-1176 Finally, we adapt our system to solve a recent ***** argument mining ***** task of identifying argumentative sentences in Web texts retrieved from heterogeneous sources, and obtain F1 scores comparable to the supervised baseline. | ||
| 2021.argmining-1.1 In this paper, we propose a novel problem formulation to mine arguments from Twitter: We formulate ***** argument mining ***** on Twitter as a text classification task to identify tweets that serve as premises for a hashtag that represents a claim of interest. | ||
| parallel corpus filtering | 65 | |
| W19-4309 Our approach shows promising performance on sentence alignment recovery and the WMT 2018 ***** parallel corpus filtering ***** tasks with only a single model. | ||
| W18-6486 The paper also documents Tilde's submissions to the WMT 2018 shared task on ***** parallel corpus filtering *****. | ||
| 2020.wmt-1.107 This paper describes the joint submission of Universitat d'Alacant and Prompsit Language Engineering to the WMT 2020 shared task on ***** parallel corpus filtering *****. | ||
| W18-6473 We describe Vicomtech's participation in the WMT 2018 Shared Task on ***** parallel corpus filtering *****. | ||
| W19-5441 This paper describes the University of Helsinki Language Technology group's participation in the WMT 2019 ***** parallel corpus filtering ***** task. | ||
| representation learning | 65 | |
| D19-1212 Multi-view learning algorithms are powerful ***** representation learning ***** tools, often exploited in the context of multimodal problems. | ||
| 2020.acl-main.588 The idea is to allow the dependency graph to guide the ***** representation learning ***** of the transformer encoder and vice versa. | ||
| 2021.nlp4convai-1.18 In this work, we aim to construct a robust sentence ***** representation learning ***** model, that is specifically designed for dialogue response generation, with Transformer-based encoder-decoder structure. | ||
| 2020.coling-main.22 To address this problem, we propose a modification to DIRL, obtaining a novel weighted domain-invariant ***** representation learning ***** (WDIRL) framework. | ||
| 2021.naacl-main.183 Experimentally, we show that combining variational ***** representation learning ***** and the LB-SOINN memory module achieves better performance than the commonly-used lifelong learning techniques. | ||
| syntactic and semantic | 65 | |
| P19-1423 Inter-sentence relation extraction deals with a number of complex semantic relationships in documents, which require local, non-local, ***** syntactic and semantic ***** dependencies. | ||
| 2021.law-1.4 The annotation tool combines ***** syntactic and semantic ***** cues to assign aspects on a sentence-by-sentence basis, following a sequence of rules that each output a UMR aspect. | ||
| 2021.eacl-main.66 We observe that both the use of reinforcement learning and the release from sequential constraints are beneficial to the quality of the ***** syntactic and semantic ***** parses. | ||
| 2020.coling-main.269 The size and detail of annotations make the test suite a valuable resource for natural language processing applications with ***** syntactic and semantic ***** tasks. | ||
| P17-5005 Our target audience are researchers and practitioners in machine learning, parsing (***** syntactic and semantic *****) and language technology, not necessarily experts in MWEs, who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication. | ||
| neural abstractive summarization | 65 | |
| W18-6545 Till now, ***** neural abstractive summarization ***** methods have achieved great success for single document summarization (SDS). | ||
| D18-1089 We attempted to verify the degree of abstractiveness of modern ***** neural abstractive summarization ***** systems by calculating overlaps in terms of various types of units. | ||
| 2020.emnlp-main.749 Pre-trained ***** neural abstractive summarization ***** systems have dominated extractive strategies on news summarization performance, at least in terms of ROUGE. | ||
| 2021.emnlp-main.334 It remains challenging for a state-of-the-art ***** neural abstractive summarization ***** model to generate a well-integrated summary sentence. | ||
| 2021.naacl-main.475 Despite significant progress in ***** neural abstractive summarization *****, recent studies have shown that the current models are prone to generating summaries that are unfaithful to the original context. | ||
| machine translation (MT) | 65 | |
| L12-1231 In recent years, *****machine translation (MT)***** research has focused on investigating how hybrid machine translation as well as system combination approaches can be designed so that the resulting hybrid translations show an improvement over the individual component translations. | ||
| 2021.acl-srw.33 It is reported that grammatical information is useful for *****machine translation (MT)***** task. | ||
| L14-1347 Human translators are the key to evaluating *****machine translation (MT)***** quality and also to addressing the so far unanswered question when and how to use MT in professional translation workflows. | ||
| 2020.loresmt-1.15 Statistical machine translation (SMT) which was the dominant paradigm in *****machine translation (MT)***** research for nearly three decades has recently been superseded by the end-to-end deep learning approaches to MT. | ||
| 2021.eacl-main.248 Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional *****machine translation (MT)***** systems for Spoken Language Translation. | ||
| deep | 65 | |
| 2020.lrec-1.259 We present a comparison between *****deep***** learning and traditional machine learning methods for various NLP tasks in Italian. | ||
| 2021.rocling-1.22 Due to the development of *****deep***** learning, the natural language processing tasks have made great progresses by leveraging the bidirectional encoder representations from Transformers (BERT). | ||
| 2021.acl-long.163 It is a common belief that training *****deep***** transformers from scratch requires large datasets. | ||
| C18-1255 Neural machine translation systems require a number of stacked layers for *****deep***** models. | ||
| 2020.lrec-1.220 As the demand for explainable *****deep***** learning grows in the evaluation of language technologies, the value of a principled grounding for those explanations grows as well. | ||
| subset | 64 | |
| W18-2320 We select a suitable ***** subset ***** of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. | ||
| D17-1084 Experiment results show that our method achieves an accuracy of 28.4% on the linear Dolphin18K benchmark, which is 10% (54% relative) higher than previous state-of-the-art systems while achieving an accuracy increase of 12% (59% relative) on the TS6 benchmark ***** subset *****. | ||
| L14-1588 We report the performance of these algorithms on a manually aligned ***** subset ***** of the data. | ||
| L10-1133 Phrase ranking can be done using either a fine-grained six-way scoring scheme that allows to differentiate between “much better” and “slightly better”, or a reduced ***** subset ***** of ranking choices. | ||
| N19-1039 Existing state-of-the-art UDA approaches use neural networks to learn representations that are trained to predict the values of ***** subset ***** of important features called “pivot features” on combined data from the source and target domains | ||
| Transformers | 64 | |
| 2021.acl-long.335 In the era of pre-trained language models, ***** Transformers ***** are the de facto choice of model architectures. | ||
| 2020.acl-main.38 In contrast to previous research which focuses on deep encoders, our approach additionally enables ***** Transformers ***** to also benefit from deep decoders. | ||
| 2021.emnlp-main.753 Following the success of dot-product attention in ***** Transformers *****, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. | ||
| 2020.sustainlp-1.17 A PyTorch-based implementation of SqueezeBERT is available as part of the Hugging Face ***** Transformers ***** library: https://huggingface.co/squeezeber | ||
| 2021.teachingnlp-1.18 Assignments are designed to be interactive, easily gradable, and to give students hands-on experience with several key types of structure (sequences, tags, parse trees, and logical forms), modern neural architectures (LSTMs and *****Transformers*****), inference algorithms (dynamic programs and approximate search) and training methods (full and weak supervision). | ||
| hence | 64 | |
| 2020.acl-main.194 Moreover, we leverage recent advances in data augmentation to guess low-entropy labels for unlabeled data, ***** hence ***** making them as easy to use as labeled data. | ||
| 2009.jeptalnrecital-recital.2 The latter one uses ***** hence ***** the same grammar rules for creation of the language models for these two different languages. | ||
| 2021.humeval-1.2 Based on these promising first results, we discuss future research directions for incorporating subjective human evaluations into language model training and to ***** hence ***** keep the human user in the loop during the development process. | ||
| L10-1282 The documentation must be of a kind that it enables the user to compare different tools offering the same service, ***** hence ***** the descriptions must contain measurable values. | ||
| D19-1204 When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, ***** hence ***** maintaining a natural interaction flow | ||
| Word2Vec | 64 | |
| 2021.wassa-1.21 We highlight a deep learning based approach for detecting emotions using bilingual word embeddings derived from FastText and ***** Word2Vec ***** approaches in Hindi-English code mixed tweets. | ||
| 2020.wanlp-1.11 In this work, we propose two embedding strategies that modify the tokenization phase of traditional word embedding models (***** Word2Vec *****) and contextual word embedding models (BERT) to take into account | ||
| D19-5305 While some, like ***** Word2Vec *****, are based on sequential text input, others are utilizing a graph representation of text. | ||
| C16-1257 We also explore the feasibility of automatic irony recognition by exploiting a varied set of features including lexical, syntactic, sentiment and semantic (***** Word2Vec *****) information. | ||
| W18-6247 We obtain a best performance of R = 0.23 using GloVe, and R = 0.22 using ***** Word2Vec ***** representations for manual and automatically transcribed texts respectively | ||
| cloze | 64 | |
| N19-1052 We formulate this as a ***** cloze ***** test, where the goal is to identify which of two advice-seeking questions was removed from a given narrative. | ||
| K17-1004 In addition, combining our stylistic features with language model predictions reaches state of the art performance on the story ***** cloze ***** challenge. | ||
| W17-0905 This paper analyzes the narrative event ***** cloze ***** test and its recent evolution. | ||
| 2020.coling-main.113 Experiments on a recently released Chinese idiom ***** cloze ***** test dataset show that our proposed method performs better than the existing state of the art. | ||
| 2020.starsem-1.10 Past work has probed BERT representations for this competence, finding that BERT can correctly retrieve noun hypernyms in ***** cloze ***** tasks | ||
| explainable | 64 | |
| 2020.findings-emnlp.355 It also enables an ***** explainable ***** generation process. | ||
| P19-1618 Rule-based models are attractive for various tasks because they inherently lead to interpretable and ***** explainable ***** decisions and can easily incorporate prior knowledge. | ||
| 2020.acl-main.97 Recently, many methods discover effective evidence from reliable sources by appropriate neural networks for ***** explainable ***** claim verification, which has been widely recognized. | ||
| D19-1334 Multi-hop knowledge graph (KG) reasoning is an effective and ***** explainable ***** method for predicting the target entity via reasoning paths in query answering (QA) task. | ||
| W18-2404 The developed framework can enable more ***** explainable ***** and generalizable spoken language understanding systems | ||
| localization | 64 | |
| I17-4033 To accomplish this task, we present an answer ***** localization ***** method to locate answers shown in web pages, considering structural information and semantic information both. | ||
| W03-3004 This solution relies on the ***** localization ***** of the linguistic objects in the context. | ||
| 2011.mtsummit-tutorials.4 The tutorial will provide an overview of current ***** localization ***** practices and challenges, with a special focus on the role of translation memory and translation management technologies. | ||
| 2010.amta-commercial.5 Over the last two years, Adobe Systems has incorporated Machine Translation with post-editing into the *****localization***** workflow. | ||
| 2012.amta-tutorials.2 This session will cover how to increase *****localization***** efficiency with a SYSTRAN desktop product and a server solution. | ||
| Dravidian | 64 | |
| 2021.dravidianlangtech-1.27 Among the known tasks related to offensive speech detection, this is the first task to detect offensive comments posted in social media comments in the ***** Dravidian ***** language. | ||
| 2021.wat-1.21 In this paper, we focus on subword segmentation and evaluate Linguistically Motivated Vocabulary Reduction (LMVR) against the more commonly used SentencePiece (SP) for the task of translating from English into four different ***** Dravidian ***** languages. | ||
| 2020.wildre-1.12 The Indian languages belongs to ***** Dravidian ***** language family such as Tamil, Telugu, Malayalam, Indo-Aryan language family such as Hindi, Punjabi, Bengali and Marathi, European languages such as English, Spanish, Dutch, German and Hungarian are used in this work. | ||
| 2021.dravidianlangtech-1.31 The task requires us to classify ***** Dravidian ***** languages collected from social media into Not-Offensive, Off-Untargeted, Off-Target-Individual, etc | ||
| 2021.dravidianlangtech-1.18 In this paper, we describe the GX system in the EACL2021 shared task on machine translation in *****Dravidian***** languages. | ||
| quantification | 64 | |
| W19-0403 This paper describes in brief the proposal called `QuantML' which was accepted by the International Organisation for Standards (ISO) last February as a starting point for developing a standard for the interoperable annotation of ***** quantification ***** phenomena in natural language, as part of the ISO 24617 Semantic Annotation Framework. | ||
| S17-2112 Specifically the proposed system participated both to tweet polarity classification (two-, three- and five class) and tweet ***** quantification ***** (two and five-class) tasks. | ||
| S17-2113 Basically, our study was aimed to analyze the effectiveness of a mixture of ***** quantification ***** technique with one of deep learning architecture. | ||
| 2021.emnlp-main.774 COVR focuses on questions that require complex reasoning, including higher-order operations such as ***** quantification ***** and aggregation. | ||
| W19-8667 We discuss what this exercise can teach us about the nature of ***** quantification ***** and about the challenges posed by the generation of quantified expressions | ||
| senses | 64 | |
| L14-1323 In this paper we tackle the problem of automatically annotating, with both word ***** senses ***** and named entities, the MASC 3.0 corpus, a large English corpus covering a wide range of genres of written and spoken text. | ||
| 2021.semeval-1.3 This task allows the largely under-investigated inherent ability of systems to discriminate between word ***** senses ***** within and across languages to be evaluated, dropping the requirement of a fixed sense inventory. | ||
| 2020.ldl-1.12 Links between lexemes in different languages can be made, e.g., through a derivation property or ***** senses *****. | ||
| L10-1586 These relations reflect the distribution of lexical unit ***** senses ***** with respect to the concepts in the ontology. | ||
| 2020.conll-1.21 In this paper we present three new neural models for learning density matrices from a corpus, and test their ability to discriminate between word ***** senses ***** on a range of compositional datasets | ||
| semantic parsers | 64 | |
| 2021.iwpt-1.4 Strong and affordable in-domain data is a desirable asset when transferring trained ***** semantic parsers ***** to novel domains. | ||
| P17-2098 In this paper, we propose to exploit structural regularities in language in different domains, and train ***** semantic parsers ***** over multiple knowledge-bases (KBs), while sharing information across datasets. | ||
| D19-6111 As a result, for effective post-deployment domain adaptation and personalization, ***** semantic parsers ***** are continuously retrained to learn new user vocabulary and paraphrase variety. | ||
| 2021.emnlp-main.472 The availability of corpora has led to significant advances in training ***** semantic parsers ***** in English. | ||
| P18-1168 Training ***** semantic parsers ***** from weak supervision (denotations) rather than strong supervision (programs) complicates training in two ways | ||
| feature | 64 | |
| P17-2063 Using a standard dataset, we first show that while ***** feature ***** performance is high, LID data is highly dimensional and mostly sparse (99.5%) as it includes large vocabularies for many languages; memory requirements grow as languages are added. | ||
| 2020.alta-1.4 0.09 under the best-performing settings; and (2) the performance of the ***** feature *****-based method can be further improved by ***** feature ***** selection. | ||
| W17-2703 Further, we claim that enhancing our system with deep learning techniques like ***** feature ***** ranking can achieve even better results, as it can benefit from both approaches. | ||
| P17-1173 We show via experiments on text chunking and relation extraction that this restructuring does indeed speed up ***** feature ***** extraction in practice by reducing redundant computation. | ||
| L06-1485 Its ***** feature *****s include synchronized multi-channel audio and video playback, compatibility with several corpora, platform independence, and mixed display of capabilities and a well-defined method for layering datasets. | ||
| dependency parser | 64 | |
| W19-6149 UniParse does this by enabling highly efficient, sufficiently independent, easily readable, and easily extensible implementations for all ***** dependency parser ***** components. | ||
| Q15-1035 We show how to train the fast ***** dependency parser ***** of Smith and Eisner (2008) for improved accuracy. | ||
| L12-1422 Finally, the converter is evaluated by assessing the impact of conversion on the performance of the ***** dependency parser *****. | ||
| 2021.acl-long.452 Third, we propose word-internal structure parsing as a new task, and conduct benchmark experiments using a competitive ***** dependency parser *****. | ||
| L14-1227 The result is the first publicly available ***** dependency parser ***** for Old French | ||
| received | 64 | |
| L14-1222 In recent years, the parsing of discontinuous structures has ***** received ***** a rising interest. | ||
| 2021.acl-long.520 While Question Answering over KG (KGQA) has ***** received ***** some attention from the research community, QA over Temporal KGs (Temporal KGQA) is a relatively unexplored area. | ||
| 2020.wanlp-1.9 We ***** received ***** 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams. | ||
| 2020.mwe-1.6 Strict quality control was applied for error limitation, i.e., each MT output sentence ***** received ***** first manual post editing and annotation plus second manual quality rechecking. | ||
| D19-1342 Aspect-level sentiment classification, which is a fine-grained sentiment analysis task, has ***** received ***** lots of attention these years. | ||
| large corpora | 64 | |
| C18-1290 Standard word embedding algorithms learn vector representations from ***** large corpora ***** of text documents in an unsupervised fashion. | ||
| L06-1505 This shows that deep large-coverage non-probabilistic parsers can be efficient enough to parse very ***** large corpora ***** in a reasonable amount of time. | ||
| 2002.amta-papers.18 The main stumbling block to applying these data driven techniques directly is that most of them require ***** large corpora ***** rarely available for such `new' languages. | ||
| 2020.semeval-1.140 Ours is a minimalist unsupervised system that uses word co-occurrence frequencies from ***** large corpora ***** to capture unexpectedness as a mean to capture funniness. | ||
| S17-1018 In this paper, we present a method to automatically learn argument role inventories for verbs from ***** large corpora ***** of text, images and videos. | ||
| complex | 64 | |
| K18-1001 However, the simple graphical model structure belies the often ***** complex ***** non-local constraints between output labels. | ||
| D19-5817 Our study suggests that while current metrics may be suitable for existing QA datasets, they limit the ***** complex *****ity of QA datasets that can be created. | ||
| L10-1262 The first approach uses ***** complex ***** tags that describe full words and does not require any word segmentation. | ||
| D19-1124 Existing works simply assume the Gaussian priors of the latent variable, which are incapable of representing ***** complex ***** latent variables effectively. | ||
| 2010.amta-srw.4 Hebrew and Arabic are related but mutually incomprehensible languages with *****complex***** morphology and scarce parallel corpora. | ||
| argument component | 64 | |
| 2020.coling-main.128 Argument mining systems often consider contextual information, i.e. information outside of an argumentative discourse unit, when trained to accomplish tasks such as *****argument component***** identification, classification, and relation extraction. | ||
| J17-3005 We identify *****argument components***** using sequence labeling at the token level and apply a new joint model for detecting argumentation structures. | ||
| P17-1002 Contrary to models that operate on the *****argument component***** level, we find that framing AM as dependency parsing leads to subpar performance results. | ||
| W18-5208 This paper focuses on *****argument component***** classification for transcribed spoken classroom discussions, with the goal of automatically classifying student utterances into claims, evidence, and warrants. | ||
| 2021.acl-long.497 Most existing methods determine argumentative relations by exhaustively enumerating all possible pairs of *****argument components*****, which suffer from low efficiency and class imbalance. | ||
| Chinese | 64 | |
| 2020.emnlp-main.518 Character-level BERT pre-trained in Chinese suffers a limitation of lacking lexicon information, which shows effectiveness for *****Chinese***** NER. | ||
| 2020.emnlp-main.318 Word-level information is important in natural language processing (NLP), especially for the *****Chinese***** language due to its high linguistic complexity. | ||
| L16-1536 Inspired by observations in (Mikolov et al., 2013b), which show that training their word vector model on comparable corpora yields comparable vector space representations of those corpora, reducing the problem of translating words to finding a rotation matrix, and results in (Zou et al., 2013), which showed that bilingual word embeddings can improve Chinese Named Entity Recognition (NER) and English to *****Chinese***** phrase translation, we use the sentence-aligned English-French EuroParl corpora and show that word embeddings extracted from a merged corpus (corpus resulted from the merger of the two aligned corpora) can be used to NE translation. | ||
| 2018.gwc-1.48 The present work seeks to make the logographic nature of *****Chinese***** script a relevant research ground in wordnet studies. | ||
| 2021.acl-long.161 Recent pretraining models in Chinese neglect two important aspects specific to the *****Chinese***** language: glyph and pinyin, which carry significant syntax and semantic information for language understanding. | ||
| explainability | 63 | |
| W19-0605 We discuss the advantages and disadvantages of using topological information, and some open problems such as ***** explainability ***** of the classifier decisions. | ||
| 2021.acl-srw.33 Although we could not obtain the high quality phrase structure in constituency parsing when evaluated monolingually, we find that the induced phrase structures enhance the ***** explainability ***** of translation through the synchronization constraint. | ||
| 2020.nl4xai-1.7 In order to increase trust in the usage of Bayesian Networks and to cement their role as a model which can aid in critical decision making, the challenge of ***** explainability ***** must be faced. | ||
| 2021.ranlp-1.67 Meanwhile, we generate the lexicon consists of sentiment word based on the ***** explainability ***** score. | ||
| D19-1565 A further issue with most existing systems is the lack of ***** explainability ***** | ||
| Markov | 63 | |
| P18-2083 In the first stage, we train PhraseCTM, which models the generation of words and phrases simultaneously by linking the phrases and component words within ***** Markov ***** Random Fields when they are semantically coherent. | ||
| Q14-1036 We address the difficulty of inference through an online algorithm which uses a hybrid of ***** Markov ***** chain Monte Carlo and variational inference. | ||
| W18-3704 We finally use the results of sub-sequence analysis method to generate a tutorial ***** Markov ***** process for effective tutorial sessions. | ||
| D18-1378 For discourse relations, Limbic adopts a generative process regularized by a ***** Markov ***** Random Field. | ||
| N19-1141 We formulate fake news detection as an inference problem in a ***** Markov ***** random field (MRF) which can be solved by the iterative mean-field algorithm | ||
| derivation | 63 | |
| L14-1416 After a brief summarization of theoretical descriptions of Czech ***** derivation ***** and the state of the art of NLP approaches to Czech ***** derivation *****, we discuss the linguistic background of the network and introduce the formal structure of the network and the semi-automatic annotation procedure. | ||
| 2020.ldl-1.12 Links between lexemes in different languages can be made, e.g., through a ***** derivation ***** property or senses. | ||
| W19-0402 We hypothesize that a divide-and-conquer approach to semantic parsing starting with ***** derivation ***** of ULFs will lead to semantic analyses that do justice to subtle aspects of linguistic meaning, and will enable construction of more accurate semantic parsers. | ||
| 2021.emnlp-main.352 Standard techniques for training such a policy require an oracle ***** derivation ***** for each generation, and we prove that finding the shortest such ***** derivation ***** can be reduced to parsing under a particular weighted context-free grammar | ||
| K19-1023 We present a new method for transition-based parsing where a solution is a pair made of a dependency tree and a *****derivation***** graph describing the construction of the former. | ||
| node | 63 | |
| P18-3011 Our algorithm outperforms the state of the art SAS method by 1.7% F1 score in ***** node ***** prediction. | ||
| S18-2032 It is more expressive, as different combination functions can be used for each child ***** node *****. | ||
| P19-1628 We revisit a popular graph-based ranking algorithm and modify how ***** node ***** (aka sentence) centrality is computed in two ways: (a) we employ BERT, a state-of-the-art neural representation learning model to better capture sentential meaning and (b) we build graphs with directed edges arguing that the contribution of any two ***** node *****s to their respective centrality is influenced by their relative position in a document. | ||
| 2021.emnlp-main.826 This characteristic is not suitable for span-based parsing models because they predict ***** node ***** labels independently. | ||
| 2004.jeptalnrecital-long.24 Therefore, an alternative extension of TAG is introduced based on the notion of ***** node ***** sharing | ||
| Slovene | 63 | |
| L14-1642 By running the tool for 235 days we tested it on the task of collecting two monitor corpora, one for Croatian and Serbian and the other for ***** Slovene *****, thus also creating new and valuable resources for these languages. | ||
| W17-1410 In this paper we present the adaptations of a state-of-the-art tagger for South Slavic languages to non-standard texts on the example of the ***** Slovene ***** language. | ||
| L06-1072 Both the ***** Slovene ***** and English text islinguistically annotated at the word-level, by context disambiguatedlemmas and morphosyntactic descriptions, which follow the MULTEXTguidelines. | ||
| L08-1257 The paper presents a set of approaches to extend the automatically created ***** Slovene ***** wordnet with nominal multi-word expressions. | ||
| W17-1418 We present results of the first gender classification experiments on ***** Slovene ***** text to our knowledge. | ||
| lexicalized | 63 | |
| 1995.iwpt-1.27 In this paper, we discuss a three-stage approach to disambiguation in the context of a ***** lexicalized ***** grammar, using a variety of domain independent heuristic techniques. | ||
| W89-0235 We take Lexicalized Tree Adjoining Grammars as an in stance of ***** lexicalized ***** grammar. | ||
| D19-1340 For example, the performance of a state-of-the-art RTE model trained on the masked Fake News Challenge (Pomerleau and Rao, 2017) data and evaluated on Fact Extraction and Verification (Thorne et al., 2018) data improved by over 10% in accuracy score compared to the fully ***** lexicalized ***** model. | ||
| W03-3006 some ***** lexicalized ***** grammar G. It differs from previous approaches in several ways: | ||
| L06-1270 In this paper, we present a general method for aligning ontologies, which was used to align a conceptual thesaurus, ***** lexicalized ***** in 20 languages with a partial version of it ***** lexicalized ***** in Romanian | ||
| phrasal | 63 | |
| D18-1411 We model the joint probability of data fields, texts, ***** phrasal ***** spans, and latent annotations with an adapted semi-hidden Markov model, and impose a soft statistical constraint to further improve the performance. | ||
| I17-1067 In order to aid in the evaluation of such systems, we introduce a new phrase-level semantic textual similarity dataset comprised of human activity phrases, providing a testbed for automated systems that analyze relationships between ***** phrasal ***** descriptions of people's actions. | ||
| 2021.naacl-main.234 Naturally-occurring bracketings, such as answer fragments to natural language questions and hyperlinks on webpages, can reflect human syntactic intuition regarding ***** phrasal ***** boundaries. | ||
| L06-1001 It is supplied, in seven separate interval tiers, with an orthographical transcription, detailed part-of-speech tags, simplified part-of-speech tags, a phonological transcription, a broad phonetic transcription, the pitch relation between each stressed and post-tonic syllable, the ***** phrasal ***** intonation, and an empty tier for comments. | ||
| 2002.amta-systems.6 This hybrid, large-scale system is capable of learning all its knowledge of lexical and ***** phrasal ***** translations directly from data | ||
| categorical | 63 | |
| 2020.lrec-1.563 A typical MIR consists of two sections: a structured ***** categorical ***** part and an unstructured text part. | ||
| D19-1034 Based on the detected boundaries, our model utilizes the boundary-relevant regions to predict entity ***** categorical ***** labels, which can decrease computation cost and relieve error propagation problem in layered sequence labeling model. | ||
| L10-1182 Semi-automatic extraction of NPIs is a challenging task since NPIs do not have uniform ***** categorical ***** or other syntactic properties that could be used for detecting them; they occur as single words or as multi-word expressions of almost any syntactic category. | ||
| N19-1071 Continuous relaxations enable us to sample from ***** categorical ***** distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. | ||
| P19-1475 The ***** categorical ***** nature of these tasks has led to common use of a cross entropy log-loss objective during training | ||
| paradigm | 63 | |
| L14-1237 To test the hypothesis that this technology is close to applicability, and to provide a testbed for reducing any accuracy gaps, we have developed an evaluation ***** paradigm ***** for historical record handwriting recognition. | ||
| L12-1230 A formal model of the evaluation ***** paradigm ***** will be useful for comparing evaluations protocols, investigating evaluation constraint relaxation and getting a better understanding of the evaluation ***** paradigm *****, provided it is general enough to be able to represent any natural language processing task. | ||
| C18-1186 With proposed distant supervision ***** paradigm *****, the learned response ranking model makes use of the knowledge in the QA pairs and the corresponding retrieved review lists. | ||
| P18-3015 We investigate a new training ***** paradigm ***** for extractive summarization. | ||
| 2020.aacl-main.70 Moreover, the framework is trained with a specifically designed weak supervision ***** paradigm ***** making use of available answers in the training phase | ||
| pairs | 63 | |
| W17-4606 We propose an E2E model based on pointer networks, which can be trained directly on ***** pairs ***** of raw input and output text. | ||
| 2020.readi-1.1 We have collected word ***** pairs ***** from seven different categories, chosen for their homophonous properties, along with sentence examples and frequency information from said ***** pairs *****. | ||
| R19-1140 Although the existing models achieve high performance on ***** pairs ***** of morphologically simple languages, they perform very poorly on morphologically rich languages such as Turkish and Finnish. | ||
| W17-2805 We evaluate our approach against state-of-the-art supervised and unsupervised grounding and grammar induction systems, and show that a robot can learn to execute never seen-before commands from ***** pairs ***** of unlabelled linguistic and visual inputs. | ||
| C18-1247 Experiments performed on evaluating correlation between emotion ***** pairs ***** offer interesting insights into the relationship between them | ||
| text corpora | 63 | |
| 2020.coling-main.579 Large ***** text corpora ***** are increasingly important for a wide variety of Natural Language Processing (NLP) tasks, and automatic language identification (LangID) is a core technology needed to collect such datasets in a multilingual context. | ||
| W17-8105 The E-platform integrates: 1/ an environment for creating, organizing and maintaining electronic text archives, for extracting ***** text corpora ***** and aligning corpora; 2/ a linguistic database; 3/ a concordancer; 4/ a set of modules for the generation and editing of practice exercises for each text or corpus; 5/ functionalities for export from the platform and import to other educational platforms. | ||
| P19-1132 In practice, multiple passes are computationally expensive and this makes difficult to scale to longer paragraphs and larger ***** text corpora *****. | ||
| L12-1394 In its current state of development, KnowPipe provides facilities for preprocessing Russian and German ***** text corpora *****, for pattern-based knowledge-rich context extraction from these corpora using shallow analysis as well as tools for ranking Russian context candidates. | ||
| 2021.acl-long.124 Models pre-trained on large-scale regular ***** text corpora ***** often do not work well for user-generated data where the language styles differ significantly from the mainstream text. | ||
| support vector machine | 63 | |
| L14-1344 This tool applies fingerprinting to different acoustic features extracted from the audio signal in order to remove perceptual irrelevancies, and a ***** support vector machine ***** is trained for classifying these fingerprints in classes music and no-music. | ||
| K19-1062 The model consists of 1) a recurrent neural network (RNN) to learn scoring functions for pair-wise relations, and 2) a structured ***** support vector machine ***** (SSVM) to make joint predictions. | ||
| S17-2141 Since two submissions were allowed, two different machine learning methods were developed to solve this task, a ***** support vector machine ***** approach and a recurrent neural network approach. | ||
| 2021.sustainlp-1.1 The structure of our convex program is such that standard ***** support vector machine ***** software packages, which are numerically robust and efficient, can solve it. | ||
| 2008.amta-papers.4 We construct a discriminative, syntactic language model (LM) by using a latent ***** support vector machine ***** (SVM) to train an unlexicalized parser to judge sentences. | ||
| linguistic knowledge | 63 | |
| 2021.deelio-1.1 While some of these patterns confirm the conventional prior ***** linguistic knowledge *****, the rest are relatively unexpected, which may provide new insights. | ||
| 2021.sigmorphon-1.22 Traditionally, character-level transduction problems have been solved with finite-state models designed to encode structural and ***** linguistic knowledge ***** of the underlying process, whereas recent approaches rely on the power and flexibility of sequence-to-sequence models with attention. | ||
| W18-5040 The research described in this paper examines how to learn ***** linguistic knowledge ***** associated with discourse relations from unlabeled corpora. | ||
| 2010.amta-papers.35 Distributional paraphrasing has wider applicability, but doesn't benefit from any ***** linguistic knowledge *****. | ||
| 2021.acl-long.326 The graph network injects structural psycho***** linguistic knowledge ***** in LIWC, a computerized instrument for psycholinguistic analysis, by constructing a heterogeneous tripartite graph. | ||
| word order | 63 | |
| 1998.amta-papers.33 The approach is based on pattern matching, morphological rules, and ***** word order ***** inversion. | ||
| 2020.udw-1.4 We use Universal Dependencies treebanks to test whether a well-known typological trade-off between ***** word order ***** freedom and richness of morphological marking of core arguments holds within individual languages. | ||
| 2021.ranlp-srw.22 To this end, we consider an evolutionary model of language and demonstrate, both theoretically and using genetic algorithms, that a language with a fixed ***** word order ***** is optimal. | ||
| 2020.acl-main.47 We examine a methodology using neural language models (LMs) for analyzing the ***** word order ***** of language. | ||
| L12-1595 We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the source text according to the target ***** word order ***** suggested by an initial word alignment. | ||
| Neural Machine Translation (NMT) | 63 | |
| 2020.ngt-1.4 ***** Neural Machine Translation (NMT) ***** is resource-intensive. | ||
| D18-1511 In ***** Neural Machine Translation (NMT) *****, the decoder can capture the features of the entire prediction history with neural connections and representations. | ||
| P19-1019 While machine translation has traditionally relied on large amounts of parallel corpora, a recent research line has managed to train both ***** Neural Machine Translation (NMT) ***** and Statistical Machine Translation (SMT) systems using monolingual corpora only. | ||
| D18-1036 One of the weaknesses of ***** Neural Machine Translation (NMT) ***** is in handling low-frequency and ambiguous words, which we refer to as troublesome words. | ||
| 2020.findings-emnlp.319 To improve the performance of ***** Neural Machine Translation (NMT) ***** for low-resource languages (LRL), one effective strategy is to leverage parallel data from a related high-resource language (HRL). | ||
| polarities | 62 | |
| 2021.naacl-main.167 The sentiment ***** polarities ***** underlying user reviews are of great value for business intelligence. | ||
| P19-1051 Open-domain targeted sentiment analysis aims to detect opinion targets along with their sentiment ***** polarities ***** from a sentence. | ||
| D19-1551 Aspect-level sentiment classification is a crucial task for sentiment analysis, which aims to identify the sentiment ***** polarities ***** of specific targets in their context. | ||
| L12-1370 The annotated corpus has been applied to learn relation extraction rules for extraction of opinion holders, opinion content and classification of ***** polarities *****. | ||
| 2020.findings-emnlp.6 The ***** polarities ***** sequence is designed to depend on the generated aspect terms labels | ||
| DBpedia | 62 | |
| 2018.gwc-1.8 As such, this resource aims to provide a gold standard for link discovery, while also allowing PWN to distinguish itself from other resources such as ***** DBpedia ***** or BabelNet. | ||
| L12-1323 In this paper, we describe the general ***** DBpedia ***** knowledge base and as well as the ***** DBpedia ***** data sets that specifically aim at supporting computational linguistics tasks. | ||
| L14-1587 The goal is to reconcile information provided by language specific ***** DBpedia ***** chapters to obtain a consistent results set. | ||
| P19-1377 Using ***** DBpedia ***** knowledge graph as a proxy to long-term memory, mentioned concepts become activated and trigger further activation as the text is sequentially traversed. | ||
| L16-1532 This paper introduces the ***** DBpedia ***** Abstract Corpus, a large-scale, open corpus of annotated Wikipedia texts in six languages, featuring over 11 million texts and over 97 million entity links | ||
| morpheme | 62 | |
| D19-1150 Korean morphological analysis has been considered as a sequence of ***** morpheme ***** processing and POS tagging. | ||
| L10-1111 This framework poses a previously unexplored problem, online unknown ***** morpheme ***** detection. | ||
| S17-1001 We find that the most reliably solvable analogy categories involve either 1) the application of a ***** morpheme ***** with clear syntactic effects, 2) male–female alternations, or 3) named entities. | ||
| 1963.earlymt-1.5 The word-meanings of this type of ***** morpheme *****, thus, must be carefully distinguished from the sentence-meanings that configuration of these ***** morpheme *****s produce. | ||
| P17-1051 We present in this paper a novel framework for ***** morpheme ***** segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. | ||
| deterministic | 62 | |
| 2021.tacl-1.3 We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi- reference training and evaluation of non- ***** deterministic ***** agents. | ||
| 2021.acl-short.59 We make an OntoNotes-like coreference dataset called OntoGUM publicly available, converted from GUM, an English corpus covering 12 genres, using ***** deterministic ***** rules, which we evaluate. | ||
| D17-1197 Experiments on three representative applications show our model variants outperform models based on ***** deterministic ***** attention and standard language modeling baselines. | ||
| D17-1210 More specifically, we design an actor that observes and manipulates the hidden state of the neural machine translation decoder and propose to train it using a variant of ***** deterministic ***** policy gradient. | ||
| P19-1298 Neural machine translation (NMT) takes ***** deterministic ***** sequences for source representations | ||
| interlingua | 62 | |
| 1997.mtsummit-workshop.8 The main characteristics of the ***** interlingua ***** are as follows: (1) Conceptual primitives, elements of the ***** interlingua *****, can be linked to any parts of speech in English or Japanese. | ||
| 2004.amta-papers.26 In this paper, we describe the creation of an ***** interlingua ***** and the development of a corpus of semantically annotated text, to be validated in six languages and evaluated in several ways. | ||
| L08-1545 In this paper, we report our work on the creation of a number of lexical resources that are crucial for an ***** interlingua ***** based MT from English to other languages. | ||
| L06-1383 The paper presents advances in the use of semantic features and ***** interlingua ***** relations for word sense disambiguation (WSD) as part of unification-based deep processing grammars. | ||
| 1998.amta-papers.3 The MT engine of the JANUS speech-to-speech translation system is designed around four main principles: 1) an ***** interlingua ***** approach that allows the efficient addition of new languages, 2) the use of semantic grammars that yield low cost high quality translations for limited domains, 3) modular grammars that support easy expansion into new domains, and 4) efficient integration of multiple grammars using multi-domain parse lattices and domain re-scoring. | ||
| optimization | 62 | |
| 2021.emnlp-main.621 Furthermore, we relieve the non-stationary problem caused by the changing dynamics of the environment as evolving of agents' policies by introducing a joint ***** optimization ***** process that makes agents can exchange their policy information. | ||
| W17-5509 Reinforcement learning is widely used for dialogue policy ***** optimization ***** where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. | ||
| 2021.conll-1.34 As proof of concept, we first provide a small step-by-step example transforming a naive grammar over unsegmented words into a linguistically motivated grammar over morphemes, and then discuss a description of the English auxiliary system, passives, and raising verbs produced by a prototype implementation of a procedure for automated grammar ***** optimization *****. | ||
| L12-1415 MaltParser offers a wide range of parameters for ***** optimization *****, including nine different parsing algorithms, two different machine learning libraries (each with a number of different learners), and an expressive specification language that can be used to define arbitrarily rich feature models. | ||
| 2021.alvr-1.2 We evaluate our method using a well-trained multi-modalities stylish caption generation model and find those causal inferences that could provide us the insights for next step ***** optimization ***** | ||
| PoS | 62 | |
| L16-1675 We present FlexTag, a highly flexible ***** PoS ***** tagging framework. | ||
| S17-2013 Most of these features were calculated using word embedding vectors similarity to align Part of Speech (***** PoS *****) and Named Entities (NE) tagged tokens of each sentence pair. | ||
| L06-1188 Detailed insight on tagging, F-measure for all ***** PoS ***** categories is provided in the course of the paper along with other facts of interest. | ||
| 2020.globalex-1.13 The Siamese LSTM with attention and ***** PoS ***** tagging (LSTM-A) performed better than the other two systems, achieving a 5-Class Accuracy score of 0.844 in the Overall Results, ranking the first position among five teams | ||
| C16-1032 We propose a new approach to ***** PoS ***** tagging where in a first step, we assign a coarse-grained tag corresponding to the main syntactic category. | ||
| crosslingual | 62 | |
| K19-1015 However, the cross-lingual correspondence between sentences and words is less studied, despite that this correspondence can significantly benefit many applications such as ***** crosslingual ***** semantic search and textual inference. | ||
| C18-1220 Multilingual topic models enable ***** crosslingual ***** tasks by extracting consistent topics from multilingual corpora. | ||
| 2005.mtsummit-swtmt.1 In the talk we will demonstrate the utilization of ontological knowledge indifferent ***** crosslingual ***** applications reaching from ***** crosslingual ***** document retrieval via ***** crosslingual ***** question answering to complex information services involving several ***** crosslingual ***** functionalities, including machine translation. | ||
| 2020.cl-1.3 Probabilistic topic modeling is a common first step in ***** crosslingual ***** tasks to enable knowledge transfer and extract multilingual features. | ||
| 2021.law-1.4 The aspect labels are designed specifically for Uniform Meaning Representations (UMR), an annotation schema that aims to encode ***** crosslingual ***** semantic information | ||
| derivational | 62 | |
| L12-1555 The paper presents construction of \emphDerywator – a language tool for the recognition of Polish ***** derivational ***** relations. | ||
| L14-1057 The structure of CroDeriV enables the detection of verbal ***** derivational ***** families in Croatian as well as the distribution and frequency of particular affixes and lexical morphemes. | ||
| E17-2019 In this paper we propose a new task of predicting the ***** derivational ***** form of a given base-form lemma that is appropriate for a given context. | ||
| 2018.gwc-1.16 When ***** derivational ***** relations deficiency exists in a wordnet, such as the Arabic WordNet, it makes it very difficult to exploit in the natural language processing community | ||
| 2020.lrec-1.485 Russian morphology has been studied for decades, but there is still no large high-coverage resource that contains the ***** derivational ***** families (groups of words that share the same root) of Russian words. | ||
| Artificial | 62 | |
| 2010.jeptalnrecital-court.29 In ***** Artificial ***** Intelligence, analogy is used as a non exact reasoning technique to solve problems, for natural language processing, for learning classification rules, etc. | ||
| W17-3502 Poetry generation is becoming popular among researchers of Natural Language Generation, Computational Creativity and, broadly, ***** Artificial ***** Intelligence. | ||
| J74-2001 Personal Notes; Computational Semantics Tutorial at Lugano in March; ***** Artificial ***** Intelligence: Directory Being Compiled (Donald E. Walker); Letters: Logos Development Corporation on MT System (Yorick Wilks); Solar Project Distributes Materials (Tim Diller; John Olney; Nathan Ucuzoglu); NAS/NRC Studies International Information Programs; NFAIS Meeting, Overlap Study, Indexer Training Kit (Ben H. Weil); | ||
| 2021.ltedi-1.1 This study sheds light on the effects of COVID-19 in the particular field of Computational Linguistics and Natural Language Processing within ***** Artificial ***** Intelligence | ||
| K18-1048 Building systems that can communicate with humans is a core problem in ***** Artificial ***** Intelligence. | ||
| error | 62 | |
| 2008.amta-srw.5 All these techniques are compared through word ***** error ***** rate and diacritization ***** error ***** rate both in terms of full diacritization and ignoring vowel endings. | ||
| 2020.findings-emnlp.14 This paper reflects on the meaningfulness of the speed reading task, showing that (a) better and faster approaches to, say, document classification, already exist, which also learn to ignore part of the input (I give an example with 7% ***** error ***** reduction and a 136x speed-up over the state of the art in neural speed reading); and that (b) any claims that neural speed reading is “human-inspired”, are ill-founded. | ||
| D17-1297 Our approach achieves state-of-the-art results on ***** error ***** correction for three different datasets, and it has the additional advantage of only using a small set of easily computed features that require no linguistic input. | ||
| S17-2104 The obtained results show that user-specific classifiers trained on tweets from user timeline can introduce noise as they are ***** error ***** prone because they are classified by an imperfect system. | ||
| 2021.eacl-main.160 Using our evaluation experiments, we show that the total number of annotators can have a strong impact on study power and that current statistical analysis methods can inflate type I ***** error ***** rates up to eight-fold | ||
| text categorization | 62 | |
| W19-4805 Self-explaining ***** text categorization ***** requires a classifier to make a prediction along with supporting evidence. | ||
| W17-3005 Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive ***** text categorization ***** strategies by 4-11%. | ||
| 2020.semeval-1.151 For the proposed system construction, we used different strategies, and the best ones were based on deep neural networks and a ***** text categorization ***** algorithm. | ||
| P17-1052 This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for ***** text categorization ***** that can efficiently represent long-range associations in text. | ||
| 2020.aespen-1.5 We cast the problem of event annotation as one of ***** text categorization *****, and compare state of the art ***** text categorization ***** techniques on event data produced within the Uppsala Conflict Data Program (UCDP). | ||
| author | 62 | |
| 2021.eval4nlp-1.18 Authorship attribution is the task of assigning an unknown document to an ***** author ***** from a set of candidates. | ||
| W17-3106 We then illustrate the future possibility of this work with an example of an exposure scenario ***** author *****ed with our application. | ||
| W17-4913 In ***** author *****ship attribution, many different approaches have successfully resolved this issue at the cost of linguistic interpretability: The resulting algorithms may be able to distinguish one language variety from the other, but do not give us much information on their distinctive linguistic properties. | ||
| 2020.starsem-1.19 Obfuscation can, however, be thought of as the construction of adversarial examples to attack ***** author ***** identification, suggesting that the deep learning architectures used for adversarial attacks could have application here. | ||
| 2021.emnlp-main.25 Furthermore, we provide a description of idiolects through measuring inter- and intra-***** author ***** variation, showing that variation in idiolects is often distinctive yet consistent. | ||
| human language | 62 | |
| D19-1123 This can significantly improve the learning of style and variation in ***** human language *****. | ||
| 2021.acl-srw.6 ZLA is a well-known tendency in ***** human language *****s where the more frequently a word is used, the shorter it will be. | ||
| Q17-1003 Prediction also affects perception and might be a key to robustness in ***** human language ***** processing. | ||
| 2020.emnlp-main.143 The ***** human language ***** can be expressed through multiple sources of information known as modalities, including tones of voice, facial gestures, and spoken language. | ||
| P17-5005 Our target audience are researchers and practitioners in machine learning, parsing (syntactic and semantic) and language technology, not necessarily experts in MWEs, who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in ***** human language ***** and communication. | ||
| real world | 62 | |
| L12-1351 In addition to being a rich source of language, direct quotations from business leaders can have "***** real world *****" consequences. | ||
| 2021.naacl-industry.38 Secondly, even with large training data, the intent detection models can see a different distribution of test data when being deployed in the ***** real world *****, leading to poor accuracy. | ||
| 2010.amta-commercial.14 Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a ***** real world ***** use case scenario in which PLuTO MT services are exploited. | ||
| L14-1240 This corpus is one of the first lexical resources focusing on ***** real world ***** applications that analyze the voice of the customer which is crucial for various industrial use cases. | ||
| 2021.woah-1.4 This paper thus serves as a reality-check for the current benchmark of hateful meme detection and its applicability for detecting ***** real world ***** hate. | ||
| network | 62 | |
| P19-1516 In this paper, we propose a neural ***** network ***** inspired multi- task learning framework that can simultaneously extract ADRs from various sources. | ||
| S19-1018 In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as ***** network *****s rather than functions, and information state update rules for conditionals. | ||
| W19-5409 More specifically, one of the proposed approaches employs the translation knowledge between the two languages from two different translation directions; while the other one employs extra monolingual knowledge from both source and target sides, obtained by pre-training deep self-attention ***** network *****s. | ||
| W18-6230 This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural ***** network *****s. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph neural ***** network *****s to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| product | 62 | |
| W17-7902 Market pressure on translation ***** product *****ivity joined with technological innovation is likely to fragment and decontextualise translation jobs even more than is currently the case. | ||
| L16-1004 While Edit Distance as such does not express cognitive effort or time spent editing machine translation suggestions, we found that it correlates strongly with the ***** product *****ivity tests we performed, for various language pairs and domains. | ||
| D17-1142 We observe that the evidence-conclusion discourse relations, also known as arguments, often appear in ***** product ***** reviews, and we hypothesise that some argument-based features, e.g. | ||
| D18-1403 We present a neural framework for opinion summarization from online ***** product ***** reviews which is knowledge-lean and only requires light supervision (e.g., in the form of ***** product ***** domain labels and user-provided ratings). | ||
| L16-1623 We can often detect from a person's utterances whether he/she is in favor of or against a given target entity (a ***** product *****, topic, another person, etc.). | ||
| curriculum learning | 62 | |
| 2021.acl-long.137 Motivated by the recent finding that models trained with random negative samples are not ideal in real-world scenarios, we propose a hierarchical ***** curriculum learning ***** framework that trains the matching model in an “easy-to-difficult” scheme. | ||
| 2021.wassa-1.13 We find that ***** curriculum learning ***** works best for difficult tasks and may even lead to a decrement in performance for tasks with higher performance without ***** curriculum learning *****. | ||
| 2020.findings-emnlp.48 We show that models using learned difficulty and/or ability outperform heuristic-based ***** curriculum learning ***** models on the GLUE classification tasks. | ||
| N19-1119 In this paper, we propose a ***** curriculum learning ***** framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance. | ||
| 2021.eacl-main.119 We propose an effective method of ***** curriculum learning ***** to train summarization models from such noisy data. | ||
| sentence fusion | 62 | |
| 2020.inlg-1.9 Furthermore, we show that our formulation of data-to-text generation opens up the possibility for zero-shot domain adaptation using a general-domain dataset for ***** sentence fusion *****. | ||
| 2020.emnlp-main.338 Our findings highlight the importance of modeling points of correspondence between sentences for effective ***** sentence fusion *****. | ||
| 2020.emnlp-main.699 Our experiments on ***** sentence fusion ***** and sentiment transfer demonstrate that Masker performs competitively in a fully unsupervised setting. | ||
| 2020.acl-srw.26 It is publicly shared to serve as a basis for future work to measure the success of ***** sentence fusion ***** systems. | ||
| P19-1209 ***** Sentence fusion ***** assumes multi-sentence input; yet sentence selection methods only work with single sentences and not combinations of them. | ||
| Wikidata | 61 | |
| N18-2101 To this end, we propose a neural network architecture equipped with copy actions that learns to generate single-sentence and comprehensible textual summaries from ***** Wikidata ***** triples. | ||
| 2021.emnlp-main.302 Specifically, at the entity level, we replace target entities with other entities of the same semantic class in ***** Wikidata *****; at the context level, we use pre-trained language models (e.g., BERT) to generate word substitutions. | ||
| 2021.bsnlp-1.14 The latter refers to Wikipedia and its structured counterpart - ***** Wikidata *****, our source of lemmatization rules, and real-world entities. | ||
| W18-5211 We explore the idea of using event knowledge about prototypical situations from FrameNet and fact knowledge about concrete entities from ***** Wikidata ***** to solve the task. | ||
| 2020.emnlp-main.459 The experimental results on five datasets sampled from Freebase, NELL and ***** Wikidata ***** show that our method outperforms state-of-the-art baselines | ||
| hierarchies | 61 | |
| L12-1130 For each cut i, the two ***** hierarchies ***** can be seen as two clusterings Ĉ^i_l, Ĉ^i_r of the leaf concepts. | ||
| 2021.acl-long.182 Although probabilistic models can generate topic ***** hierarchies ***** by introducing nonparametric priors like Chinese restaurant process, such methods have data scalability issues. | ||
| J18-2005 Tree ***** hierarchies ***** are learned along with the corresponding morphological paradigms simultaneously. | ||
| I17-4013 This IJCNLP2017-Task2 competition seeks to automatically calculate Valence and Arousal ratings within the ***** hierarchies ***** of vocabulary and phrases in Chinese. | ||
| 2005.mtsummit-papers.22 We make extensive use of underspecification and type ***** hierarchies ***** to maximize generality and precision | ||
| SLT | 61 | |
| 2016.iwslt-1.16 We participated in the machine translation (MT) task as well as the spoken language translation (***** SLT *****) track for English→German and German→English translation. | ||
| 2008.iwslt-papers.5 Automatic spoken language translation (***** SLT *****), as a cost-effective solution to this dilemma, has received increased attention in recent years. | ||
| 2014.iwslt-evaluation.22 The same systems are used for the ***** SLT ***** track, where we additionally perform punctuation prediction on the automatic transcriptions employing hierarchical phrase-based translation. | ||
| 2014.iwslt-evaluation.14 We participated in two of the proposed tasks: (i) the Automatic Speech Recognition task (ASR) in two languages, Italian with the Vecsys company, and English alone, (ii) the English to French Spoken Language Translation task (***** SLT *****). | ||
| 2021.mtsummit-asltrw.3 We present our tools to monitor the pipeline and a web application to present the results of our ***** SLT ***** pipeline to the end users | ||
| MultiWOZ | 61 | |
| 2021.eacl-main.110 ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and ***** MultiWOZ *****. | ||
| 2020.nlp4convai-1.10 Our method achieves near the current state-of-the-art in joint goal accuracy on ***** MultiWOZ ***** 2.1 given full training data. | ||
| 2020.acl-main.637 In our experiments, our proposed model achieves a 7.03% relative improvement over the baseline, establishing a new state-of-the-art joint goal accuracy of 52.04% on the ***** MultiWOZ ***** 2.0 dataset. | ||
| 2021.emnlp-main.176 Experimental results show that KPN effectively alleviates catastrophic forgetting and outperforms previous state-of-the-art lifelong learning methods by 4.25% and 8.27% of whole joint goal accuracy on the ***** MultiWOZ ***** benchmark and the SGD benchmark, respectively. | ||
| 2020.acl-main.53 Our SOM-DST (Selectively Overwriting Memory for Dialogue State Tracking) model achieves state-of-the-art joint goal accuracy with 51.72% in ***** MultiWOZ ***** 2.0 and 53.01% in ***** MultiWOZ ***** 2.1 in an open vocabulary-based DST setting | ||
| EBMT | 61 | |
| 2006.amta-papers.17 In an experiment conducted on test sets extracted from Europarl and the Penn II Treebank we show that our method can raise the BLEU score up to 3.8% relative to the ***** EBMT ***** baseline. | ||
| 1999.mtsummit-1.37 The paper outlines the basic idea underlying the ***** EBMT ***** system and investigates the possibilities and limits of the translation template induction process. | ||
| R17-1085 The obtained results indicate that integrating domain-specific bilingual lexicons of MWEs improves translation quality of the ***** EBMT ***** system when texts to translate are related to the specific domain and induces a relatively slight deterioration of translation quality when translating general-purpose texts. | ||
| 2005.mtsummit-ebmt.2 Example-Based Machine Translation (***** EBMT *****) systems have typically operated on individual sentences without taking into account prior context. | ||
| 2005.mtsummit-ebmt.4 The proposed generalization technique has been implemented as a part of an ***** EBMT ***** system | ||
| hypernymy | 61 | |
| 2020.acl-main.334 Previously, the design of unsupervised ***** hypernymy ***** scores has been extensively studied. | ||
| 2016.gwc-1.57 However, the ***** hypernymy ***** linkages were found to be inadequate in certain cases and posed a challenge due to sense granularity of language. | ||
| L16-1056 We describe the infrastructure we developed to iterate over the web corpus for extracting the ***** hypernymy ***** relations and store them effectively into a large database. | ||
| W17-2339 In this paper, we apply a new method for hierarchical multi-label text classification that initializes a neural network model final hidden layer such that it leverages label co-occurrence relations such as ***** hypernymy *****. | ||
| 2020.mwe-1.3 To this aim, we build a taxonomy of tools with hyponymy and ***** hypernymy ***** relations from the data by decomposing all multi-word expressions of tool names | ||
| Rhetorical Structure | 61 | |
| R19-1043 We apply the ***** Rhetorical Structure ***** Theory to build a discourse tree of an answer and select elementary discourse units that are suitable for indexing. | ||
| L14-1468 We present a revised and extended version of the Potsdam Commentary Corpus, a collection of 175 German newspaper commentaries (op-ed pieces) that has been annotated with syntax trees and three layers of discourse-level information: nominal coreference, connectives and their arguments (similar to the PDTB, Prasad et al. 2008), and trees reflecting discourse structure according to ***** Rhetorical Structure ***** Theory (Mann/Thompson 1988). | ||
| W19-2705 We investigate the relationship between the notion of nuclearity as proposed in ***** Rhetorical Structure ***** Theory (RST) and the signalling of coherence relations. | ||
| W18-4917 This paper aims to present the first open Spanish-Chinese parallel corpus annotated with discourse information, whose theoretical framework is based on the ***** Rhetorical Structure ***** Theory (RST) | ||
| 2020.lrec-1.648 We present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in ***** Rhetorical Structure ***** Theory. | ||
| idioms | 61 | |
| 2021.emnlp-main.821 It is designed to provide a challenging test-bed for Indonesian NLI by explicitly incorporating various linguistic phenomena such as numerical reasoning, structural changes, ***** idioms *****, or temporal and spatial reasoning. | ||
| L16-1368 The survey shows that the light verb constructions either get special annotations as such, or are treated as ordinary verbs, while VP ***** idioms ***** are handled through different strategies. | ||
| P18-5005 This makes ***** idioms ***** and metaphors an important research area for computational and cognitive linguistics, and their automatic identification and interpretation indispensable for any semantics-oriented NLP application. | ||
| 2020.wildre-1.9 This approach combines the rule-based generalization of ***** idioms ***** in English language and classification of statements based on the context to determine the ***** idioms ***** in the sentence. | ||
| 2021.argmining-1.11 We define 17 ***** idioms ***** in total by referring to argumentation schemes as well as analyzing actual arguments and fitting ***** idioms ***** to them | ||
| Abstractive | 61 | |
| W17-1002 ***** Abstractive ***** document summarization seeks to automatically generate a summary for a document, based on some abstract “understanding” of the original document. | ||
| 2020.coling-main.606 ***** Abstractive ***** summarization at controllable lengths is a challenging task in natural language processing. | ||
| P17-1108 ***** Abstractive ***** summarization is the ultimate goal of document summarization research, but previously it was less investigated due to the immaturity of text generation techniques. | ||
| 2020.lrec-1.819 ***** Abstractive ***** summarization typically relies on large collections of paired articles and summaries. | ||
| 2021.acl-long.472 ***** Abstractive ***** summarization for long-document or multi-document remains challenging for the Seq2Seq architecture, as Seq2Seq is not good at analyzing long-distance relations in text. | ||
| synthesis | 61 | |
| L14-1484 Corpus analysis is a powerful tool for signed language ***** synthesis *****. | ||
| 2021.emnlp-main.101 However, a fundamental challenge in building information extraction models for material science ***** synthesis ***** procedures is getting accurate labels for the materials, operations, and other entities of those procedures. | ||
| L12-1003 It is implemented as a client server based framework in Java and interfaces software for speech recognition, ***** synthesis *****, speech classification and quality evaluation. | ||
| W19-2309 To control for aspects such as preserving meaning while modifying style, we propose a reranking approach in the data ***** synthesis ***** phase | ||
| 2020.acl-main.541 Further experimental results using various multimodal ***** synthesis ***** techniques highlight the challenge presented by our dataset, including non-local constraints and multi-modal inputs. | ||
| noisy | 61 | |
| 2021.acl-long.277 However, it also incurs two major problems: ***** noisy ***** labels and imbalanced training data. | ||
| 2020.wmt-1.69 On the other hand, traditional machine translation has a long history of leveraging unlabeled data through ***** noisy ***** channel modeling. | ||
| 2021.naacl-main.269 Experiments in general ***** noisy ***** settings with four languages and distantly labeled settings demonstrate the effectiveness of our method. | ||
| D18-1230 Here we propose two neural models to suit ***** noisy ***** distant supervision from the dictionary. | ||
| D19-5553 We present an approach to correct ***** noisy ***** User Generated Content (UGC) in French aiming to produce a pretreatement pipeline to improve Machine Translation for this kind of non-canonical corpora | ||
| stance | 61 | |
| W89-0235 We take Lexicalized Tree Adjoining Grammars as an instance of lexicalized grammar. | ||
| D19-1657 Recent works show improvements in ***** stance ***** detection by using either the attention mechanism or sentiment information. | ||
| 2021.naacl-main.148 The goal of ***** stance ***** detection is to identify whether the author of a text is in favor of, neutral or against a specific target. | ||
| 2020.acl-main.509 This problem is challenging because differences in ***** stance ***** intensity are often subtle and require nuanced language understanding. | ||
| C18-1286 An important step to analyze the discussions on social media and to assist in healthy decision-making is ***** stance ***** detection | ||
| framework | 61 | |
| P18-1217 To illustrate the flexibility offered by the neural network based ***** framework *****, we present three extensions base on NSTC without re-deduced inference algorithms. | ||
| I17-1073 The rule based ***** framework ***** provides good control to dialog designers at the expense of being more time consuming and laborious. | ||
| 2021.emnlp-main.418 To address this problem, we propose a training ***** framework ***** with certified robustness to eliminate the causes that trigger the generation of profanity. | ||
| K19-2008 The adopted multi-task model also can allow learning for one ***** framework ***** to benefit the others. | ||
| 2020.winlp-1.2 Finally, the performance of ***** framework ***** is evaluated using the annotated Amharic news comments | ||
| bidirectional encoder representations | 61 | |
| 2020.lrec-1.157 In the former approach, we used Japanese-specific linguistic features, including character-type features such as “kanji” and “hiragana.” In the latter approach, we used two models: a long short-term memory (LSTM) model (Hochreiter and Schmidhuber, 1997) and a ***** bidirectional encoder representations ***** from transformers (BERT) model (Devlin et al., 2019), which achieved the highest accuracy in various natural language processing tasks in 2018. | ||
| W19-3206 The systems for the two subtasks are based on ***** bidirectional encoder representations ***** from transformers (BERT), and achieves promising results. | ||
| 2021.rocling-1.22 Due to the development of deep learning, the natural language processing tasks have made great progresses by leveraging the ***** bidirectional encoder representations ***** from Transformers (BERT). | ||
| 2021.smm4h-1.10 For both tasks we used models based on ***** bidirectional encoder representations ***** from transformers (BERT). | ||
| S19-2142 Finally, the fourth subsystem is a ***** bidirectional encoder representations ***** from transformers (BERT) model. | ||
| advertisements | 61 | |
| P19-3008 Employers' low awareness and interest in attracting PhD graduates means that the term “PhD” is rarely used as a keyword in job ***** advertisements *****; 80% of companies looking to employ similar researchers do not specifically ask for a PhD qualification. | ||
| P19-1114 Traffickers exploit their victims by anonymously offering sexual services through online ***** advertisements *****. | ||
| 2021.eacl-main.99 Podcast episodes often contain material extraneous to the main content, such as ***** advertisements *****, interleaved within the audio and the written descriptions. | ||
| N18-3027 Detecting the similarity between job ***** advertisements ***** is important for job recommendation systems as it allows, for example, the application of item-to-item based recommendations. | ||
| 2020.alta-1.8 In this paper, a machine learning-natural language processing (ML-NLP) based approach was used to explore and extract skill requirements from research intensive job ***** advertisements *****, suitable for PhD graduates. | ||
| referring expression generation | 61 | |
| 2020.lrec-1.13 We are releasing this dataset to encourage research in the field of coreference resolution, ***** referring expression generation ***** and identification within realistic, deep dialogs involving multiple domains. | ||
| W19-8645 We suggest four extensions to that framework: (1) we introduce a trainable neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model's ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts; (4) we incorporate a simple but effective ***** referring expression generation ***** module. | ||
| L12-1032 The ontology is used in particular for concept generalizations during ***** referring expression generation *****. | ||
| W18-6540 This task presents two advantages: many of the mechanisms already available for static contexts may be applied with small adaptations, and it introduces the concept of changing conditions into the task of ***** referring expression generation *****. | ||
| 2020.inlg-1.16 A previous approach, called Perceptual Cost Pruning, modeled human QRE production using a preference-based ***** referring expression generation ***** algorithm, first removing facts from the input knowledge base based on a model of perceptual cost. | ||
| speech synthesis | 61 | |
| 2020.lrec-1.818 Recent advances in neural ***** speech synthesis ***** have enabled the development of such systems with a data-driven approach that does not require significant development of language-specific tools. | ||
| L10-1249 The synthetic voices for Viennese varieties, implemented with the open domain unit selection ***** speech synthesis ***** engine Multisyn of Festival will also be released within Festival. | ||
| L12-1136 We have created a synchronous corpus of acoustic and 3D facial marker data from multiple speakers for adaptive audio-visual text-to-***** speech synthesis *****. | ||
| L06-1484 Speech technology applications, such as speech recognition, ***** speech synthesis *****, and speech dialog systems, often require corpora based on highly customized specifications. | ||
| L10-1421 The present paper outlines the Vergina speech database, which was developed in support of research and development of corpus-based unit selection and statistical parametric ***** speech synthesis ***** systems for Modern Greek language. | ||
| morphologically rich | 61 | |
| C16-1132 Evaluation of machine translation (MT) into ***** morphologically rich ***** languages (MRL) has not been well studied despite posing many challenges. | ||
| 2021.eacl-main.158 The present article discusses how to improve translation quality when using limited training data to translate towards ***** morphologically rich ***** languages. | ||
| 2020.coling-main.409 Kinyarwanda, a ***** morphologically rich ***** language, currently lacks tools for automated morphological analysis. | ||
| L10-1262 Arabic is a ***** morphologically rich ***** language, which presents a challenge for part of speech tagging. | ||
| U19-1001 Finite State Transducers have long been the go-to method for modeling ***** morphologically rich ***** languages, and in this paper we discuss some of the distinct modeling challenges present in the morphosyntax of verbs in Kunwinjku. | ||
| computer | 61 | |
| L10-1105 We describe our ***** computer *****-supported framework to overcome the rule of metadata schism. | ||
| 2020.computerm-1.12 The results show a lot of variation between different systems and illustrate how some methodologies reach higher precision or recall, how different systems extract different types of terms, how some are exceptionally good at finding rare terms, or are less impacted by term length. | ||
| 2005.mtsummit-posters.1 This paper presents TTPlayer, a trace file analysis tool used to develop TransType, an innovative ***** computer *****-aided translation system. | ||
| 2020.lrec-1.33 For instance, in “The FBI alleged in court documents that Zazi had admitted having a handwritten recipe for explosives on his ***** computer *****”, do people believe that Zazi had a handwritten recipe for explosives? | ||
| 2003.mtsummit-papers.40 The goal of the AMETRA project is to make a ***** computer *****-assisted translation tool from the Spanish language to the Basque language under the memory-based translation framework. | ||
| legal | 61 | |
| D17-2003 Case studies tend to be used in ***** legal *****, business, and health education contexts, but less in the teaching and learning of linguistics. | ||
| 2021.nllp-1.19 German court rulings contain much structural information, so we create a pre-processing pipeline tailored explicitly to the German ***** legal ***** domain. | ||
| 2020.findings-emnlp.380 In addition to the dataset and reference results, LMs specialized in the ***** legal ***** domain were made publicly available. | ||
| 2008.amta-govandcom.11 We show that although the language used in this type of ***** legal ***** text is complex and specialized, an SMT system can produce intelligible and useful translations, provided that the system can be trained on a vast amount of ***** legal ***** text. | ||
| 2020.emnlp-tutorials.6 Simultaneous translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, ***** legal ***** proceedings, and medicine. | ||
| bootstrap | 60 | |
| 2021.naacl-industry.7 The method is applied to successfully ***** bootstrap ***** a slot tagging system for a major music streaming service that currently serves several tens of thousands of daily voice queries. | ||
| L06-1288 In this paper, we investigate whether it is possible to ***** bootstrap ***** a named entity tagger for textual databases by exploiting the database structure to automatically generate domain and database-specific gazetteer lists. | ||
| 2020.lrec-1.839 The key idea is to ***** bootstrap ***** from a small set of argument components automatically identified using simple heuristics in combination with reliable contextual cues. | ||
| L14-1691 Thirdly, we propose a statistical ***** bootstrap ***** approach for the identification and disambiguation of RDF-based predicates using a machine learning-based classifier. | ||
| K18-1034 So, we ***** bootstrap ***** a dataset of 430 images, scanned in two different settings and their corresponding ground truth | ||
| Participants | 60 | |
| P16-5008 ***** Participants ***** will gain a complete understanding of the theoretical basis and the practical workings of MetaNet, and acquire relevant information about the Frame Semantics basis of that knowledge base and the way that FrameNet handles the widespread phenomenon of metaphor in language. | ||
| W17-0704 ***** Participants *****' behaviour was accurately mimicked by a classifier which was trained on more cases from the base dialect and fewer from the target dialect. | ||
| S18-1108 In this paper, we describe the participation of the NewsReader system in the SemEval-2018 Task 5 on Counting Events and ***** Participants ***** in the Long Tail. | ||
| W19-5301 ***** Participants ***** were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories | ||
| W19-7601 It will lay out the timeline, process and mechanisms used to customise Neural MT models and how these were used in conjunction with human-based evaluations to determine which approach to Neural MT provided the best translation outcomes. The tutorial will cover the following topics and methods: structural differences in Neural Networks and how they assist the language decoding process (RNN, CNN and TNN will be covered in detail); customisation of Neural MT using the KantanMT Platform; using the MQM framework for the evaluation and comparison of translation outputs and comparison to human translation; and collation and analysis of experimental findings in reaching our decision to standardise on Transformer-type networks. ***** Participants ***** of the tutorial will get a clear understanding of Neural Model types and their differences; it will also cover how to customise these models and then how to set up a controlled experiment to determine translation performance. | ||
| bottleneck | 60 | |
| C16-1093 In particular, we (1) investigate the role of context in accurately capturing semantic anomalies and implement a system based on distributional topic coherence, which achieves state-of-the-art accuracy on a standard test set; (2) thoroughly investigate our system's performance across individual adjective classes, concluding that a class-dependent approach is beneficial to the task; (3) discuss the data size ***** bottleneck ***** in this area, and highlight the challenges of automatic error generation for content words. | ||
| P18-2049 Neural machine translation (NMT) models are typically trained with fixed-size input and output vocabularies, which creates an important ***** bottleneck ***** on their accuracy and generalization capability. | ||
| W16-4914 This is a necessary step to provide accurate coaching on how to correct ungrammatical input, and it will allow us to overcome a current ***** bottleneck ***** in the field — an exponential burst of ambiguity caused by ambiguous lexical items (Flickinger, 2010). | ||
| 2020.emnlp-main.165 Though federated learning has distinct advantages in privacy protection, it suffers from the communication ***** bottleneck *****, which is mainly caused by the need to upload cumbersome local parameters. | ||
| N19-1340 Semantic role labeling (SRL) is a task to recognize all the predicate-argument pairs of a sentence, which has been in a performance improvement ***** bottleneck ***** after a series of latest works were presented | ||
| Recurrent | 60 | |
| 2020.emnlp-main.156 ***** Recurrent ***** neural networks empirically generate natural language with high syntactic fidelity. | ||
| W18-0607 In this paper, we apply a hierarchical ***** Recurrent ***** neural network (RNN) architecture with an attention mechanism on social media data related to mental health. | ||
| 2016.iwslt-1.2 ***** Recurrent ***** language models, in particular, have been a great success due to their ability to model arbitrary long context. | ||
| K19-1031 Although some approaches such as the attention mechanism have partially remedied the problem, we found that the current standard NMT model, Transformer, has difficulty in translating long sentences compared to the former standard, ***** Recurrent ***** Neural Network (RNN)-based model. | ||
| P19-1149 ***** Recurrent ***** networks have achieved great success on various sequential tasks with the assistance of complex recurrent units, but suffer from severe computational inefficiency due to weak parallelization. | ||
| semantic parser | 60 | |
| 2020.acl-main.608 The downstream naive ***** semantic parser ***** accepts the intermediate output and returns the target logical form. | ||
| P19-1473 To overcome this, we propose a novel framework to build a unified multi-domain enabled ***** semantic parser ***** trained only with weak supervision (denotations). | ||
| D19-1543 These table-related tokens are troublesome for the downstream neural ***** semantic parser ***** because it brings complex semantics and hinders the sharing across the training examples. | ||
| 2021.emnlp-main.310 We conduct extensive experiments to study the research problems involved in continual semantic parsing and demonstrate that a neural ***** semantic parser ***** trained with TotalRecall achieves superior performance than the one trained directly with the SOTA continual learning algorithms and achieve a 3-6 times speedup compared to re-training from scratch. | ||
| I17-2021 In this work, we present a method for re-ranking black-box ASR hypotheses using an in-domain language model and ***** semantic parser ***** trained for a particular task | ||
| semantic compositionality | 60 | |
| L16-1732 Building a knowledge graph for representing common-sense knowledge in which concepts discerned from noun phrases are cast as vertices and lexicalized relations are cast as edges leads to learning the embeddings of common-sense knowledge accounting for ***** semantic compositionality ***** as well as implied knowledge. | ||
| 2020.mwe-1.12 This paper explores the use of word2vec and GloVe embeddings for unsupervised measurement of the ***** semantic compositionality ***** of MWE candidates. | ||
| D17-1124 Our models are grounded in the literal-first psycholinguistic hypothesis, which can adaptively learn ***** semantic compositionality ***** of a phrase literally or idiomatically. | ||
| L16-1194 This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates ***** semantic compositionality ***** scores for multiword expressions (MWEs) based on word embeddings | ||
| 2020.emnlp-main.651 By formulating DST as a semantic parsing task over hierarchical representations, we can incorporate ***** semantic compositionality *****, cross-domain knowledge sharing and co-reference. | ||
| expressions | 60 | |
| L10-1064 In Knowledge Management, variations in information ***** expressions ***** have proven a real challenge. | ||
| L12-1537 In our framework, contextual occurrence information of much fewer canonical ***** expressions ***** are expanded into the whole forms of derived ***** expressions *****, to be utilized when identifying those derived ***** expressions *****. | ||
| 2020.findings-emnlp.420 However, few models consider the fusion of linguistic features with multiple visual features with different sizes of receptive fields, though the proper size of the receptive field of visual features intuitively varies depending on ***** expressions *****. | ||
| L12-1613 We present an approach to the description of Polish Multi-word Expressions (MWEs) which is based on ***** expressions ***** in the WCCL language of morpho-syntactic constraints instead of grammar rules or transducers. | ||
| 2021.codi-main.6 However, text comprehension can be difficult when referring ***** expressions ***** are non-verbalized and have to be resolved in the discourse context | ||
| dialogue generation | 60 | |
| 2020.acl-main.516 The persona-based ***** dialogue generation ***** task is thus introduced to tackle the personality-inconsistent problem by incorporating explicit persona text into ***** dialogue generation ***** models. | ||
| 2020.findings-emnlp.179 Unlike prior ***** dialogue generation ***** efforts, we treat each seller's historical dialogues as a list of Customer-Seller utterance pairs and allow the model to measure their different importance, and copy words directly from most relevant pairs. | ||
| 2020.acl-main.515 We collect and build a large-scale Chinese dataset aligned with the commonsense knowledge for ***** dialogue generation *****. | ||
| P19-1538 We present open domain ***** dialogue generation ***** with meta-words. | ||
| 2021.acl-long.11 Experiments on the multi-reference Reddit Dataset and DailyDialog Dataset demonstrate that our DialoFlow significantly outperforms the DialoGPT on the ***** dialogue generation ***** task. | ||
| translation memory | 60 | |
| 1997.mtsummit-papers.5 It first discusses the principles which are relevant for the definition of such interfaces; it then presents a state of the art and a proposal in the area of text interfaces, ***** translation memory ***** interfaces, and terminology exchange. | ||
| 2016.amta-researchers.3 Computer-aided translation (CAT) tools often use a ***** translation memory ***** (TM) as the key resource to assist translators. | ||
| 2012.amta-commercial.11 The ***** translation memory ***** was integrated into the statistical search using two novel features. | ||
| W17-7906 To this end, we compiled a small ***** translation memory ***** (English-Spanish) and applied several lexical and syntactic transformation rules to the source sentences with both English and Spanish being the source language. | ||
| 2002.amta-papers.8 One of the limitations of ***** translation memory ***** systems is that the smallest translation units currently accessible are aligned sentential pairs. | ||
| foreign language | 60 | |
| 2020.wat-1.4 However, it is not practical to translate them all manually into a new ***** foreign language *****. | ||
| W19-9006 Focus being not just on ***** foreign language ***** tuition, but above all on people, places and events in the history and culture of the EU member states, the annotation modules of the e-Platform have been accordingly extended. | ||
| K17-1025 We present a feature-rich knowledge tracing method that captures a student's acquisition and retention of knowledge during a ***** foreign language ***** phrase learning task. | ||
| 2003.mtsummit-systems.10 In response to growing needs for cross-lingual patent retrieval, we propose PRIME (Patent Retrieval In Multilingual Environment system), in which users can retrieve and browse patents in ***** foreign language *****s only by their native language. | ||
| 2021.ranlp-1.66 Feature engineering is an important step in classical NLP pipelines, but machine learning engineers may not be aware of the signals to look for when processing *****foreign language***** text. | ||
| large scale | 60 | |
| N19-1302 However, for multi-hop QA tasks, which require reasoning with multiple sentences, it remains unclear how best to utilize entailment models pre-trained on ***** large scale ***** datasets such as SNLI, which are based on sentence pairs. | ||
| P18-1009 This formulation allows us to use a new type of distant supervision at ***** large scale *****: head words, which indicate the type of the noun phrases they appear in. | ||
| D19-1197 Since the paired data now is no longer enough to train a neural generation model, we consider leveraging the ***** large scale ***** of unpaired data that are much easier to obtain, and propose response generation with both paired and unpaired data. | ||
| 2020.findings-emnlp.145 Advances in machine reading comprehension (MRC) rely heavily on the collection of ***** large scale ***** human-annotated examples in the form of (question, paragraph, answer) triples. | ||
| P18-1208 From a resource perspective, there is a genuine need for ***** large scale ***** datasets that allow for in-depth studies of this form of language. | ||
| dialogs | 59 | |
| 2021.conll-1.1 We conclude that creating shared mental models between users and AI systems is important to achieving successful ***** dialogs *****. | ||
| L12-1631 The ***** dialogs ***** are manually classified into two classes: completed and uncompleted music retrieval tasks. | ||
| 2020.lrec-1.13 We are releasing this dataset to encourage research in the field of coreference resolution, referring expression generation and identification within realistic, deep ***** dialogs ***** involving multiple domains. | ||
| 2021.emnlp-main.401 The ***** dialogs ***** are collected using a two-phase pipeline: (1) A novel multimodal dialog simulator generates simulated dialog flows, with an emphasis on diversity and richness of interactions, (2) Manual paraphrasing of generated utterances to draw from natural language distribution. | ||
| 2021.sigdial-1.3 After stitching, our ***** dialogs ***** are provably deeper, contain longer-term dependencies, and span multiple contexts, when compared with the source ***** dialogs *****—all free of cost without any additional annotations | ||
| Especially | 59 | |
| W16-4609 ***** Especially *****, the system for the HINDENhi-ja task with pivoting by English uses the reordering technique. | ||
| D19-5226 ***** Especially *****, we made the UCSY-corpus to be cleaned in WAT 2019. | ||
| 2020.coling-main.48 ***** Especially ***** when extended to complex vector space, they show the capability in handling various relation patterns including symmetry, antisymmetry, inversion and composition. | ||
| N18-2063 ***** Especially ***** where scholars are merely interested in exploring the bigger picture of a language family's phylogeny, algorithms for automatic cognate detection are a useful complement for current research on language phylogenies. | ||
| P19-1012 ***** Especially *****, they benefit from information coming from structural features, such as features drawn from neighboring tokens in the dependency tree | ||
| faithfulness | 59 | |
| 2020.coling-main.502 We construct a Chinese e-commerce product summarization dataset, and the experimental results on this dataset demonstrate that our models significantly improve the ***** faithfulness *****. | ||
| 2020.acl-main.173 Furthermore, we show that textual entailment measures better correlate with ***** faithfulness ***** than standard metrics, potentially leading the way to automatic evaluation metrics as well as training and decoding criteria. | ||
| 2021.emnlp-main.10 DecSum substantially outperforms text-only summarization methods and model-based explanation methods in decision ***** faithfulness ***** and representativeness. | ||
| 2021.eacl-main.243 Our proposed objective improves ***** faithfulness ***** without reducing the translation quality and has a useful regularization effect on the NMT model and can even improve translation quality in some cases. | ||
| W19-8645 We suggest four extensions to that framework: (1) we introduce a trainable neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model's ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the ***** faithfulness ***** of the resulting texts; (4) we incorporate a simple but effective referring expression generation module | ||
| canonical | 59 | |
| W16-3808 As regards the position of an argument in the dependency structure with respect to its predicate, there exist three types of valency filling: active (***** canonical *****), passive, and discontinuous. | ||
| L12-1537 Each derived functional expression is intended to be identified by referring to the most similar occurrence of its ***** canonical ***** expression. | ||
| W19-2504 Our results show that it is indeed possible to distinguish characters by their speech in the plays of ***** canonical ***** writers such as George Bernard Shaw, whereas characters are clustered more closely in the works of lesser-known playwrights. | ||
| 2021.eacl-tutorials.2 A first objective of this tutorial is to connect NLP researchers with state-of-the-art aggregation models for a diverse set of ***** canonical ***** language annotation tasks. | ||
| 2020.acl-main.612 Existing entity embeddings are learned from ***** canonical ***** Wikipedia articles and local contexts surrounding target entities | ||
| synonym | 59 | |
| C18-1208 There are several goals of this work and resource: (a) to provide gold standard data for automatic experiments in the future (such as automatic discovery of ***** synonym ***** classes, word sense disambiguation, assignment of classes to occurrences of verbs in text, coreferential linking of verb and event arguments in text, etc.), (b) to build a core (bilingual) lexicon linked to existing resources, for comparative studies and possibly for training automatic tools, and (c) to enrich the annotation of a parallel treebank, the Prague Czech English Dependency Treebank, which so far contained valency annotation but has not linked ***** synonym *****ous senses of verbs together. | ||
| 2021.acl-long.237 Moreover, when attacked by TextFooler with ***** synonym ***** replacement, SEQA demonstrates much less performance drops than baselines, thereby indicating stronger robustness. | ||
| 2019.gwc-1.16 Instead, a subpart of the links are realised through near ***** synonym ***** or hyponymy links to compensate for the fact that no precise translation can be found in the target resource. | ||
| 2021.sustainlp-1.9 Traditional ***** synonym ***** recommendations often include ill-suited suggestions for writer's specific contexts. | ||
| S17-2168 The best setting achieved an F1 score of 71.0% for ***** synonym ***** and 30.0% for hyponym relation on the test data | ||
| baseline | 59 | |
| L14-1369 A large vocabulary speech recognition ***** baseline ***** system was built using the QA corpus. | ||
| D18-1258 We characterize the dataset and explore its learning potential by training ***** baseline ***** models for question to logical form and question to answer mapping. | ||
| E17-1033 For dependency parsing, the improvement reaches 2 percent points over the full training ***** baseline ***** when we use two topics. | ||
| 2014.amta-workshop.4 We experimentally show that optimizing hyperparameters and number of iterations in online learning yields consistent improvement against ***** baseline ***** results. | ||
| W19-4325 We evaluate these algorithms on several benchmark datasets and observe that, while adversarial training is beneficial to most ***** baseline ***** algorithms, there are cases where it may lead to overfitting and performance degradation | ||
| translating | 59 | |
| W19-4617 Parallel datasets are mainly collected three different ways; i) ***** translating ***** Arabic texts into Turkish by professional translators, ii) exploiting the web for open-source Arabic-Turkish parallel texts, iii) using back-translation. | ||
| K19-1031 Experiments on ASPEC English-to-Japanese and WMT2014 English-to-German translation tasks demonstrate that relative position helps ***** translating ***** sentences longer than those in the training data. | ||
| R17-1049 Secondly, we consider the scenario where the domain is not known and predicted at the sentence level before ***** translating *****. | ||
| S18-1041 Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) ***** translating ***** training data from other languages and (ii) applying a semi-supervised learning method. | ||
| 2020.wmt-1.28 The second system is document level, ***** translating ***** multiple sentences, trained on multi-sentence sequences up to 3000 characters long | ||
| summary | 59 | |
| D19-1388 Next, an editing generator generates new ***** summary ***** based on the ***** summary ***** pattern or extracted facts. | ||
| D19-1389 Our iterative algorithm under the Information Bottleneck objective searches gradually shorter subsequences of the given sentence while maximizing the probability of the next sentence conditioned on the ***** summary *****. | ||
| C16-1024 We address this issue and present a general optimization framework where any function of input documents and a system ***** summary ***** can be plugged in. | ||
| S17-2059 In addition to well known similarity measures such as cosine similarity, we use other measures based on the ***** summary ***** statistics of word embedding representation for a given text. | ||
| D19-5557 Based on the ***** summary ***** of Pang and Gimpel (2018) and Mir et al | ||
| description | 59 | |
| P19-1641 Motivated by video dense captioning, we propose a model to generate procedure captions from narrated instructional videos which are a sequence of step-wise clips with ***** description *****. | ||
| L10-1034 We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some ***** description *****, and the lack of a support for some types of categories. | ||
| P17-1175 We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image ***** description ***** and translation. | ||
| L06-1053 However, detailed information ***** description ***** increases tag types. | ||
| 1963.earlymt-1.32 Corpora selected for ***** description ***** are chosen so as to have similar texts within the same scientific discipline for the several languages | ||
| shared tasks | 59 | |
| D19-6305 This approach was then adapted for the two ***** shared tasks ***** of SR'19. | ||
| D19-5601 Second, we describe the results of the two ***** shared tasks ***** 1) efficient neural machine translation (NMT) where participants were tasked with creating NMT systems that are both accurate and efficient, and 2) document generation and translation (DGT) where participants were tasked with developing systems that generate summaries from structured data, potentially with assistance from text in another language. | ||
| 2020.sdp-1.25 Our system participates in two ***** shared tasks *****, CL-SciSumm 2020 and LongSumm 2020. | ||
| 2020.loresmt-1.5 Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) organized ***** shared tasks ***** of low resource language pair translation using zero-shot NMT. | ||
| 2021.emnlp-demo.21 After a year of development, the library now includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects and ***** shared tasks *****. | ||
| semantic analysis | 59 | |
| L12-1579 Syntactic parses can provide valuable information for many NLP tasks, such as machine translation, ***** semantic analysis *****, etc. | ||
| 2020.lrec-1.362 The study of predicate frame is an important topic for ***** semantic analysis *****. | ||
| 2014.lilt-9.3 This is probably because it requires a combination of robust, deep ***** semantic analysis ***** and logical inference—and why develop something with this complexity if you perhaps can get away with something simpler? | ||
| L12-1184 With the CINTIL-International Corpus of Portuguese, an ongoing corpus annotated with fully fledged grammatical representation, sentences get not only a high level of lexical, morphological and syntactic annotation but also a ***** semantic analysis ***** that prepares the data to a manual specification step and thus opens the way for a number of tools and resources for which there is a great research focus at the present. | ||
| 2021.emnlp-main.314 Frame semantic parsing is a *****semantic analysis***** task based on FrameNet which has received great attention recently. | ||
| grammar induction | 59 | |
| 2021.cmcl-1.19 This paper asks whether a distinction between production-based and perception-based ***** grammar induction ***** influences either (i) the growth curve of grammars and lexicons or (ii) the similarity between representations learned from independent sub-sets of a corpus. | ||
| 2020.sltu-1.30 We describe a questionnaire which collects the necessary data to bootstrap the number ***** grammar induction ***** system and parameterize the verbalizer templates described in Ritchie et al. | ||
| 2020.emnlp-main.270 In this paper, we consider the syntactic properties of languages emerged in referential games, using unsupervised ***** grammar induction ***** (UGI) techniques originally designed to analyse natural language. | ||
| 2020.emnlp-main.354 In this work, we study visually grounded ***** grammar induction ***** and learn a constituency parser from both unlabeled text and its visual groundings. | ||
| Q18-1016 There has been recent interest in applying cognitively- or empirically-motivated bounds on recursion depth to limit the search space of *****grammar induction***** models (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016). | ||
| dialogue act classification | 59 | |
| L16-1017 We, first, assess the importance of various types of features and their combinations for effective cross-domain ***** dialogue act classification *****. | ||
| 2020.lrec-1.147 We establish a benchmark ***** dialogue act classification ***** model for the corpus, thereby providing a proof of concept for the proposed annotation schema. | ||
| 2020.lrec-1.74 In this paper, we describe our strategies and their evaluation for ***** dialogue act classification *****. | ||
| 2020.lrec-1.672 The proposed chatbot relies on a controller which performs ***** dialogue act classification ***** and feeds user input either to a sequence-to-sequence chatbot or to a QA system. | ||
| D17-1229 Our experiments on ***** dialogue act classification ***** demonstrate the effectiveness of this approach. | ||
| software | 59 | |
| L16-1711 This paper introduces an open source, interoperable generic ***** software ***** tool set catering for the entire workflow of creation, migration, annotation, query and analysis of multi-layer linguistic corpora. | ||
| 2021.acl-long.59 We release ***** software ***** tools to facilitate citation-aware SciIE development. | ||
| L10-1159 This paper describes the first pattern recognition based ***** software ***** components developed in the AVATecH project and their integration in the annotation tool ELAN. | ||
| 2000.amta-systems.2 In this paper we describe the KANTOO machine translation environment, a set of *****software***** services and tools for multilingual document production. | ||
| L10-1056 This paper describes a *****software***** toolkit for the interactive display and analysis of automatically extracted or manually derived annotation features of visual and audio data. | ||
| image classification | 59 | |
| N19-1328 While training *****image classification***** models with label noise have received much attention, training text classification models have not. | ||
| 2021.acl-long.493 The pre-trained StructuralLM achieves new state-of-the-art results in different types of downstream tasks, including form understanding (from 78.95 to 85.14), document visual question answering (from 72.59 to 83.94) and document *****image classification***** (from 94.43 to 96.08). | ||
| 2020.semeval-1.114 Our results show that *****image classification***** models have the potential to help classifying memes, with DenseNet outperforming ResNet. | ||
| 2021.eacl-main.6 Our method improves the performance of standard backbones such as BERT, Electra, and ResNet-50 on a wide variety of tasks, such as question answering on SQuAD and NewsQA, benchmark task SuperGLUE, conversation response selection on Ubuntu Dialog corpus v2.0, as well as *****image classification***** on MNIST and ImageNet without any changes to the underlying models. | ||
| N18-1197 Results on *****image classification*****, text editing, and reinforcement learning show that, in all settings, models with a linguistic parameterization outperform those without. | ||
| cross-lingual | 59 | |
| 2020.acl-main.581 To better tackle the named entity recognition (NER) problem on languages with little/no labeled data, *****cross-lingual***** NER must effectively leverage knowledge learned from source languages with rich labeled data. | ||
| 2020.repl4nlp-1.16 Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks, even without explicit *****cross-lingual***** signals. | ||
| 2021.naacl-main.31 While *****cross-lingual***** techniques are finding increasing success in a wide range of Natural Language Processing tasks, their application to Semantic Role Labeling (SRL) has been strongly limited by the fact that each language adopts its own linguistic formalism, from PropBank for English to AnCora for Spanish and PDT-Vallex for Czech, inter alia. | ||
| R17-1007 To facilitate *****cross-lingual***** studies, there is an increasing interest in identifying linguistic universals. | ||
| W16-3704 In recent years there has been a lot of interest in *****cross-lingual***** parsing for developing treebanks for languages with small or no annotated treebanks. | ||
| Fake | 58 | |
| 2020.wanlp-1.7 *****Fake***** news and deceptive machine-generated text are serious problems threatening modern societies, including in the Arab world. | ||
| 2019.icon-1.2 *****Fake***** news, rumor, incorrect information, and misinformation detection are nowadays crucial issues as these might have serious consequences for our social fabrics. | ||
| 2020.lrec-1.755 *****Fake***** news has altered society in negative ways in politics and culture. | ||
| 2021.wnut-1.21 *****Fake***** news causes significant damage to society. | ||
| W17-4214 *****Fake***** news has become a hotly debated topic in journalism. | ||
| summarisation | 58 | |
| R19-1116 Calculating the Semantic Textual Similarity (STS) is an important research area in natural language processing which plays a significant role in many applications such as question answering, document ***** summarisation *****, information retrieval and information extraction. | ||
| D18-1445 We propose a method to perform automatic document ***** summarisation ***** without using reference summaries. | ||
| D19-1318 Most text-to-text generation tasks, for example text ***** summarisation ***** and text simplification, require copying words from the input to the output. | ||
| L06-1023 Computer-aided ***** summarisation ***** is a technology developed at the University of Wolverhampton as a complement to automatic ***** summarisation *****, to produce high quality summaries with less effort | ||
| 2020.fnp-1.1 FNS ***** summarisation ***** shared task is the first to target financial annual reports. | ||
| resource | 58 | |
| 2020.sustainlp-1.19 In this work, we show that existing software-based energy estimations are not accurate because they do not take into account hardware differences and how ***** resource ***** utilization affects energy consumption. | ||
| L10-1470 Whereas English is the lingua franca of online information, Oromo, despite its relative wide distribution within Ethiopia and neighbouring countries like Kenya and Somalia, is one of the most ***** resource ***** scarce languages. | ||
| 2021.wat-1.25 Individually, Indian languages are ***** resource ***** poor which hampers translation quality but by leveraging multilingualism and abundant monolingual corpora, the translation quality can be substantially boosted. | ||
| L14-1436 Annotating multimodal data is ***** resource ***** consuming, thus the results are promising. | ||
| D18-1045 We find that in all but ***** resource ***** poor settings back-translations obtained via sampling or noised beam outputs are most effective | ||
| morphological reinflection | 58 | |
| 2021.emnlp-main.159 We further show that our bootstrapping methods substantially outperform hallucination-based methods commonly used for overcoming the annotation bottleneck in ***** morphological reinflection ***** tasks. | ||
| W18-5816 CALIMA-Star also supports ***** morphological reinflection *****. | ||
| 2020.alta-1.5 In this paper, we attempt to model Nen verbal morphology using state-of-the-art machine learning models for ***** morphological reinflection *****. | ||
| W17-4111 This is achieved by using unlabeled tokens or random strings as training data for an autoencoding task, adapting a network for ***** morphological reinflection *****, and performing multi-task training | ||
| E17-1049 We show that our new architecture outperforms single-source reinflection models and publish our dataset for multi-source ***** morphological reinflection ***** to facilitate future research. | ||
| Empirically | 57 | |
| 2021.acl-long.512 ***** Empirically *****, this leads to sets often exhibiting high overlap, e.g., strings may differ by only a single word. | ||
| D19-1063 ***** Empirically *****, our approach is able to ask for help more effectively than competitive baselines and, thus, attains higher task success rate on both previously seen and previously unseen environments. | ||
| D19-1431 ***** Empirically *****, our model achieves state-of-the-art results on few-shot link prediction KG benchmarks. | ||
| 2020.emnlp-main.179 ***** Empirically *****, extensive experiments on both public and real-world datasets demonstrate the effectiveness of the MGL method. | ||
| 2021.naacl-main.82 ***** Empirically *****, our agent outperforms the baseline model that does not use syntax information on the Room-to-Room dataset, especially in the unseen environment | ||
| Particularly | 57 | |
| I17-1024 ***** Particularly *****, we differentiate the original sense and extended senses of a word by introducing their global occurrence information and model their relatedness through the local textual context information. | ||
| 2020.findings-emnlp.105 ***** Particularly *****, our designed automated type representation learning mechanism is a pluggable module which can be easily incorporated with any KGE model. | ||
| 2021.emnlp-main.488 ***** Particularly *****, we evaluate how well the ranking imposed by our approach associates with the ranking imposed by the manual annotations. | ||
| 2020.coling-main.183 ***** Particularly *****, we propose to accommodate two types of weakly labeled data for MWS, i.e., SWS data and DictEx data by employing a simple yet competitive graph-based parser with local loss. | ||
| 2020.emnlp-main.112 ***** Particularly *****, this is the first work reporting the generation results on MIMIC-CXR to the best of our knowledge | ||
| instance | 57 | |
| 2021.repl4nlp-1.1 In this paper, we propose zero-shot ***** instance *****-weighting, a general model-agnostic zero-shot learning framework for improving CLTC by leveraging source ***** instance ***** weighting. | ||
| L12-1406 This paper describes the SYNC3 collaborative annotation tool, which implements an alternative architecture: it remains a desktop application, fully exploiting the advantages of desktop applications, but provides collaborative annotation through the use of a centralised server for storing both the documents and their metadata, and ***** instance ***** messaging protocols for communicating events among all annotators. | ||
| N19-1003 In this work, we tackle the above challenges by introducing a new data sampling technique based on spaced repetition that dynamically samples informative and diverse unlabeled ***** instance *****s with respect to individual learner and ***** instance ***** characteristics. | ||
| K17-3008 We introduce context embeddings, dense vectors derived from a language model that represent the left/right context of a word ***** instance *****, and demonstrate that context embeddings significantly improve the accuracy of our transition based parser. | ||
| N19-1290 This paper adopts posterior regularization (PR) to integrate some domain-specific rules in ***** instance ***** selection using REINFORCE | ||
| constrained | 57 | |
| L10-1415 Such an approach is, however, strongly ***** constrained ***** by the finite content of the reference corpus, providing limited language possibilities. | ||
| 2020.vardial-1.19 Our solutions are based on the BERT Transformer models, the ***** constrained ***** versions of our models reaching 1st place in two subtasks and 3rd place in one subtask, while our un***** constrained ***** models outperform all the ***** constrained ***** systems by a large margin. | ||
| 2021.wmt-1.54 We participated in all three evaluation tracks including Large Track and two Small Tracks where the former one is un***** constrained ***** and the latter two are fully ***** constrained *****. | ||
| W18-6410 We participate in the multilingual subtrack with a system trained under the ***** constrained ***** condition to translate from English to both Finnish and Estonian. | ||
| 2021.wmt-1.37 We participate in the Russian-to-Chinese task under the ***** constrained ***** condition | ||
| indexing | 57 | |
| R19-1053 Moreover, we experiment with different ***** indexing ***** and pre-training strategies. | ||
| 2021.hackashop-1.6 The study was motivated by the need to select the most appropriate technique to extract keywords for ***** indexing ***** news articles in a real-world large-scale news analysis engine. | ||
| L06-1508 The Gazetteer makes use of MAYA Design's Universal Database Architecture; a peer-to-peer system based upon bundles of attribute-value pairs with universally unique identity, and sophisticated ***** indexing ***** and data fusion tools. | ||
| L08-1256 In this paper we present results from using Random ***** indexing ***** for Latent Semantic Analysis to handle Singular Value Decomposition tractability issues | ||
| L10-1107 This paper compares several *****indexing***** methods for person names extracted from text, developed for an information retrieval system with requirements for fast approximate matching of noisy and multicultural Romanized names. | ||
| LFG | 57 | |
| 2019.lilt-17.4 I highlight features of existing ***** LFG ***** analyses and focus in particular on the modular architecture of ***** LFG *****, its attendant multidimensional lexicon and the analytic consequences which follow from this. | ||
| L16-1565 We present NorGramBank, a treebank for Norwegian with highly detailed ***** LFG ***** analyses. | ||
| 2006.bcs-1.10 This paper describes the construction of a dependency bank gold standard for Arabic, DCU 250 Arabic Dependency Bank (DCU 250), based on the Arabic Penn Treebank Corpus (ATB) (Bies and Maamouri, 2003; Maamouri and Bies, 2004) within the theoretical framework of Lexical Functional Grammar (***** LFG *****) | ||
| 2020.lrec-1.631 This paper reports on a parsing system for Wolof based on the *****LFG***** formalism. | ||
| 2021.cl-4.31 The universal generation problem for *****LFG***** grammars is the problem of determining whether a given grammar derives any terminal string with a given f-structure. | ||
| polysemous | 57 | |
| 2016.gwc-1.39 This strategy allows the extraction of information for ***** polysemous ***** English words if definitions and/or semantic relations are present in the dictionary. | ||
| 2003.mtsummit-papers.26 Trained on very large corpora, the model provides relevant, organized contexonyms that reflect the fine-grained connotations and contextual usage of the target word, as well as the distinct senses of homonyms and ***** polysemous ***** words. | ||
| S18-2031 Words are ***** polysemous ***** and multi-faceted, with many shades of meanings. | ||
| 2021.semeval-1.92 The purpose of the MCL-WiC task is to tackle the challenge of capturing the ***** polysemous ***** nature of words without relying on a fixed sense inventory in a multilingual and cross-lingual setting. | ||
| 2020.cogalex-1.16 Specifically, homonymous senses (e.g., bat as mammal vs. bat as sports equipment) are reliably more distant from one another in the embedding space than ***** polysemous ***** ones (e.g., chicken as animal vs. chicken as meat) | ||
| extracting | 57 | |
| 2020.emnlp-main.183 In this work, we propose the first end-to-end model with a novel position-aware tagging scheme that is capable of jointly ***** extracting ***** the triplets. | ||
| 2020.findings-emnlp.25 Specifically, we present a universal training framework named Pretrain-KGE consisting of three phases: semantic-based fine-tuning phase, knowledge ***** extracting ***** phase and KGE training phase. | ||
| L08-1232 At one level, ***** extracting ***** the relevant information from research papers is a text mining task, requiring both extensive language resources and specialised knowledge of the subject domain. | ||
| L10-1358 First, we introduce a supervised approach to ***** extracting ***** geographical relations on a fine-grained level | ||
| C16-1320 As an additional objective, we discuss two novel use cases including automatically ***** extracting ***** links to public datasets from the proceedings, which would further accelerate the advancement in digital libraries. | ||
| translator | 57 | |
| C18-1305 But more importantly, we show how to use reinforcement learning (RL) to further adapt the adapted ***** translator *****, where translated sentences with more proper slot tags receive higher rewards. | ||
| 2021.eacl-main.62 The learned ***** translator ***** can be used to generate representations for unseen users in the future. | ||
| 2020.lrec-1.641 The texts come from different sources: daily newspaper articles, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-standard language segments typed into a web ***** translator *****. | ||
| 2020.signlang-1.18 Though our current database is small, we hope for ***** translator *****s to invest themselves and help us to keep it expanding. | ||
| 2021.triton-1.18 This ongoing project will study the external integration of TM and MT, examining if the productivity and post-editing efforts of ***** translator *****s are higher or lower than using only TM. | ||
| cognitive | 57 | |
| 2020.emnlp-main.170 We find that beam search enforces uniform information density in text, a property motivated by ***** cognitive ***** science. | ||
| 2020.gamnlp-1.10 Computer-based experiments with game-like features have been developed previously for research on ***** cognitive ***** skills, ***** cognitive ***** processing speed, working memory, attention, learning, problem solving, group behavior and other phenomena. | ||
| N18-2110 Language changes serve as a sign that a patient's ***** cognitive ***** functions have been impacted, potentially leading to early diagnosis. | ||
| W17-5533 This hybrid DM architecture affords incremental processing of uncertain input, a flexible, mixed-initiative information grounding process that can be adapted to users' ***** cognitive ***** capacities and interactive idiosyncrasies, and generic mechanisms that foster transitions in the joint discourse state that are understandable and controllable by those users, in order to effect a robust interaction for users with varying capacities. | ||
| 2020.isa-1.8 The resource is structured on action concepts that are meant to be ***** cognitive ***** entities and to which a linguistic caption is attached | ||
| grammatical error diagnosis | 57 | |
| W17-5907 Detection and correction of Chinese grammatical errors have been two of major challenges for Chinese automatic ***** grammatical error diagnosis *****. This paper presents an N-gram model for automatic detection and correction of Chinese grammatical errors in NLPTEA 2017 task. | ||
| C16-1085 Misuse of Chinese prepositions is one of common word usage errors in ***** grammatical error diagnosis *****. | ||
| W18-3726 Chinese ***** grammatical error diagnosis ***** is an important natural language processing (NLP) task, which is also an important application using artificial intelligence technology in language education. | ||
| W18-3730 The main goal of Chinese ***** grammatical error diagnosis ***** task is to detect word errors in the sentences written by Chinese-learning students. | ||
| W16-4906 This paper presents the NLP-TEA 2016 shared task for Chinese ***** grammatical error diagnosis ***** which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as foreign language. | ||
| resolution | 57 | |
| 2020.coling-main.435 One critical issue of zero anaphora ***** resolution ***** (ZAR) is the scarcity of labeled data. | ||
| 2020.acl-main.560 Through this paper, we attempt to convince the ACL community to prioritise the ***** resolution ***** of the predicaments highlighted here, so that no language is left behind. | ||
| L14-1646 The ECB corpus is one of the data sets used for evaluation of the task of event coreference ***** resolution *****. | ||
| 2020.emnlp-main.686 This paper analyzes the impact of higher-order inference (HOI) on the task of coreference ***** resolution *****. | ||
| W19-3812 In this work, contribution of transfer learning technique to pronoun ***** resolution ***** systems is investigated and the gender bias contained in classification models is evaluated. | ||
| text processing | 57 | |
| L12-1309 The RIDIRE-CPI user-friendly interface is specifically intended for allowing collaborative work performance by users with low skills in web technology and ***** text processing *****. | ||
| L06-1376 Our approach will serve for the support of manual database curation and as a basis for ***** text processing ***** applications. | ||
| L14-1325 This paper presents TexAfon 2.0, an improved version of the ***** text processing ***** tool TexAFon, specially oriented to the generation of synthetic speech with expressive content. | ||
| L16-1427 In addition, we propose an incremental way of finding the optimal combination of simple ***** text processing ***** options and machine learning features for sentiment classification. | ||
| L10-1533 Lexical resources are basic components of many *****text processing***** systems devoted to information extraction, question answering or dialogue. | ||
| humor recognition | 57 | |
| D19-1669 In addition, we further show that this approach is also beneficial for small sample ***** humor recognition ***** tasks through a semi-supervised label propagation procedure, which achieves about 0.7 accuracy on the 16000 One-Liners (Mihalcea and Strapparava, 2005) and Pun of the Day (Yang et al., 2015) humour classification datasets using only 10% of known labels. | ||
| L10-1506 Research on automatic ***** humor recognition ***** has developed several features which discriminate funny text from ordinary text. | ||
| W18-6242 The system combines style-features from previous studies on ***** humor recognition ***** in short text with ambiguity-based features. | ||
| W16-4319 We investigated a possibility that a state-of-the-art *****humor recognition***** system can be used in detecting sentences inducing laughters in talks. | ||
| P19-1394 The task of *****humor recognition***** has attracted a lot of attention recently due to the urge to process large amounts of user-generated texts and rise of conversational agents. | ||
| automated | 57 | |
| 2020.sigdial-1.29 A total of 20 papers from the last two years are surveyed to analyze three types of evaluation protocols: ***** automated *****, static, and interactive. | ||
| 2020.sdp-1.20 We were able to obtain ~267,000 unique research papers through our fully-***** automated ***** framework using ~76,000 queries, resulting in almost 200,000 more papers than the number of queries. | ||
| 2020.findings-emnlp.347 Current ***** automated ***** methods to estimate turn and dialogue level user satisfaction employ hand-crafted features and rely on complex annotation schemes, which reduce the generalizability of the trained models. | ||
| W17-3105 Our results indicate that, overall, research participants were enthusiastic about the possibility of using social media (in conjunction with ***** automated ***** Natural Language Processing algorithms) for mood tracking under the supervision of a mental health practitioner. | ||
| 2021.sigdial-1.35 We propose an ***** automated ***** correction for this issue, which is present in 70% of the dialogs. | ||
| semantic parsing task | 57 | |
| P19-1010 Structured information about entities is critical for many ***** semantic parsing task *****s. | ||
| D19-3012 Experimental results show that the neural-based semantic parser system achieves competitive performance on ***** semantic parsing task *****, and grammar-based semantic parsers significantly improve the performance of a business search engine. | ||
| 2021.semeval-1.179 Question answering from semi-structured tables can be seen as a ***** semantic parsing task ***** and is significant and practical for pushing the boundary of natural language understanding. | ||
| D19-1392 We unify different broad-coverage ***** semantic parsing task *****s into a transduction parsing paradigm, and propose an attention-based neural transducer that incrementally builds meaning representation via a sequence of semantic relations. | ||
| 2020.emnlp-main.651 By formulating DST as a ***** semantic parsing task ***** over hierarchical representations, we can incorporate semantic compositionality, cross-domain knowledge sharing and co-reference. | ||
| neural machine translation (NMT) | 57 | |
| 2020.wmt-1.65 Despite advances in *****neural machine translation (NMT)***** quality, rare words continue to be problematic. | ||
| C18-1274 This study proposes a new *****neural machine translation (NMT)***** model based on the encoder-decoder model that incorporates named entity (NE) tags of source-language sentences. | ||
| 2018.iwslt-1.15 This work describes AppTek's speech translation pipeline that includes strong state-of-the-art automatic speech recognition (ASR) and *****neural machine translation (NMT)***** components. | ||
| N18-2082 We propose a process for investigating the extent to which sentence representations arising from *****neural machine translation (NMT)***** systems encode distinct semantic phenomena. | ||
| 2020.vardial-1.10 In this work, we systematically investigate different set-ups for training of *****neural machine translation (NMT)***** systems for translation into Croatian and Serbian, two closely related South Slavic languages. | ||
| lexica | 56 | |
| L06-1434 Automatic alignments were compared for ***** lexica ***** containing different vowel variants, with both context-independent and context-dependent acoustic models sets. | ||
| L16-1353 These results show a high potential for this method to be used in bilingual ***** lexica ***** production for language pairs with reduced amount of parallel or comparable corpora, in particular for phrase table expansion in Statistical Machine Translation systems. | ||
| C16-1095 In the absence of large annotated corpora, parallel corpora, treebanks, bilingual ***** lexica *****, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. | ||
| K18-2021 We thus encourage the community to create, convert, or make available more such ***** lexica ***** in future tasks. | ||
| W16-4810 We present an approach for automatic verification and augmentation of multilingual ***** lexica ***** | ||
| matrices | 56 | |
| 2021.acl-long.198 Previous methods are typically node-centric and merely utilize different weight ***** matrices ***** to parameterize edge types, which 1) ignore the rich semantics embedded in the topological structure of edges, and 2) fail to distinguish local and non-local relations for each node. | ||
| E17-1021 We present an end-to-end graph-based neural network dependency parser that can be trained to reproduce ***** matrices ***** of edge scores, which can be directly projected across word alignments. | ||
| D19-1502 We construct three kinds of connectivity ***** matrices ***** to capture different kinds of semantic correlations between entities. | ||
| C16-1255 However, MRMF, that was not used for semantic classification with distributional features before, can easily be extended with more ***** matrices ***** containing more information from different sources on the same problem. | ||
| 2021.acl-short.48 Orthogonality constraints encourage ***** matrices ***** to be orthogonal for numerical stability | ||
| GloVe | 56 | |
| Q16-1028 This provides a theoretical justification for nonlinear models like PMI, word2vec, and ***** GloVe *****, as well as some hyperparameter choices. | ||
| D19-1534 For example, ***** GloVe ***** and word2vec accurately encode magnitude for numbers up to 1,000. | ||
| 2020.lrec-1.437 Nowadays, classical count-based word embeddings using positive pointwise mutual information (PPMI) weighted co-occurrence matrices have been widely superseded by machine-learning-based methods like word2vec and ***** GloVe *****. | ||
| 2019.icon-1.26 The proposed approach is applied over a biomedical text corpus to learn word representation and compared with ***** GloVe *****, which is one of the most popular word embedding approaches. | ||
| P18-1003 Word embedding models such as ***** GloVe ***** rely on co-occurrence statistics to learn vector representations of word meaning | ||
| segmented | 56 | |
| 2020.lrec-1.484 It was automatically sentence ***** segmented ***** and aligned, as well as manually post-corrected, and contains 71,778 translation units. | ||
| Q15-1026 Space-delimited words in Turkish and Hebrew text can be further ***** segmented ***** into meaningful units, but syntactic and semantic context is necessary to predict segmentation. | ||
| L08-1261 For Chinese, a significant difficulty lies in the fact that the text comes as a string of characters, only ***** segmented ***** by sentence boundaries. | ||
| 2020.coling-main.406 Words are properly ***** segmented ***** in the Persian writing system; in practice, however, these writing rules are often neglected, resulting in single words being written disjointedly and multiple words written without any white spaces between them | ||
| L12-1363 Nowadays, the use of statistical models implies the use of huge sized corpora that need to be recorded, transcribed, annotated and *****segmented***** to be usable. | ||
| factuality | 56 | |
| P19-1432 Event ***** factuality ***** prediction (EFP) is the task of assessing the degree to which an event mentioned in a sentence has happened. | ||
| 2021.unimplicit-1.3 While analyzing reference (data, text) samples, we encountered a range of systematic uncertainties that are related to cases on implicit phenomena in text, and the nature of non-linguistic knowledge we expect to be involved when assessing ***** factuality *****. | ||
| N19-4014 FAKTA predicts the ***** factuality ***** of given claims and provides evidence at the document and sentence level to explain its predictions. | ||
| D17-1317 To probe the feasibility of automatic political fact-checking, we also present a case study based on PolitiFact.com using their ***** factuality ***** judgments on a 6-point scale. | ||
| 2021.naacl-main.383 Through these annotations we identify the proportion of different categories of factual errors and benchmark ***** factuality ***** metrics, showing their correlation with human judgement as well as their specific strengths and weaknesses | ||
| differentiable | 56 | |
| 2021.emnlp-main.288 This is achieved by our fully ***** differentiable ***** and end-to-end paradigm that contains three complementary modules: taking the chest X-ray images and clinical history document of patients as inputs, our classification module produces an internal checklist of disease-related topics, referred to as enriched disease embedding; the embedding representation is then passed to our transformer-based generator, to produce the medical report; meanwhile, our generator also creates a weighted embedding representation, which is fed to our interpreter to ensure consistency with respect to disease-related topics. | ||
| 2021.emnlp-main.416 We then reinforce the QA pair generation process with a ***** differentiable ***** reward function to mitigate exposure bias, a common problem in natural language generation. | ||
| P19-1551 Unlike previous approaches to latent tree learning, we stochastically sample global structures and our parser is fully ***** differentiable *****. | ||
| P18-2059 We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, shown to be ***** differentiable ***** and sparse. | ||
| 2021.acl-long.559 Then we integrate the induced dependency relations into the transformer, in a ***** differentiable ***** manner, through a novel dependency-constrained self-attention mechanism | ||
| causality | 56 | |
| 2020.fnp-1.8 Financial ***** causality ***** detection is centered on identifying connections between different assets from financial news in order to improve trading strategies. | ||
| D18-1372 We achieve strong accuracies for both tasks but find different approaches best: an SVM for ***** causality ***** prediction (F1 = 0.791) and a hierarchy of Bidirectional LSTMs for causal explanation identification (F1 = 0.853). | ||
| 2020.fnp-1.10 We apply ensemble-based and sequence tagging methods for identifying ***** causality *****, and extracting causal subsequences. | ||
| 2021.acl-long.276 To solve the data lacking problem, we introduce a new approach to augment training data for event ***** causality ***** identification, by iteratively generating new examples and classifying event ***** causality ***** in a dual learning framework. | ||
| D19-1296 Our method exploits Wikipedia article sections that describe ***** causality ***** and the redundancy stemming from the multilinguality of Wikipedia | ||
| keyword | 56 | |
| 2020.sustainlp-1.14 Researchers have proposed simple yet effective techniques for the retrieval problem based on using BERT as a relevance classifier to rerank initial candidates from ***** keyword ***** search. | ||
| 2021.dialdoc-1.4 We used ***** keyword ***** matching, multilingual sentence embedding to evaluate the answer. | ||
| 2020.lrec-1.794 Linguistic Data Consortium developed the SAFE-T Corpus to support the NIST (National Institute of Standards and Technology) OpenSAT (Speech Analytic Technologies) evaluation series, whose goal is to advance speech analytic technologies including automatic speech recognition, speech activity detection and ***** keyword ***** search in multiple domains including simulated public safety communications data. | ||
| 2021.hackashop-1.8 Through these interactions and the use of ***** keyword ***** highlighting, the content related to each topic and its change over time can be explored. | ||
| 2020.deelio-1.4 Our experiments demonstrate that insertion of character-level synthetic noise and ***** keyword ***** replacement with hypernyms are effective augmentation methods, and that the quality of generations improves to a peak at approximately three times the amount of original data | ||
| agglutinative | 56 | |
| L16-1574 Beyond the specific task at hand the approach will also be useful for the analysis of other types of spaceless text such as Twitter hashtags and texts in ***** agglutinative ***** or spaceless languages like Finnish or Chinese. | ||
| K18-2024 We propose two word representation models for ***** agglutinative ***** languages that better capture the similarities between words which have similar tasks in sentences. | ||
| 2020.lrec-1.307 With this corpus, we show that Inuktitut displays a much higher degree of polysynthesis than other ***** agglutinative ***** languages usually considered in ASR, such as Finnish or Turkish. | ||
| W17-4105 However, popular models that learn such embeddings are unaware of the morphology of words, so it is not directly applicable to highly ***** agglutinative ***** languages such as Korean. | ||
| L12-1191 Turkish is a highly ***** agglutinative ***** language with a very productive and rich morphology whereas English has a very poor morphology when compared to this language | ||
| unannotated | 56 | |
| D17-1236 We investigate an end-to-end method for automatically inducing task-based dialogue systems from small amounts of ***** unannotated ***** dialogue data. | ||
| 2020.emnlp-main.289 The task of emotion-cause pair extraction deals with finding all emotions and the corresponding causes in ***** unannotated ***** emotion texts. | ||
| C16-2028 TextPro-AL is a web-based application integrating four components: a machine learning based NLP pipeline, an annotation editor for task definition and text annotations, an incremental re-training procedure based on active learning selection from a large pool of ***** unannotated ***** data, and a graphical visualization of the learning status of the system. | ||
| K19-1035 In this work, we propose to leverage ***** unannotated ***** sentences from auxiliary languages to help learning language-agnostic representations. | ||
| P19-1400 In this work, we show how we can learn to link mentions without having any labeled examples, only a knowledge base and a collection of ***** unannotated ***** texts from the corresponding domain | ||
| descriptive | 56 | |
| S17-2017 IDF weighting and Part-of-Speech tagging are applied on the examined sentences to support the identification of words that are highly ***** descriptive ***** in each sentence. | ||
| D19-5615 Neural models have recently shown significant progress on data-to-text generation tasks in which ***** descriptive ***** texts are generated conditioned on database records. | ||
| 2021.acl-short.36 In this paper we propose a novel approach to encourage captioning models to produce more detailed captions using natural language inference, based on the motivation that, among different captions of an image, ***** descriptive ***** captions are more likely to entail less ***** descriptive ***** captions. | ||
| 2021.trustnlp-1.2 We show that the extent of revealed biases in word embeddings depends on the ***** descriptive ***** statistics and similarity measures used to measure the bias. | ||
| 2020.ai4hi-1.4 To this end, we use machine learning to generate text descriptions for the extracted images on the one hand, and to detect ***** descriptive ***** phrases and titles of images from the text on the other hand | ||
| briefly | 56 | |
| 2021.lantern-1.5 We ***** briefly ***** survey existing related tasks in L&V and propose multi-modal information extraction as a general direction for future research. | ||
| 2020.emnlp-main.312 Many English-as-a-second language learners have trouble using near-synonym words (e.g., small vs. little; ***** briefly ***** vs. shortly) correctly, and often look for example sentences to learn how two nearly synonymous terms differ. | ||
| 1993.iwpt-1.21 It is ***** briefly ***** sketched how the parser can be enhanced with feature structures. | ||
| E17-3027 This paper gives an overview of ICE and its performance, and ***** briefly ***** describes the research underlying the extraction algorithms. | ||
| 2020.wac-1.1 For every issue we try to assess its severity and ***** briefly ***** discuss possible mitigation options. | ||
| evaluation metrics | 56 | |
| 2021.naacl-main.90 While traditional corpus-level ***** evaluation metrics ***** for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy. | ||
| 2020.acl-main.450 QAGS has substantially higher correlations with these judgments than other automatic ***** evaluation metrics *****. | ||
| L10-1211 The evaluation itself is processed in a UIMA component, users can create and plug their own ***** evaluation metrics ***** in addition to the predefined metrics. | ||
| L16-1489 Such an evaluation process affords various insights into the image description datasets and ***** evaluation metrics *****, such as the variations of image descriptions within and across datasets and also what the metrics capture. | ||
| 2021.eval4nlp-1.12 Reference-based automatic ***** evaluation metrics ***** are notoriously limited for NLG due to their inability to fully capture the range of possible outputs. | ||
| recent studies | 56 | |
| N18-1082 In addition, ***** recent studies ***** aiming at solving prepositional attachment and preposition selection problems depend heavily on external linguistic resources and use dataset-specific word representations. | ||
| W18-3501 This work also includes a brief outline of similar corpora and ***** recent studies ***** in the field of IM. | ||
| 2020.aacl-main.50 However, ***** recent studies ***** propose to directly use scene graphs to introduce information about object relations into captioning, hoping to better describe interactions between objects. | ||
| D18-1177 Several ***** recent studies ***** have shown the benefits of combining language and perception to infer word embeddings. | ||
| 2020.coling-main.465 As in other natural language understanding tasks, a common practice for this task is to train and evaluate a model on a single dataset, and ***** recent studies ***** suggest that SimpleQuestions, the most popular and largest dataset, is nearly solved under this setting. | ||
| pronoun resolution | 56 | |
| W19-3812 In this work, contribution of transfer learning technique to ***** pronoun resolution ***** systems is investigated and the gender bias contained in classification models is evaluated. | ||
| W19-3813 For solving the gender bias in gendered ***** pronoun resolution ***** task, I propose a novel neural network model based on the pre-trained BERT. | ||
| 2020.lrec-1.11 As far as we know, ours is the first neural model of zero-***** pronoun resolution ***** for Arabic; and our model also outperforms the state-of-the-art for Chinese. | ||
| 2021.eacl-main.190 However, we find that existing gender bias benchmarks do not fully probe professional bias as ***** pronoun resolution ***** may be obfuscated by cross-correlations from other manifestations of gender prejudice. | ||
| D19-1439 We use a language-model-based approach for ***** pronoun resolution ***** in combination with our WikiCREM dataset. | ||
| simultaneous translation | 56 | |
| 2021.autosimtrans-1.2 In this paper we introduce our Chinese-English ***** simultaneous translation ***** system participating in AutoSimulTrans2021. | ||
| 2021.autosimtrans-1.5 This corpus is expected to promote the research of automatic ***** simultaneous translation ***** as well as the development of practical systems. | ||
| 2020.aacl-main.23 Experiments on three translation corpora and two language pairs show the efficacy of the proposed framework on balancing the quality and latency in adapting NMT to perform ***** simultaneous translation *****. | ||
| D18-1337 Our agent with prediction has better translation quality and less delay compared to an agent-based ***** simultaneous translation ***** system without prediction. | ||
| 2021.autosimtrans-1.4 We propose a two-stage ***** simultaneous translation ***** pipeline system which is composed of Quartznet and BPE-based transformer. | ||
| writing | 56 | |
| 2021.dravidianlangtech-1.7 We show that transliteration is essential in unsupervised translation between Dravidian languages, as they do not share a common ***** writing ***** system. | ||
| W17-4905 The results seem to suggest that ***** writing ***** style is predictive of scientific fraud. | ||
| 2020.lrec-1.42 In this paper, we report on datasets that we created for research in feedback comment generation, a task of automatically generating feedback comments such as a hint or an explanatory note for *****writing***** learning. | ||
| D19-1299 Recent neural models for data-to-text generation rely on massive parallel pairs of data and text to learn the *****writing***** knowledge. | ||
| D17-1221 When people recall and digest what they have read for *****writing***** summaries, the important content is more likely to attract their attention. | ||
| surprisal | 55 | |
| 2021.iwcs-1.9 We also studied the correlations of ***** surprisal ***** scores computed with three state-of-the-art language models. | ||
| 2020.iwpt-1.6 This paper describes a neural incremental generative parser that is able to provide accurate ***** surprisal ***** estimates and can be constrained to use a bounded stack. | ||
| 2021.emnlp-main.74 The explanatory power of a subset of the proposed operationalizations suggests that the strongest trend may be a regression towards a mean ***** surprisal ***** across the language, rather than the phrase, sentence, or document—a finding that supports a typical interpretation of UID, namely that it is the byproduct of language users maximizing the use of a (hypothetical) communication channel. | ||
| 2020.conll-1.53 To do this, we use recurrent neural networks to calculate the ***** surprisal ***** of stimuli from previously published neurolinguistic studies of the N400. | ||
| 2021.cmcl-1.6 We advance a novel explanation of similarity-based interference effects in subject-verb and reflexive pronoun agreement processing, grounded in *****surprisal***** values computed from a pretrained large-scale Transformer model, GPT-2. | ||
| salience | 55 | |
| 2021.ranlp-1.40 To overcome this, we aim to perform event ***** salience ***** classification and explore whether a transformer model is capable of classifying new information into less and more general prominence classes. | ||
| S18-2004 Specifically, we learn word ***** salience ***** scores such that, using pre-trained word embeddings as the input, can accurately predict the words that appear in a sentence, given the words that appear in the sentences preceding or succeeding that sentence. | ||
| D18-1129 We also present SalIE, the first fact ***** salience ***** system. | ||
| 2020.lrec-1.257 Furthermore, we conduct experiments on entity ***** salience ***** detection; the results demonstrate that WN-Salience is a challenging testbed that is complementary to existing ones. | ||
| 2021.acl-srw.7 Our method also beats several competitive ***** salience ***** detection baselines | ||
| selectional | 55 | |
| 2020.coling-main.109 We investigate whether Bert contains information on the ***** selectional ***** preferences of words, by examining the probability it assigns to the dependent word given the presence of a head word in a sentence. | ||
| 2003.mtsummit-papers.3 Information on subcategorization and ***** selectional ***** restrictions is important for natural language processing tasks such as deep parsing, rule-based machine translation and automatic summarization. | ||
| D19-1528 Experiments on ***** selectional ***** preference acquisition and word similarity demonstrate the effectiveness of the proposed model, and a further study of scalability also proves that our embeddings only need 1/20 of the original embedding size to achieve better performance. | ||
| 2020.blackboxnlp-1.25 We address this issue by deploying a novel word-learning paradigm to test BERT's few-shot learning capabilities for two aspects of English verbs: alternations and classes of ***** selectional ***** preferences | ||
| L14-1359 Methods for automatic detection and interpretation of metaphors have focused on analysis and utilization of the ways in which metaphors violate *****selectional***** preferences (Martin, 2006). | ||
| representational | 55 | |
| 2021.blackboxnlp-1.15 The results presented here suggest that the process of fine-tuning causes a reorganisation of the model's limited ***** representational ***** capacity, enhancing language-independent representations at the expense of language-specific ones. | ||
| 2013.iwslt-evaluation.2 Our results also show that HMEANT is a robust and reliable semantic MT evaluation metric for running large-scale evaluation campaigns as it is inexpensive and simple while maintaining the semantic ***** representational ***** transparency to provide a perspective which is different from BLEU and TER in order to understand the performance of the state-of-the-art MT systems. | ||
| 2020.conll-1.39 For models to generalize abstract patterns in expected ways to unseen data, they must share ***** representational ***** features in predictable ways. | ||
| 2021.emnlp-demo.24 Recently, objects with more geometric structure (eg. distributions, complex or hyperbolic vectors, or regions such as cones, disks, or boxes) have been explored for their alternative inductive biases and additional ***** representational ***** capacity. | ||
| 2011.mtsummit-tutorials.1 We show that by keeping the metrics deeply grounded within the theoretical framework of semantic frames, the new HMEANT and MEANT metrics can significantly outperform even the state-of-the-art expensive HTER and TER metrics, while at the same time maintaining the desirable characteristics of simplicity, inexpensiveness, and *****representational***** transparency. | ||
| translationese | 55 | |
| D19-6501 The analysis shows stronger potential ***** translationese ***** effects in machine translated outputs than in human translations. | ||
| 2020.iwslt-1.34 This study analyzes ***** translationese ***** patterns in translation, interpreting, and machine translation outputs in order to explore possible reasons. | ||
| 2020.acl-main.253 BLEU cannot capture human preferences because references are ***** translationese ***** when source sentences are natural text. | ||
| 2021.latechclfl-1.12 The paper reports the results of a ***** translationese ***** study of literary texts based on translated and non-translated Russian. | ||
| 2020.wmt-1.23 Our analysis reveals that the presence of ***** translationese ***** texts in the validation data led us to take decisions in building NMT systems that were not optimal to obtain the best results on the test data | ||
| approximate | 55 | |
| 1963.earlymt-1.32 We shall make repeated revisions of the grammar as we learn how to make it ***** approximate ***** better the language text fed into the computer. | ||
| D19-1023 Rather than relying on pre-aligned relation seeds to learn relation representations, we first ***** approximate ***** them using entity embeddings learned by the GCN. | ||
| 2020.findings-emnlp.434 We collect known words by segmenting oov words and by ***** approximate ***** string matching, and we then aggregate their pre-trained embeddings. | ||
| 2021.teachingnlp-1.18 Assignments are designed to be interactive, easily gradable, and to give students hands-on experience with several key types of structure (sequences, tags, parse trees, and logical forms), modern neural architectures (LSTMs and Transformers), inference algorithms (dynamic programs and ***** approximate ***** search) and training methods (full and weak supervision). | ||
| 2000.amta-papers.14 Although undeniably useful for the translation of certain types of repetitive document, current translation memory technology is limited by the rudimentary techniques employed for ***** approximate ***** matching | ||
| sentential | 55 | |
| 2021.rocling-1.35 The results show that local features that affect the overall ***** sentential ***** sentiment confuse the model: multiple target entities, transitional words, sarcasm, and rhetorical questions. | ||
| 2021.acl-long.327 We propose the Metaphor-relation BERT (Mr-BERT) model, which explicitly models the relation between a verb and its grammatical, ***** sentential ***** and semantic contexts. | ||
| D17-1188 We demonstrate that for sentence-level relation extraction it is beneficial to consider other relations in the ***** sentential ***** context while predicting the target relation. | ||
| P19-1332 Paraphrasing exists at different granularity levels, such as lexical level, phrasal level and ***** sentential ***** level. | ||
| 2021.semeval-1.90 Evaluating the complexity of a target word in a ***** sentential ***** context is the aim of the Lexical Complexity Prediction task at SemEval-2021 | ||
| coherent | 55 | |
| 2021.eacl-main.98 Analyses of the latent space show that interpolation in the latent space is able to generate ***** coherent ***** sentences with smooth transition and demonstrate improved classification over strong baselines with latent features from unsupervised pretraining. | ||
| 2020.coling-main.212 We propose a model that features two different generation components: an outliner, which proceeds the main story line to realize global coherence; a detailer, which supplies relevant details to the story in a locally ***** coherent ***** manner. | ||
| P19-1244 In this paper, we propose a novel end-to-end hierarchical attention network focusing on learning to represent ***** coherent ***** evidence as well as their semantic relatedness with the claim. | ||
| 2020.findings-emnlp.219 There has been considerable progress made towards conversational models that generate ***** coherent ***** and fluent responses; however, this often involves training large language models on large dialogue datasets, such as Reddit. | ||
| P19-1479 By organizing the article into graph structure, our model can better understand the internal structure of the article and the connection between topics, which makes it better able to generate ***** coherent ***** and informative comments | ||
| Hate | 55 | |
| W19-3506 *****Hate***** speech and abusive language spreading on social media need to be detected automatically to avoid conflict between citizens. | ||
| 2021.ltedi-1.24 The proliferation of *****Hate***** Speech and misinformation in social media is fast becoming a menace to society. | ||
| 2021.emnlp-main.29 *****Hate***** speech has grown significantly on social media, causing serious consequences for victims of all demographics. | ||
| 2020.restup-1.3 *****Hate***** speech may take different forms in online social environments. | ||
| S19-2069 *****Hate***** speech occurs more often than ever and polarizes society. | ||
| annotation schema | 55 | |
| L10-1542 We consider an enhanced TEI XML markup language, which is used as an intermediate stage in translating from the initial XML obtained from Optical Character Recognition to taXMLit, the target ***** annotation schema *****. | ||
| E17-1026 Three independent annotators manually labelled MVSC, following a broad ***** annotation schema ***** about different aspects that can be grasped from natural language text coming from social networks. | ||
| E17-1025 Informed by linguistic theories, we propose for the first time a multi-layered ***** annotation schema ***** for irony and its application to a corpus of French, English and Italian tweets. | ||
| L10-1483 We propose a hierarchical dependency structure ***** annotation schema ***** that is more detailed and more flexible than the known ***** annotation schema *****ta. | ||
| W16-4210 The ***** annotation schema ***** is driven by the needs of our clinical partners and the linguistic aspects of German language | ||
| scientific papers | 55 | |
| 2021.acl-long.115 In summary, our contributions are (1) a new dataset for numerical table-to-text generation using pairs of a table and a paragraph of a table description with richer inference from ***** scientific papers *****, and (2) a table-to-text generation framework enriched with numerical reasoning. | ||
| 2020.nlpcovid19-acl.1 The COVID-19 Open Research Dataset (CORD-19) is a growing resource of ***** scientific papers ***** on COVID-19 and related historical coronavirus research. | ||
| E17-3011 This problem can be seen as a semi-supervised clustering of ***** scientific papers ***** based on their features. | ||
| S18-1126 Our hypothesis is that this deep learning model can be applied to extract and classify relations between entities for ***** scientific papers ***** at the same time. | ||
| R19-1098 We propose a simple unsupervised method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different text styles, such as news articles and ***** scientific papers *****. | ||
| present | 55 | |
| 2010.amta-papers.6 In this paper, we ***** present ***** the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a standard phrase-based SMT system. | ||
| W17-1413 In the paper we ***** present ***** an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity recognition. | ||
| D19-1212 Multi-view learning algorithms are powerful re***** present *****ation learning tools, often exploited in the context of multimodal problems. | ||
| D19-6008 This paper explores the use of Bidirectional Encoder Re***** present *****ations from Transformers(BERT) along with external relational knowledge from ConceptNet to tackle the problem of commonsense inference. | ||
| D18-1241 We ***** present ***** QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). | ||
| raw text | 55 | |
| 2020.aacl-main.29 In this paper, we propose a method which trains the generation model in a completely unsupervised way with unaligned ***** raw text ***** data and KB triples. | ||
| D19-6123 Recently, neural network models which automatically infer syntactic structure from ***** raw text ***** have started to achieve promising results. | ||
| K18-2016 We introduce a complete neural pipeline system that takes ***** raw text ***** as input, and performs all tasks required by the shared task, ranging from tokenization and sentence segmentation, to POS tagging and dependency parsing. | ||
| L06-1266 Korean, and point out a second order machine learning algorithm to unveil term similarity from a given ***** raw text ***** corpus. | ||
| 2021.sigmorphon-1.8 We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a ***** raw text ***** corpus into paradigms. | ||
| commonsense reasoning | 55 | |
| D19-6002 HNN obtains new state-of-the-art results on three classic ***** commonsense reasoning ***** tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. | ||
| 2021.emnlp-main.705 Experimental results show that generalizing ***** commonsense reasoning ***** on unseen assertions is inherently a hard task. | ||
| 2020.coling-main.222 CommonsenseQA is a task in which a correct answer is predicted through ***** commonsense reasoning ***** with pre-defined knowledge. | ||
| 2021.emnlp-main.445 In this paper, we investigate what models learn from ***** commonsense reasoning ***** datasets. | ||
| P19-1487 Deep learning models perform poorly on tasks that require ***** commonsense reasoning *****, which often necessitates some form of world-knowledge or reasoning over information not immediately present in the input. | ||
| humor | 55 | |
| D19-1211 Although ***** humor ***** detection is an established research area in NLP, in a multimodal context it has been understudied. | ||
| 2020.lrec-1.753 Furthermore, current ***** humor ***** datasets are lacking in both joke variety and size, with almost all current datasets having less than 100k jokes. | ||
| 2020.acl-demos.28 Building datasets of creative text, such as ***** humor *****, is quite challenging. | ||
| 2020.semeval-1.111 This paper presents our work on the Memotion Analysis shared task of SemEval 2020, which involves the sentiment and ***** humor ***** analysis of memes. | ||
| 2021.emnlp-main.364 We present the largest dataset to date with labeled ***** humor ***** on 785K posts related to COVID-19. | ||
| conversation | 55 | |
| D19-5809 To properly generate a question coherent to the grounding text and the current ***** conversation ***** history, the proposed framework first locates the focus of a question in the text passage, and then identifies the question pattern that leads the sequential generation of the words in a question. | ||
| 2020.acl-main.54 The goal-oriented dialogue system needs to be optimized for tracking the dialogue flow and carrying out an effective ***** conversation ***** under various situations to meet the user goal. | ||
| L10-1071 Working within the EU funded COMPANIONS program, we investigate the use of appropriateness as a measure of ***** conversation ***** quality, the hypothesis being that good companions need to be good ***** conversation *****al partners. | ||
| 2020.lrec-1.95 The key factor is to recognize these variants and carry out a successful ***** conversation *****, as misinterpretation can lead to total failure of the given interaction. | ||
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of pretrained language models for emotion recognition in ***** conversation *****s, which is to consider not only previous utterances, but also ***** conversation *****-related information such as speakers, speech acts and topics. | ||
| hypernymy detection | 55 | |
| 2020.ldl-1.11 Although the task has been widely addressed in English, there is not much work in Spanish, and according to our knowledge there is not any available dataset for supervised ***** hypernymy detection ***** in Spanish. | ||
| S18-2025 Existing methods of ***** hypernymy detection ***** mainly rely on statistics over a big corpus, either mining some co-occurring patterns like “animals such as cats” or embedding words of interest into context-aware vectors. | ||
| E17-1007 Being based on general linguistic hypotheses and independent from training data, unsupervised measures are more robust, and therefore are still useful artillery for ***** hypernymy detection *****. | ||
| N18-1103 LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, ***** hypernymy detection *****, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model. | ||
| 2020.emnlp-main.502 We address ***** hypernymy detection *****, i.e., whether an is-a relationship exists between words (x ,y), with the help of large textual corpora. | ||
| divergences | 54 | |
| 2021.acl-long.562 We show that models trained on synthetic ***** divergences ***** output degenerated text more frequently and are less confident in their predictions. | ||
| C18-1294 In this paper, we used two corpora to investigate the ***** divergences ***** in the behavior of pedagogically relevant grammatical structures in reception and production texts. | ||
| 2000.amta-papers.5 The approach relies on canonical predicate-argument structures (or dependency structures), which provide a suitable pivot representation for the handling of structural ***** divergences ***** and the recovery of dropped arguments. | ||
| J17-3002 To do this, we first devise a hierarchical alignment scheme where Chinese and English parse trees are aligned in a way that eliminates conflicts and redundancies between word alignments and syntactic parses to prevent the generation of spurious translation ***** divergences *****. | ||
| D19-1421 Our experiments on the Penn Treebank and Wikitext-2 show that these power ***** divergences ***** can indeed be used to prioritize learning on the frequent or rare words, and lead to general performance improvements in the case of sampling-based learning | ||
| subtitles | 54 | |
| 2020.findings-emnlp.381 Domain adaptation between distant domains (e.g., movie ***** subtitles ***** and research papers), however, cannot be performed effectively due to mismatches in vocabulary; it will encounter many domain-specific words (e.g., “angstrom”) and words whose meanings shift across domains (e.g., “conductor”). | ||
| W17-5546 Evaluation results on retrieval-based models trained on movie and TV ***** subtitles ***** demonstrate that the inclusion of such a weighting model improves the model performance on unsupervised metrics. | ||
| 2020.lrec-1.498 In the conclusion, the key findings are summarised regarding formal aspects of the ***** subtitles ***** conditioning the accessibility to the multimedia content of the EuroparlTV. | ||
| D17-1293 Therefore we investigate a significant application, which is to associate forum threads to ***** subtitles ***** of video clips. | ||
| 2016.iwslt-1.24 We present our submissions to the IWSLT 2016 machine translation task, as our first attempt to translate ***** subtitles ***** and one of our early experiments with neural machine translation (NMT) | ||
| vocabularies | 54 | |
| 2008.amta-govandcom.4 8.5 platforms, translates patient examination questions for all language pairs in the set English, French, Japanese, Arabic, Catalan, using ***** vocabularies ***** of about 400 to 1,100 words, and can be run in a distributed client/server environment, where the client application is hosted on a Nokia Internet Tablet device. | ||
| 2021.iwslt-1.31 The aim of reducing the size of the input and output ***** vocabularies ***** is to increase the generalization capabilities of the translation model, enabling the system to translate and generate infrequent and new (unseen) words at inference time by combining previously seen sub-word units. | ||
| 2020.lrec-1.241 Recently, the machine learning-based CONTES method has addressed these challenges for reference ***** vocabularies ***** that are ontologies, as is often the case in life sciences and biomedical domains. | ||
| 2021.wmt-1.50 Surprisingly, the smaller size of ***** vocabularies ***** perform better, and the extensive monolingual English data offers a modest improvement. | ||
| P19-1154 However, the conversion is seriously compromised by the ambiguities of Chinese characters corresponding to pinyin as well as the predefined fixed ***** vocabularies ***** | ||
| compute | 54 | |
| L08-1296 Rules for identity resolution, which ***** compute ***** similarities between target and source entities based on class information and instance properties and values, can be defined for each class in the ontology. | ||
| L14-1120 Most methods ***** compute ***** word contexts from general corpora. | ||
| 2021.iwpt-1.12 As we are interested in efficiency, we evaluate core parsers without pretrained language models (as these are typically huge networks and would constitute most of the ***** compute ***** time) or other augmentations that can be transversally applied to any of them. | ||
| 2021.sustainlp-1.5 While these variants are memory and ***** compute ***** efficient, it is not possible to directly use them with popular pre-trained language models trained using vanilla attention, without an expensive corrective pre-training stage. | ||
| C16-1109 For any sentence pair comprising a complex sentence and its simple counterpart, we employ a many-to-one method of aligning each word in the complex sentence with the most similar word in the simple sentence and ***** compute ***** sentence similarity by averaging these word similarities | ||
| kernel | 54 | |
| E17-3022 To provide cohesive answers, we use a measure of rhetoric agreement between a question and an answer by tree ***** kernel ***** learning of their DTs. | ||
| W18-3909 We therefore conclude that our multiple ***** kernel ***** learning method is the best approach to date for Arabic dialect identification. | ||
| D17-1202 By representing each document as a graph-of-words, we are able to model these relationships and then determine how similar two documents are by using a modified shortest-path graph ***** kernel *****. | ||
| W16-4307 Furthermore, we modify the traditional tree ***** kernel ***** function to compute the similarity based on word embedding vectors instead of exact string match and present experiments using the new models. | ||
| 2019.iwslt-1.24 We observe a wide range of performances across different ***** kernel ***** settings | ||
| intelligibility | 54 | |
| 2021.americasnlp-1.3 Using a standard LSTM model and publicly available Bible translations, we explore how character language models can be applied to the tasks of estimating mutual ***** intelligibility *****, identifying genetic similarity, and distinguishing written variants. | ||
| 2001.mtsummit-eval.7 The experiment described here looks only at the ***** intelligibility ***** of MT output. | ||
| 2003.mtsummit-papers.7 In this paper, we present a fine-grained machine translation evaluation framework that, in addition to the notions of ***** intelligibility ***** and fidelity, includes a typology of errors common in automatic translation, as well as several other properties of source and translated texts. | ||
| 2004.amta-papers.25 The experiment shows that snap judgments on ***** intelligibility ***** are made successfully and that system rankings on snap judgments are consistent with more detailed ***** intelligibility ***** measures. | ||
| 2001.mtsummit-eval.8 In this experiment we evaluated a system's ability to translate named entities, and compared this measure with previous evaluation scores of fidelity and ***** intelligibility ***** | ||
| cascaded | 54 | |
| Q19-1020 Speech translation has traditionally been approached through ***** cascaded ***** models consisting of a speech recognizer trained on a corpus of transcribed speech, and a machine translation system trained on parallel texts. | ||
| 2021.eacl-main.216 However, ***** cascaded ***** models have the advantage of including automatic speech recognition output, useful for a variety of practical ST systems that often display transcripts to the user alongside the translations. | ||
| 2020.acl-main.661 This paper provides a brief survey of these developments, along with a discussion of the main challenges of traditional approaches which stem from committing to intermediate representations from the speech recognizer, and from training ***** cascaded ***** models separately towards different objectives. | ||
| 2021.emnlp-main.771 We enhance the ***** cascaded ***** method with different training approaches, including the teacher-student method, the multi-task method, and the back-translation method. | ||
| 2020.iwslt-1.30 We first investigate a ***** cascaded ***** system, where an unsupervised compression model is used to post-edit the transcribed speech | ||
| translated | 54 | |
| 2020.emnlp-main.365 We use the original (monolingual) model to generate sentence embeddings for the source language and then train a new system on ***** translated ***** sentences to mimic the original model. | ||
| 2020.emnlp-main.6 The term translationese has been used to describe features of ***** translated ***** text, and in this paper, we provide detailed analysis of potential adverse effects of translationese on machine translation evaluation. | ||
| 1999.mtsummit-1.84 One of the most important issues in the field of machine translation is evaluation of the ***** translated ***** sentences. | ||
| 2012.amta-monomt.4 This means that we ***** translated ***** the “English” sentences into English by SMT. | ||
| 1999.mtsummit-1.69 The results of the experimental evaluation show that the degree of understandability for sample 2000 sentences amounts to 2.67, indicating that the meaning of the ***** translated ***** English sentences is almost clear to users, but the sentences still include minor grammatical or stylistic errors up to max | ||
| corpus annotated | 54 | |
| 2020.coling-main.402 the first large-scale English multi-domain (community Q&A forums, debate forums, review forums) ***** corpus annotated ***** with theory-based AQ scores. | ||
| W18-5621 We are developing an EHR ***** corpus annotated ***** with time expressions, clinical entities and their relations, to be used for NLP development. | ||
| 2020.lrec-1.33 A ***** corpus annotated ***** with Chinese readers' veridicality judgments is released as the Chinese PragBank for further analysis. | ||
| L10-1419 The associated ***** corpus annotated ***** for source ― target domain mappings will be publicly available | ||
| W19-4004 Here we present a ***** corpus annotated ***** with these relations and the analysis of these results. | ||
| paragraph | 54 | |
| 2020.emnlp-main.714 We propose a graph reasoning network based on the semantic structure of the sentences to learn cross ***** paragraph ***** reasoning paths and find the supporting facts and the answer jointly. | ||
| 2021.rocling-1.7 MRC is an important natural language processing (NLP) task aiming to assess the ability of a machine to understand natural language expressions, which is typically operationalized by first asking questions based on a given text ***** paragraph ***** and then receiving machine-generated answers in accordance with the given context ***** paragraph ***** and questions. | ||
| 2020.nlp4convai-1.3 In this paper, we show that the information from selfattentions of BERT are useful for language modeling of questions conditioned on ***** paragraph ***** and answer phrases. | ||
| 2021.emnlp-main.412 We propose a novel framework PermGen whose objective is to maximize the expected log-likelihood of output ***** paragraph ***** distributions with respect to all possible sentence orders. | ||
| 2020.coling-main.204 Visual storytelling aims to generate a narrative ***** paragraph ***** from a sequence of images automatically | ||
| mentions | 54 | |
| 2020.emnlp-main.563 To solve these problems, this paper proposes a novel extraction-linking approach, where a unified extractor recognizes all types of slot ***** mentions ***** appearing in the question sentence before a linker maps the recognized columns to the table schema to generate executable SQL queries. | ||
| P19-2026 Named entity recognition (NER) and entity linking (EL) are two fundamentally related tasks, since in order to perform EL, first the ***** mentions ***** to entities have to be detected. | ||
| 2020.emnlp-main.722 This is fulfilled by using a linked knowledge graph to select informative entities and then masking their ***** mentions *****. | ||
| W17-2305 It allows intelligent systems to leverage rich knowledge available in those sources (such as concept properties and relations) to enhance the semantics of the ***** mentions ***** of these concepts in text. | ||
| I17-1020 Furthermore, we utilize critical non-textual clues such as time between two consecutive posts and people ***** mentions ***** within the posts | ||
| hyperpartisan | 54 | |
| P18-1022 We show how a style analysis can distinguish ***** hyperpartisan ***** news from the mainstream (F1 = 0.78), and satire from both (F1 = 0.81). | ||
| S19-2176 The task asked participants to predict whether a given article is ***** hyperpartisan *****, i.e., extreme-left or extreme-right. | ||
| 2021.ranlp-1.62 Experimental evidence shows that personality embeddings are effective in three classification tasks including authorship verification, stance, and ***** hyperpartisan ***** detection. | ||
| 2021.ranlp-1.140 Because of its harmful effects at reinforcing one's bias and the posterior behavior of people, ***** hyperpartisan ***** news detection has become an important task for computational linguists. | ||
| S19-2186 It obtained very contrasting results: poor on the main task, but much more effective at distinguishing documents published by ***** hyperpartisan ***** media outlets from unbiased ones, as it ranked first | ||
| multimodal corpus | 54 | |
| L12-1611 In this paper we describe our user study conducted in a lab at the University of Bremen in order to collect empirical speech and gesture data and later create and analyse a ***** multimodal corpus *****. | ||
| W16-4016 We present the HuComTech corpus, a ***** multimodal corpus ***** containing 50 hours of videotaped interviews containing a rich annotation of about 2 million items annotated on 33 levels. | ||
| 2020.lrec-1.806 We present a ***** multimodal corpus ***** for sentiment analysis based on the existing Switchboard-1 Telephone Speech Corpus released by the Linguistic Data Consortium. | ||
| L08-1200 We describe a new ***** multimodal corpus ***** currently under development | ||
| L16-1078 This paper addresses the need of these technologies by presenting and sharing a ***** multimodal corpus ***** of public speaking presentations. | ||
| Question | 54 | |
| 2020.ccl-1.95 *****Question***** classification is a crucial subtask in question answering systems. | ||
| W18-6536 *****Question***** Generation is the task of automatically creating questions from textual input. | ||
| D18-1452 We address jointly two important tasks for *****Question***** Answering in community forums: given a new question, (i) find related existing questions, and (ii) find relevant answers to this new question. | ||
| L08-1445 Despite the importance of lexical resources for a number of NLP applications (Machine Translation, Information Extraction, *****Question***** Answering, among others), there has been a traditional lack of generic tools for the creation, maintenance and management of computational lexica. | ||
| 2021.eacl-main.26 *****Question***** answering over knowledge bases (KBQA) usually involves three sub-tasks, namely topic entity detection, entity linking and relation detection. | ||
| dependency trees | 54 | |
| I17-1007 In this paper, we propose a probabilistic parsing model that defines a proper conditional probability distribution over non-projective ***** dependency trees ***** for a given sentence, using neural representations as inputs. | ||
| P18-2071 Different from widely-used RST-DT and PDTB, SciDTB uses ***** dependency trees ***** to represent discourse structure, which is flexible and simplified to some extent but do not sacrifice structural integrity. | ||
| D19-5901 We pay Turkers to construct unlabeled ***** dependency trees ***** for 500 English sentences using an interactive graphical dependency tree editor, collecting 10 annotations per sentence. | ||
| D19-1569 Although a large majority of works typically focus on leveraging the expressive power of neural networks in handling this task, we explore the possibility of integrating ***** dependency trees ***** with neural networks for representation learning. | ||
| W18-6546 We propose a Bayesian nonparametric approach to learn sentence planning rules by inducing synchronous tree substitution grammars for pairs of text plans and morphosyntactically-specified *****dependency trees*****. | ||
| bilingual lexicon induction | 54 | |
| L16-1524 Based on the assumption, we propose a constraint-based ***** bilingual lexicon induction ***** for closely related languages by extending constraints and translation pair candidates from recent pivot language approach. | ||
| P18-1075 Bilingual tasks, such as ***** bilingual lexicon induction ***** and cross-lingual classification, are crucial for overcoming data sparsity in the target language. | ||
| P17-1179 We carry out evaluation on the unsupervised ***** bilingual lexicon induction ***** task. | ||
| 2020.emnlp-main.100 This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of ***** bilingual lexicon induction ***** for rare words. | ||
| J17-2001 We systematically explore a wide range of features and phenomena that affect the quality of the translations discovered by ***** bilingual lexicon induction *****. | ||
| categorial grammar | 54 | |
| 2020.repl4nlp-1.23 In this paper, we make use of the primitives and operators that constitute the lexical categories of ***** categorial grammar *****s. | ||
| 2017.jeptalnrecital-recital.12 Finding Missing Categories in Incomplete Utterances This paper introduces an efficient algorithm (O(n^4)) for finding a missing category in an incomplete utterance by using a unification technique as when learning ***** categorial grammar *****s, and dynamic programming as in the Cocke–Younger–Kasami algorithm. | ||
| 1997.iwpt-1.9 This Coordinative Count Invariant is argued to be the strongest possible instrument to prune search space for parsing coordination in ***** categorial grammar *****. | ||
| 2010.jeptalnrecital-demonstration.11 These developments have been possible thanks to a ***** categorial grammar ***** which has been extracted semi-automatically from the Paris 7 treebank and a semantic lexicon which maps word, part-of-speech tags and formulas combinations to Discourse Representation Structures. | ||
| 1995.iwpt-1.20 We present a system for the investigation of computational properties of *****categorial grammar***** parsing based on a labelled analytic tableaux theorem prover. | ||
| term extraction | 54 | |
| 2020.lrec-1.596 First, a quantitative evaluation of the ***** term extraction ***** results and an additional qualitative evaluation by a domain expert. | ||
| 2020.coling-main.73 The improvements justify the effectiveness of the constituency lattice for aspect ***** term extraction *****. | ||
| L14-1364 Our results show that these automatically generated resources can assist ***** term extraction ***** process with similar performance to manually generated resources. | ||
| 2020.acl-main.340 Aspect-based sentiment analysis (ABSA) involves three subtasks, i.e., aspect ***** term extraction *****, opinion ***** term extraction *****, and aspect-level sentiment classification. | ||
| 2020.emnlp-main.164 Aspect ***** term extraction ***** (ATE) aims to extract aspect terms from a review sentence that users have expressed opinions on. | ||
| universal | 54 | |
| 2021.emnlp-main.676 We show that (i) neural architectures outperform other approaches by more than 20 accuracy points, with the BERT-based model performing the best in both the monolingual and multilingual settings; (ii) while many individual hand-crafted translationese features correlate with neural model predictions, feature importance analysis shows that the most important features for neural and classical architectures differ; and (iii) our multilingual experiments provide empirical evidence for translationese ***** universal *****s across languages. | ||
| D19-5404 To overcome these limitations, we present a novel method, which makes use of two types of sentence embeddings: ***** universal ***** embeddings, which are trained on a large unrelated corpus, and domain-specific embeddings, which are learned during training. | ||
| K19-1048 While the task is well-established, there is no ***** universal *****ly used tagset: often, datasets are annotated for use in downstream applications and accordingly only cover a small set of entity types relevant to a particular task. | ||
| K17-3020 In this work, we design a system based on UDPipe1 for ***** universal ***** dependency parsing, where multilingual transition-based models are trained for different treebanks. | ||
| D19-1252 We present Unicoder, a ***** universal ***** language encoder that is insensitive to different languages. | ||
| neural response generation | 54 | |
| 2021.naacl-industry.4 Furthermore, we verify that using external knowledge based on NEL benefits the ***** neural response generation ***** model. | ||
| 2020.acl-demos.30 The pre-trained model and training pipeline are publicly released to facilitate research into ***** neural response generation ***** and the development of more intelligent open-domain dialogue systems. | ||
| W18-5709 We address this new challenge by learning a ***** neural response generation ***** system from the recently released Multimodal Dialogue (MMD) dataset (Saha et al., 2017). | ||
| P19-1360 Semantically controlled ***** neural response generation ***** on limited-domain has achieved great performance. | ||
| D17-1065 The experimental results show that the proposed approach significantly outperforms existing ***** neural response generation ***** models in diversity metrics, with slight increases in relevance scores as well, when evaluated on both a Mandarin corpus and an English corpus. | ||
| sentence ordering | 54 | |
| 2021.eacl-main.308 To investigate how representative the synthetic tasks are of downstream use cases, we conduct experiments on benchmarking well-known traditional and neural coherence models on synthetic ***** sentence ordering ***** tasks, and contrast this with their performance on three downstream applications: coherence evaluation for MT and summarization, and next utterance prediction in retrieval-based dialog. | ||
| 2020.lrec-1.210 In this paper, to evaluate text coherence, we propose the paragraph ordering task as well as conducting ***** sentence ordering *****. | ||
| 2021.emnlp-main.841 We formulate the ***** sentence ordering ***** task as a conditional text-to-marker generation problem. | ||
| 2021.emnlp-main.683 Our graph network accumulates temporal evidence using knowledge of `past' and `future' and formulates ***** sentence ordering ***** as a constrained edge classification problem. | ||
| P19-1067 Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent ***** sentence ordering *****s to discriminate against during training is prohibitively large. | ||
| chinese grammatical error diagnosis | 54 | |
| W18-3725 In this paper, we propose a *****Chinese grammatical error diagnosis***** (CGED) model with contextualized character representation. | ||
| W18-3730 The main goal of the *****Chinese grammatical error diagnosis***** task is to detect word errors in the sentences written by Chinese-learning students. | ||
| 2020.nlptea-1.5 This paper introduces our system at NLPTEA-2020 Task: *****Chinese Grammatical Error Diagnosis***** (CGED). | ||
| 2020.nlptea-1.7 This paper describes our proposed model for the *****Chinese Grammatical Error Diagnosis***** (CGED) task in NLPTEA2020. | ||
| W16-4910 This paper discusses how to adapt two new word embedding features to build a more efficient *****Chinese Grammatical Error Diagnosis***** (CGED) systems to assist Chinese foreign learners (CFLs) in improving their written essays. | ||
| Evaluations | 53 | |
| L08-1154 ***** Evaluations ***** conducted on two different domains for Chinese term extraction show significant improvements over existing techniques which verifies its efficiency and domain independent nature. | ||
| N18-2100 ***** Evaluations ***** are performed on benchmark datasets producing state-of-the-art results. | ||
| 2020.inlg-1.18 ***** Evaluations ***** on the widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this framework compared to state-of-the-art models. | ||
| 2015.jeptalnrecital-demonstration.6 ***** Evaluations ***** on the Vietnamese–French pair of languages show a good accuracy (F-score of 94.90%) when identifying named entities pairs and building a named entity annotated parallel corpus. | ||
| 2020.sdp-1.39 ***** Evaluations ***** against gold standard summaries using ROUGE metrics prove the effectiveness of our approach. | ||
| correlation | 53 | |
| 2021.emnlp-main.474 Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson ***** correlation ***** with human labels. | ||
| P17-2004 MT evaluation metrics are tested for ***** correlation ***** with human judgments either at the sentence- or the corpus-level. | ||
| C16-1009 In this work, we show that relying on intrinsic evaluations with Pearson ***** correlation ***** can be misleading. | ||
| L08-1131 This paper describes an attempt to reduce the model size by filtering out the less probable entries based on testing ***** correlation ***** using additional training data in an intermediate third language. | ||
| 2021.semeval-1.115 We are particularly interested in the ***** correlation ***** between toxicity and the emotions expressed in online posts. | ||
| keywords | 53 | |
| L16-1066 The prediction is based on historical data of the ***** keywords *****, which in our case, are LREC conference proceedings. | ||
| 2021.ecnlp-1.5 Machine generated ***** keywords ***** can be recommended to advertisers for better campaign discoverability as well as used as features for sourcing and ranking models. | ||
| L14-1511 First, ***** keywords ***** are extracted using a hybrid approach mixing linguistic patterns with statistical information. | ||
| 2021.naacl-main.428 In our recommendation system, different people follow different hot ***** keywords ***** with interest. | ||
| 2019.gwc-1.42 In the paper, we study the case of building a *****keywords***** database related to the Polish Classification of Activities (PKD 2007). | ||
| scalable | 53 | |
| 2021.naacl-main.47 The resulting representation enables ***** scalable ***** neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. | ||
| L16-1615 The platform is flexible, ***** scalable *****, provides authentication for access restrictions, and was developed taking into consideration the time and effort of providing new services. | ||
| 2020.sigdial-1.4 Multi-domain and open-vocabulary settings complicate the task considerably and demand ***** scalable ***** solutions. | ||
| P19-1132 We show that our approach is not only ***** scalable ***** but can also perform state-of-the-art on the standard benchmark ACE 2005. | ||
| W16-3916 Text normalization techniques based on rules, lexicons or supervised training requiring large corpora are not ***** scalable ***** nor domain interchangeable, and this makes them unsuitable for normalizing user-generated content (UGC) | ||
| logistic | 53 | |
| 2021.econlp-1.4 We show that BERT outperforms dictionary-based predictions and Word2Vec-based predictions in terms of adjusted R-square in ***** logistic ***** regression, k-nearest neighbor (kNN-5), and linear kernel support vector machine (SVM). | ||
| C16-1184 Next, we propose the task to automatically identify segment boundaries in lyrics and train a ***** logistic ***** regression model for the task with the repeated pattern and textual features. | ||
| L06-1483 Of the combination strategies tested, ***** logistic ***** regression models produced the best results for both location and proper-name questions. | ||
| W19-4723 The results show the relative success of the ***** logistic ***** prediction approach and the limitations of the method, therefore further proposals are made to develop the methodology. | ||
| 2021.semeval-1.67 Our system uses ***** logistic ***** regression and a wide range of linguistic features (e.g. psycholinguistic features, n-grams, word frequency, POS tags) to predict the complexity of single words in this dataset | ||
| recommender | 53 | |
| 2021.emnlp-main.617 To tackle these challenges, we introduce a novel framework called NTRD for ***** recommender ***** dialogue system that can decouple the dialogue generation from the item recommendation. | ||
| W17-4204 In this paper we present a ***** recommender ***** system, What To Write and Why, capable of suggesting to a journalist, for a given event, the aspects still uncovered in news articles on which the readers focus their interest. | ||
| C16-1244 The anecdote ***** recommender ***** can recommend proper anecdotes in response to given topics. | ||
| L08-1279 We test the effectiveness of this approach for building a term ***** recommender ***** system designed to help online advertisers discover additional phrases to describe their product offering. | ||
| 2020.ecomnlp-1.4 Building a ***** recommender ***** that joins a human conversation (RJC), we propose information extraction, discourse and argumentation analyses, as well as dialogue management techniques to compute a recommendation for a product and service that is needed by the customer, as inferred from the conversation | ||
| Distant | 53 | |
| N19-1307 ***** Distant ***** supervision has been widely used in relation extraction tasks without hand-labeled datasets recently. | ||
| W19-2601 ***** Distant ***** supervision is widely applied approach to automatically generate large amounts of labelled data with low manual annotation cost. | ||
| 2020.emnlp-main.300 ***** Distant ***** supervision (DS) has been widely adopted to generate auto-labeled data for sentence-level relation extraction (RE) and achieved great results. | ||
| 2020.bionlp-1.20 ***** Distant ***** supervision offers a viable approach to combat this by quickly producing large amounts of labeled, but considerably noisy, data. | ||
| 2021.acl-long.484 *****Distant***** supervision for relation extraction provides uniform bag labels for each sentence inside the bag, while accurate sentence labels are important for downstream applications that need the exact relation type. | ||
| contextual embeddings | 53 | |
| 2020.acl-main.284 We also introduce a novel component of combining ***** contextual embeddings ***** from multiple language models pre-trained on different data sources, which achieves a marked improvement over using embeddings from a single pre-trained language model. | ||
| 2021.naacl-main.369 Several cluster-based methods for semantic change detection with ***** contextual embeddings ***** emerged recently. | ||
| S19-1008 But while ***** contextual embeddings ***** can also be trained at the character level, the effectiveness of such embeddings has not been studied. | ||
| P19-1590 We also find that fine-tuning to in-domain data is crucial to achieving decent performance from ***** contextual embeddings ***** when working with limited supervision. | ||
| 2020.findings-emnlp.150 Our results show that ***** contextual embeddings ***** are more language-neutral and, in general, more informative than aligned static word-type embeddings, which are explicitly trained for language neutrality. | ||
| relevance | 53 | |
| 2019.icon-1.11 Then, we proceed to narrow down four algorithms from each of these categories, implement and analytically compare them based on parameters like context ***** relevance *****, efficiency and precision. | ||
| D19-1429 To take the multi-level domain ***** relevance ***** discrepancy into account, in this paper, we propose a fine-grained knowledge fusion model with the domain ***** relevance ***** modeling scheme to control the balance between learning from the target domain data and learning from the source domain model. | ||
| P19-1002 Our empirical study on a real-world Document Grounded Dataset proves that responses generated by our model significantly outperform competitive baselines on both context coherence and knowledge ***** relevance *****. | ||
| 2021.fever-1.7 By considering the context ***** relevance ***** in the fact extraction and verification task, our system achieves 0.29 FEVEROUS score on the development set and 0.25 FEVEROUS score on the blind test set, both outperforming the FEVEROUS baseline. | ||
| D18-1520 Estimated target values are then propagated backward toward word vectors, and a ***** relevance ***** score is computed for each dimension of word vectors. | ||
| terms | 53 | |
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual com- position), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in ***** terms ***** of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. | ||
| 2021.wmt-1.89 Our submissions (Tencent AI Lab Machine Translation, TMT) in German/French/Spanish⇒English are ranked 1st respectively according to the official evaluation results in ***** terms ***** of BLEU scores. | ||
| 2021.ranlp-1.74 Evaluation of the models was performed in ***** terms ***** of ROUGE, and a manual evaluation of fluency and adequacy of the models was also performed. | ||
| Q18-1017 Using this approach, we achieve considerable improvements in ***** terms ***** of BLEU score on relatively large parallel corpus (WMT14 English to German) and a low-resource (WIT German to English) setup. | ||
| 2020.lrec-1.400 Besides the non-English idiom, in contrast to other subjectivity lexicons available, these lexicons represent different subjectivity dimensions (other than sentiment) and are more compact in number of *****terms*****. | ||
| spontaneous speech | 53 | |
| 2020.lrec-1.788 The data confirm the important role played by low pitch accents in Urdu ***** spontaneous speech *****, in line with previous studies on Urdu/Hindi scripted speech. | ||
| 2020.lrec-1.782 A listening test has shown that with a selection of genre-specific utterances, it is possible to show significant differences across genres between two synthetic voices built from ***** spontaneous speech *****. | ||
| L12-1174 The descriptions are instances of content-controlled monologue: semantically “pre-specified” but still bearing most hallmarks of ***** spontaneous speech ***** (hesitations and filled pauses, relaxed syntax, repetitions, self-corrections, incomplete constituents, irrelevant or redundant information, etc.) | ||
| L14-1352 We analyzed 300,000 recorded word tokens in read and ***** spontaneous speech ***** uttered by 162 female and male speakers within the German Alcohol Language Corpus. | ||
| D18-2016 Automatically collected samples contain reading and ***** spontaneous speech ***** recorded in various conditions including background noise and music, distant microphone recordings, and a variety of accents and reverberation. | ||
| structured prediction | 53 | |
| C16-1105 The latter are needed to restrict the search space for the ***** structured prediction ***** task defined by the unaligned datasets. | ||
| W19-5910 To support this argument, the research presented in this paper is structured into three stages: (i) analyzing variable dependencies in dialogue data; (ii) applying an energy-based methodology to model dialogue state tracking as a ***** structured prediction ***** task; and (iii) evaluating the impact of inter-slot relationships on model performance. | ||
| 2021.acl-long.206 Pretrained contextualized embeddings are powerful word representations for ***** structured prediction ***** tasks. | ||
| N18-2021 Such scoring functions are often easy to provide; the SPEN then furnishes an efficient ***** structured prediction ***** inference procedure. | ||
| 2012.amta-papers.14 In particular, large-margin ***** structured prediction ***** methods for discriminative training of feature weights, such as the structured perceptron or MIRA, have started to match or exceed the performance of existing methods such as MERT. | ||
| performance | 53 | |
| 2020.emnlp-main.564 We find when schema linking is done well, SLSQL demonstrates good ***** performance ***** on Spider despite its structural simplicity. | ||
| P19-1624 Experimental results on the WMT14 English-German and English-French benchmarks show that our model consistently improves ***** performance ***** over the strong Transformer model, demonstrating the necessity and effectiveness of exploiting sentential context for NMT. | ||
| 2021.naacl-main.269 We integrate our approach into a self-training framework for boosting ***** performance *****. | ||
| 2021.acl-long.96 We also carry out multiple experiments to measure how much each augmentation strategy improves the ***** performance ***** of automatic scoring systems. | ||
| 2020.sltu-1.7 Overall, we show that the proposed multilingual graphemic hybrid ASR with various data augmentation can not only recognize any within training set languages, but also provide large ASR ***** performance ***** improvements. | ||
| neural relation extraction | 53 | |
| 2020.findings-emnlp.20 These biases not only result in unfair evaluations but also mislead the optimization of ***** neural relation extraction *****. | ||
| C18-1099 To address these issues, we propose an adversarial multi-lingual ***** neural relation extraction ***** (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages. | ||
| D17-1186 To address this issue, we build inference chains between two target entities via intermediate entities, and propose a path-based ***** neural relation extraction ***** model to encode the relational semantics from both direct sentences and inference chains. | ||
| P17-1004 To address this issue, we introduce a multi-lingual ***** neural relation extraction ***** framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and complementarity among cross-lingual texts. | ||
| P19-1137 Pattern-based labeling methods have achieved promising results in alleviating the inevitable labeling noises of distantly supervised ***** neural relation extraction *****. | ||
| span representation | 53 | |
| D19-1585 We perform experiments comparing different techniques to construct *****span representations*****. | ||
| 2020.repl4nlp-1.20 In this paper, we conduct a comprehensive empirical evaluation of six *****span representation***** methods using eight pretrained language representation models across six tasks, including two tasks that we introduce. | ||
| 2021.acl-short.3 We introduce a lightweight end-to-end coreference model that removes the dependency on *****span representations*****, handcrafted features, and heuristics. | ||
| P19-1051 To address these problems, we propose a span-based extract-then-classify framework, where multiple opinion targets are directly extracted from the sentence under the supervision of target span boundaries, and corresponding polarities are then classified using their *****span representations*****. | ||
| N18-2108 Our approach uses the antecedent distribution from a span-ranking architecture as an attention mechanism to iteratively refine *****span representations*****. | ||
| task-oriented dialogue | 53 | |
| 2020.acl-main.566 Training a *****task-oriented dialogue***** agent with reinforcement learning is prohibitively expensive since it requires a large volume of interactions with users. | ||
| 2021.nlp4convai-1.25 Endowing a *****task-oriented dialogue***** system with adaptiveness to user personality can greatly help improve the performance of a dialogue task. | ||
| W19-5905 Dialogue state tracking is an important component in *****task-oriented dialogue***** systems to identify users' goals and requests as a dialogue proceeds. | ||
| 2021.dialdoc-1.16 Most prior work on *****task-oriented dialogue***** systems is restricted to limited coverage of domain APIs. | ||
| C18-1105 In this paper, we study the problem of data augmentation for language understanding in *****task-oriented dialogue***** system. | ||
| augmenting | 52 | |
| S19-1020 A critical aspect of these semi-supervised learning techniques is ***** augmenting ***** the input or the network with noise to be able to learn robust models. | ||
| 2021.conll-1.38 This commonsense knowledge is rarely communicated explicitly, however, understanding how commonsense knowledge is represented in different paradigms is important for (a) a deeper understanding of human cognition and (b) ***** augmenting ***** automatic reasoning systems. | ||
| S19-2016 We further demonstrate how ***** augmenting ***** data using the baseline systems provides a consistent advantage in all open submission tracks. | ||
| P18-2002 Increasing the capacity of recurrent neural networks (RNN) usually involves ***** augmenting ***** the size of the hidden layer, with significant increase of computational cost. | ||
| 2020.acl-main.85 By ***** augmenting ***** the previous phrase retrieval model (Seo et al., 2019) with Sparc, we show 4%+ improvement in CuratedTREC and SQuAD-Open. | ||
| curation | 52 | |
| W19-2207 We present a portfolio of natural legal language processing and document ***** curation ***** services currently under development in a collaborative European project. | ||
| L12-1492 The generated corpus is called the silver standard corpus since the corpus generation process does not involve any manual ***** curation *****. | ||
| C18-2023 In this paper, we demonstrate a system for the automatic extraction and ***** curation ***** of crime-related information from multi-source digitally published News articles collected over a period of five years. | ||
| L06-1376 Our approach will serve for the support of manual database ***** curation ***** and as a basis for text processing applications. | ||
| 2020.rail-1.1 Finally, we will outline our development process, from digitisation to repository publishing as well as present some of the challenges in data clean-up, the ***** curation ***** of legacy media, multi-lingual support, and site organisation. | ||
| thereby | 52 | |
| 2021.starsem-1.1 However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge, and their predictions often depend on surface linguistic features, such as frequent words, collocations and syntactic patterns, ***** thereby ***** showing sub-optimal generalization abilities. | ||
| 2020.acl-main.156 In particular, they also improve over multilingual Wikipedia-based contextual embeddings (multilingual BERT), which almost always constitutes the previous state of the art, ***** thereby ***** showing that the benefit of a larger, more diverse corpus surpasses the cross-lingual benefit of multilingual embedding architectures. | ||
| 2020.lrec-1.561 Furthermore, the annotation was designed to avoid burdensome requirements related to medical knowledge, ***** thereby ***** enabling corpus development without medical specialists. | ||
| 2021.emnlp-main.304 Our testbed also encompasses self-reported demographic information, including race, sex, age, income, and education - ***** thereby ***** affording opportunities for measuring bias and benchmarking fairness of text classification methods. | ||
| 2021.emnlp-main.3 We generate pseudo-parallel sentence pairs on a monolingual corpus to enable the learning of semantic alignments between different languages, ***** thereby ***** enhancing the semantic modeling of cross-lingual models | ||
| extensible | 52 | |
| 2020.nlposs-1.3 We provide three ***** extensible ***** main components – parser, embedder, and visualizer that can be tailored to suit specific learning setups. | ||
| 2010.iwslt-papers.14 In this paper we present a general and ***** extensible ***** phrase extraction algorithm, where we have highlighted several control points. | ||
| 2020.nlposs-1.14 A particular goal is to ease distribution of reproducible and ***** extensible ***** experiments by making it easy to document and re-run all steps involved, including data loading, pre-processing, model training and evaluation. | ||
| 2021.acl-long.167 We present IrEne, an interpretable and ***** extensible ***** energy prediction system that accurately predicts the inference energy consumption of a wide range of Transformer-based NLP models. | ||
| L06-1246 The platform follows a modular distributed approach, with a specifically designed ***** extensible ***** network protocol handling the communication with the different modules | ||
| PBSMT | 52 | |
| C16-1243 ***** PBSMT ***** engines by default provide four probability scores in phrase tables which are considered as the main set of bilingual features. | ||
| 2011.iwslt-evaluation.4 We use phrase-based statistical machine translation (***** PBSMT *****) models to create the baseline system. | ||
| 2008.jeptalnrecital-court.14 We consider the value of replacing and/or combining string-basedmethods with syntax-based methods for phrase-based statistical machine translation (***** PBSMT *****), and we also consider the relative merits of using constituency-annotated vs. dependency-annotated training data. | ||
| R19-1004 In this paper, we apply a Long Short-Term Memory (LSTM) model over conventional Phrase-Based Statistical MT (***** PBSMT *****) | ||
| 2011.iwslt-papers.10 We present a novel translation quality informed procedure for both extraction and scoring of phrase pairs in *****PBSMT***** systems. | ||
| NE | 52 | |
| W17-5807 MetaMap is used to annotate the category of bio-medical ***** NE *****. | ||
| I17-3009 We showcase TODAY, a semantics-enhanced task-oriented dialogue translation system, whose novelties are: (i) task-oriented named entity (***** NE *****) definition and a hybrid strategy for ***** NE ***** recognition and translation; and (ii) a novel grounded semantic method for dialogue understanding and task-order management. | ||
| W19-5032 Many existing ***** NE ***** methods rely only on network structure, overlooking other information associated with the nodes, e.g., text describing the nodes. | ||
| P17-1158 In experiments, we compare our model with existing ***** NE ***** models on three real-world datasets. | ||
| 2020.semeval-1.294 We use a Linear SVM with document vectors computed from pre-trained word embeddings, and we explore the effectiveness of lexical, part of speech, dependency, and named entity (*****NE*****) features. | ||
| Quantitative | 52 | |
| 2020.lrec-1.87 ***** Quantitative ***** evaluations using 37,995 responsive utterances showed the appropriateness of the proposed classification. | ||
| 2020.textgraphs-1.8 ***** Quantitative ***** evaluation of the embeddings show a competitive performance on POS tagging task when compared to other types of embeddings, and qualitative evaluation reveals interesting facts about the syntactic typology learned by these embeddings. | ||
| D18-1475 ***** Quantitative ***** and qualitative analyses on Chinese-English and English-German translation tasks demonstrate the effectiveness and universality of the proposed approach. | ||
| W18-3507 ***** Quantitative ***** evaluation with jointly trained network, augmented with linguistic features, reports best accuracies for emotion prediction; namely joy, sadness, anger, and neutral emotion in text | ||
| K19-1033 *****Quantitative***** reasoning is a higher-order reasoning skill that any intelligent natural language understanding system can reasonably be expected to handle. | ||
| perceptual | 52 | |
| 2020.conll-1.15 In this study, we propose a broad-coverage unsupervised neural network model to test memory and prediction as sources of signal by which children might acquire language directly from the ***** perceptual ***** stream. | ||
| 2021.cmcl-1.1 We then examine for the same task whether the additional ***** perceptual ***** information in the brain representations can complement the contextual information in the word-embeddings. | ||
| W19-2906 The proposed model collects noisy bottom-up evidence over multiple timesteps, integrates it with its top-down expectation, and makes ***** perceptual ***** decisions, producing processing time data directly without relying on any linking hypothesis. | ||
| L08-1519 A first question of interest addressed in this paper is whether homophone words such as et (and) and est (to be), for which ASR systems rely on language model weights, can be discriminated in a ***** perceptual ***** transcription test with similar n-gram constraints. | ||
| 2020.lrec-1.716 Human semantic knowledge about concepts acquired through *****perceptual***** inputs and daily experiences can be expressed as a bundle of attributes. | ||
| unlabelled | 52 | |
| N19-1223 We show that decomposing the generation process this way leads to state-of-the-art single model performance generating from AMR without additional ***** unlabelled ***** data. | ||
| 2020.acl-main.163 We also show that the model can be trained in a semi-supervised fashion by utilising ***** unlabelled ***** data to boost its performance. | ||
| 2020.coling-main.416 While state-of-the-art models that rely upon massively multilingual pretrained encoders achieve sample efficiency in downstream applications, they still require abundant amounts of ***** unlabelled ***** text. | ||
| 2021.acl-long.191 The diverse paraphrasing is unsupervised as it is applied to ***** unlabelled ***** data, and then fueled to the Prototypical Network training objective as a consistency loss. | ||
| L12-1523 The first model, which we call zoneLDA aims to cluster the sentences into zone classes using only ***** unlabelled ***** data. | ||
| Automated | 52 | |
| 2021.hackashop-1.14 The collected resources were offered to participants of a hackathon organized as part of the EACL Hackashop on News Media Content Analysis and ***** Automated ***** Report Generation in February 2021. | ||
| 2020.acl-main.697 In this theme paper, we focus on ***** Automated ***** Writing Evaluation (AWE), using Ellis Page's seminal 1966 paper to frame the presentation. | ||
| 2020.lrec-1.529 This corpus is part of the PASTEL (Performing ***** Automated ***** Speech Transcription for Enhancing Learning) project aiming to explore the potential of synchronous speech transcription and application in specific teaching situations | ||
| 2021.nuse-1.8 *****Automated***** storytelling has long captured the attention of researchers for the ubiquity of narratives in everyday life. | ||
| W17-1605 *****Automated***** scoring of written and spoken responses is an NLP application that can significantly impact lives, especially when deployed as part of high-stakes tests such as the GRE and the TOEFL. | ||
| Toxic Spans | 52 | |
| 2021.semeval-1.125 For this reason, considerable efforts are made to deal with this, and SemEval-2021 Task 5: ***** Toxic Spans ***** Detection is one of those. | ||
| 2021.semeval-1.117 This paper introduces our system at SemEval-2021 Task 5: ***** Toxic Spans ***** Detection. | ||
| 2021.semeval-1.133 This paper presents a system used for SemEval-2021 Task 5: ***** Toxic Spans ***** Detection. | ||
| 2021.semeval-1.123 This paper describes our contribution to SemEval-2021 Task 5: *****Toxic Spans***** Detection. | ||
| 2021.semeval-1.134 This paper describes the participation of the SINAI team at Task 5: *****Toxic Spans***** Detection, which consists of identifying spans that make a text toxic. | ||
| hypothesis | 52 | |
| 2020.calcs-1.2 Natural Language Inference (NLI) is the task of inferring the logical relationship, typically entailment or contradiction, between a premise and ***** hypothesis *****. | ||
| N18-1132 Existing approaches mostly rely on simple reading mechanisms for independent encoding of the premise and ***** hypothesis *****. | ||
| C18-1311 This stability analysis affirms the heterogeneous/homogeneous document category ***** hypothesis ***** first presented in Simonson and Davis (2016), whose technique is problematically limited. | ||
| W17-5051 We present a new longitudinal L1 learner corpus for German (handwritten texts collected in grade 2–4), which is transcribed and annotated with a target ***** hypothesis ***** that strictly only corrects orthographic errors, and is thereby tailored to research and tool development for orthographic issues in primary school. | ||
| L12-1230 We propose an abstract model of objective quantitative evaluation based on rough sets, as well as the notion of potential performance space for describing the performance variations corresponding to the ambiguity present in ***** hypothesis ***** data produced by a computer program, when comparing it to the reference data created by humans | ||
| submission | 52 | |
| 2021.semeval-1.61 In this work, we describe our system ***** submission ***** to the SemEval 2021 Task 11: NLP Contribution Graph Challenge. | ||
| W18-0917 Our results vary between test sets: Neural CRF standalone is the best one on ***** submission ***** data, while combined system scores the highest on a test subset randomly selected from training data. | ||
| S19-2046 In this paper, we present our system ***** submission ***** for the EmoContext, the third task of the SemEval 2019 workshop. | ||
| S19-2197 In the present paper we describe the UPV-28-UNITO system's ***** submission ***** to the RumorEval 2019 shared task. | ||
| W19-3018 This paper describes our system ***** submission ***** for the CLPsych 2019 shared task B on suicide risk assessment | ||
| authorship | 52 | |
| W19-8628 Curiously, in the field of stylometry, content does not figure prominently in practical methods of discriminating stylistic elements, such as ***** authorship ***** and genre. | ||
| C18-1234 In this paper we demonstrate an effective template-based approach for combining various syntactic features of a document for ***** authorship ***** analysis. | ||
| L10-1340 Although the automatic ***** authorship ***** classification imposes a number of limitations on the dataset for further experiments, after overcoming these issues the ***** authorship ***** attribution technique modeling the personalized approach confirms the increase over the baseline with no ***** authorship ***** information used. | ||
| W17-4913 In ***** authorship ***** attribution, many different approaches have successfully resolved this issue at the cost of linguistic interpretability: The resulting algorithms may be able to distinguish one language variety from the other, but do not give us much information on their distinctive linguistic properties. | ||
| W16-5115 In cases where such information is not available, identifying the ***** authorship ***** of publications becomes very challenging. | ||
| Wikipedia | 52 | |
| P19-2044 Our experiments confirm that graph embeddings trained on a graph of hyperlinks between ***** Wikipedia ***** articles improve the performances of simple feed-forward neural ED model and a state-of-the-art neural ED system. | ||
| P19-1094 We focus on relations between ***** Wikipedia ***** concepts, and show that they differ from well-studied lexical-semantic relations such as hypernyms, hyponyms and antonyms. | ||
| 2021.gebnlp-1.9 We provide insights on how such asymmetries can influence other ***** Wikipedia ***** components and propose steps towards reducing the frequency of observed patterns. | ||
| W18-6111 We extend the automatic error annotation tool ERRANT (Bryant et al., 2017) for German and use it to analyze both gold GEC corrections and ***** Wikipedia ***** edits (Grundkiewicz and Junczys-Dowmunt, 2014) in order to select as additional training data ***** Wikipedia ***** edits containing grammatical corrections similar to those in the gold corpus. | ||
| Q16-1011 Key among them is that most knowledge bases do not contain the rich textual and structural information ***** Wikipedia ***** does; consequently, the main supervision signal used to train Wikification rankers does not exist anymore | ||
| intent | 52 | |
| D19-6101 We formulate it as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new ***** intent *****. | ||
| 2020.emnlp-main.535 We propose a unified, readily scalable neural approach which reconciles all subtasks like ***** intent ***** prediction and knowledge retrieval. | ||
| 2020.acl-main.99 Since user ***** intent ***** may frequently change over time in many realistic scenarios, unknown (new) ***** intent ***** detection has become an essential problem, where the study has just begun. | ||
| D18-1417 The results show that our model achieves state-of-the-art and outperforms other popular methods by a large margin in terms of both ***** intent ***** detection error rate and slot filling F1-score. | ||
| 2021.ranlp-1.127 Unknown or new ***** intent ***** detection is a critical task, as in a realistic scenario a user ***** intent ***** may frequently change over time and divert even to an ***** intent ***** previously not encountered | ||
| polish | 52 | |
| L12-1227 Then we focus on some post-processing steps to ***** polish ***** the harvested records. | ||
| 2021.acl-long.473 The relation graph and the document representation are interacted and ***** polish *****ed iteratively, complementing each other in the training process. | ||
| 2014.amta-researchers.9 We introduce two document-level features to ***** polish ***** baseline sentence-level translations generated by a state-of-the-art statistical machine translation (SMT) system. | ||
| D18-1442 To address this issue we introduce a model which iteratively ***** polish *****es the document representation on many passes through the document. | ||
| D18-1048 In this paper, we propose a novel architecture called adaptive multi-pass decoder, which introduces a flexible multi-pass ***** polish *****ing mechanism to extend the capacity of NMT via reinforcement learning. | ||
| results | 52 | |
| 2021.wnut-1.39 Our ***** results ***** reveal that using Self-Critical Sequence Training to optimize CIDEr-R generates descriptive captions. | ||
| P19-1624 Experimental ***** results ***** on the WMT14 English-German and English-French benchmarks show that our model consistently improves performance over the strong Transformer model, demonstrating the necessity and effectiveness of exploiting sentential context for NMT. | ||
| D19-1566 Experimental ***** results ***** suggest the efficacy of the proposed model for both sentiment and emotion analysis over various existing state-of-the-art systems. | ||
| 2021.wmt-1.89 Our submissions (Tencent AI Lab Machine Translation, TMT) in German/French/Spanish⇒English are ranked 1st respectively according to the official evaluation ***** results ***** in terms of BLEU scores. | ||
| D17-1310 Experiment ***** results ***** over two benchmark datasets demonstrate the effectiveness of our framework. | ||
| story | 52 | |
| D19-5809 To properly generate a question coherent to the grounding text and the current conversation hi*****story*****, the proposed framework first locates the focus of a question in the text passage, and then identifies the question pattern that leads the sequential generation of the words in a question. | ||
| 2021.sigdial-1.30 We hypothesize that a multi-task model that trains on character dialogue plus character relationship information improves transformer-based ***** story ***** continuation. | ||
| L10-1138 In this paper, we report on a study that was performed within the Semantics of Hi*****story***** project on how descriptions of historical events are realized in different types of text and what the implications are for modeling the event information. | ||
| 2020.rail-1.1 The ǂKhomani San, Hugh Brody Collection features the voices and hi*****story***** of indigenous hunter gatherer descendants in three endangered languages namely, N|uu, Kora and Khoekhoe as well as a regional dialect of Afrikaans. | ||
| W19-9006 Focus being not just on foreign language tuition, but above all on people, places and events in the hi*****story***** and culture of the EU member states, the annotation modules of the e-Platform have been accordingly extended. | ||
| response | 52 | |
| 2021.nlp4convai-1.23 Humans make appropriate ***** response *****s not only based on previous dialogue utterances but also on implicit background knowledge such as common sense. | ||
| W17-5503 We test state of the art dialogue systems for their behaviour in ***** response ***** to user-initiated sub-dialogues, i.e. | ||
| 2020.emnlp-main.190 We evaluate models on their generalizability to out-of-domain examples, ***** response *****s to missing or incorrect data, and ability to handle question variations. | ||
| P18-1103 Human generates ***** response *****s relying on semantic and functional dependencies, including coreference relation, among dialogue elements and their context. | ||
| 2021.sigdial-1.49 But, the effects of minimizing an alternate training objective that fosters a model to generate alternate ***** response ***** and score it on semantic similarity has not been well studied. | ||
| historical | 52 | |
| 2021.latechclfl-1.8 We have evaluated multiple traditional machine learning approaches as well as transformer-based models pretrained on ***** historical ***** and contemporary language for a single-label text sequence emotion classification for the different emotion categories. | ||
| L10-1138 In this paper, we report on a study that was performed within the Semantics of History project on how descriptions of ***** historical ***** events are realized in different types of text and what the implications are for modeling the event information. | ||
| L14-1624 To this end it has been equipped with the largest diachronic lexicon and a ***** historical ***** name list developed at the Institute for Dutch Lexicology or INL. | ||
| 2021.emnlp-main.136 However, merely learning the knowledge from the ***** historical ***** tasks, adopted by current meta-learning algorithms, may not generalize well to testing tasks when they are not well-supported by training tasks. | ||
| E17-4002 texts from ***** historical ***** languages that did not develop to a standard variety. | ||
| subword information | 52 | |
| W18-3011 To improve word embedding, ***** subword information ***** has been widely employed in state-of-the-art methods. | ||
| N19-1278 Experiments on standard benchmark show that ***** subword information ***** brings significant gains over strong character-based segmentation models. | ||
| 2020.sltu-1.13 Our results show that our method that leverages ***** subword information ***** outperforms the model without ***** subword information *****, both in intrinsic and extrinsic evaluations of the learned embeddings. | ||
| K19-1021 Recent work has validated the importance of ***** subword information ***** for word representation learning. | ||
| D17-1298 In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode ***** subword information ***** via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. | ||
| Machine Translation (MT | 52 | |
| 2005.mtsummit-papers.7 The main objective of our project is to extract clinical information from thoracic radiology reports in Portuguese using *****Machine Translation (MT*****) and cross language information retrieval techniques. | ||
| 2010.jec-1.5 Although *****Machine Translation (MT*****) has been attracting more and more attention from the translation industry, the quality of current MT systems still requires humans to post-edit translations to ensure their quality. | ||
| L10-1622 Identification of transliterations is aimed at enriching multilingual lexicons and improving performance in various Natural Language Processing (NLP) applications including Cross Language Information Retrieval (CLIR) and *****Machine Translation (MT*****). | ||
| 2012.amta-commercial.12 *****Machine Translation (MT*****) is said to be the next lingua franca. | ||
| 2020.framenet-1.7 Large coverage lexical resources that bear deep linguistic information have always been considered useful for many natural language processing (NLP) applications including *****Machine Translation (MT*****). | ||
| generalized | 51 | |
| L14-1422 This paper reconsiders and enhances the current and ***** generalized ***** representation of annotations. | ||
| Q16-1038 The model abstracts over the specific entities appearing in the articles, grouping them into ***** generalized ***** categories, thus allowing the model to adapt to previously unseen situations. | ||
| 1984.bcs-1.22 For the neural systems (underlying the memory functions of the brain) recent advancements in ***** generalized ***** quantum theoretical methods provide some bases. | ||
| 2020.findings-emnlp.212 We further ***** generalized ***** our model to unsupervised text style transfer task, and achieved significant improvements on two benchmark sentiment style transfer datasets. | ||
| I17-1055 We first apply two state-of-the-art lightly-supervised classification models, ***** generalized ***** expectation (GE) criteria (Druck et al., 2008) and multinomial naive Bayes (MNB) with priors (Settles, 2011) to one-class classification where the user only needs to provide a small list of labeled words for the target class | ||
| predictor | 51 | |
| L06-1277 Moreover, with a lexical type ***** predictor ***** based on a maximum entropy model, new lexical entries are automatically generated. | ||
| 2021.cmcl-1.21 Our results for integration cost corroborate those of Demberg and Keller (2008), finding that it is a negative ***** predictor ***** of reading times overall and a strong positive ***** predictor ***** for nouns, but contrast with their observations for surprisal, finding strong evidence for lexicalized surprisal as a ***** predictor ***** of reading times. | ||
| L12-1007 A 4-class emotion transition ***** predictor *****, a 2-class writer emotion ***** predictor *****, and a 2-class reader emotion ***** predictor ***** are proposed and compared. | ||
| 2020.acl-main.604 RikiNet contains a dynamic paragraph dual-attention reader and a multi-level cascaded answer ***** predictor ***** | ||
| L12-1073 We analyzed tweet messages crawled during the eight weeks leading to the UK General Election in May 2010 and found that activities at Twitter is not necessarily a good *****predictor***** of popularity of political parties. | ||
| bitext | 51 | |
| L14-1170 Guampa enables volunteers and students to work together to translate documents into heritage languages, both to make more materials available in those languages, and also to generate ***** bitext ***** suitable for training machine translation systems. | ||
| 2020.wmt-1.136 Our primary submission is a subword-level Transformer-based neural machine translation model trained on original training ***** bitext *****. | ||
| 2020.wmt-1.8 We explore techniques that leverage ***** bitext ***** and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation, and reranking. | ||
| N18-1123 This effectively injects semantic and/or syntactic knowledge into the translation model, which would otherwise require a large amount of training ***** bitext ***** to learn from. | ||
| 2000.amta-papers.12 Departing from an annotated ***** bitext ***** we show how SGML markup can be recycled to produce complementary language resources | ||
| concatenated | 51 | |
| C18-1149 In this paper, we develop a multi-attention-based neural network (MANN) with well-designed optimizations, like Highway Network, and ***** concatenated ***** features with embedding representations into the hierarchical neural network model. | ||
| 2020.insights-1.18 However, we find that the most effective representations overall are learned by simply training with a skip-gram objective over the ***** concatenated ***** text of all entries in the dictionary, giving no particular focus to the structure of the entries. | ||
| 2021.emnlp-main.621 Besides, designing the ***** concatenated ***** actions is laborious to engineers and maybe struggled with edge cases. | ||
| 2021.ranlp-1.77 In an experiment using the livedoor news corpus, which is Japanese, we compared the accuracy of document classification using two methods for selecting documents to be ***** concatenated ***** with that of ordinary document classification. | ||
| W18-6448 To create our training data, we ***** concatenated ***** several parallel corpora, both from in-domain and out-of-domain sources, as well as terminological resources from UMLS | ||
| concreteness | 51 | |
| P18-1239 To improve image-based translation, we introduce a novel method of predicting word ***** concreteness ***** from images, which improves on a previous state-of-the-art unsupervised technique. | ||
| 2020.semeval-1.76 We also explore the utility of external resources that aim to supplement the world knowledge inherent in such language models, including commonsense knowledge graph embedding models, word ***** concreteness ***** ratings, and text-to-image generation models. | ||
| N18-1199 We give an algorithm for automatically computing the visual ***** concreteness ***** of words and topics within multimodal datasets. | ||
| S18-2004 Moreover, our NWS scores positively correlate with psycholinguistic measures such as ***** concreteness *****, and imageability implying a close connection to the salience as perceived by humans | ||
| I17-2018 We present and take advantage of the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute *****concreteness***** scores for words. | ||
| MLM | 51 | |
| 2020.coling-main.327 In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation for both language and knowledge with the extended ***** MLM ***** objective. | ||
| 2021.emnlp-main.158 Then, we use a revised masked language model (***** MLM *****) to evaluate the quality of the segmentation results based on the predictions of the ***** MLM *****. | ||
| D19-1448 We chose the Transformers for our analysis as they have been shown effective with various tasks, including machine translation (MT), standard left-to-right language models (LM) and masked language modeling (***** MLM *****). | ||
| 2021.semeval-1.21 We use encoders of transformers-based models pretrained on the ***** MLM ***** task to build our Fill-in-the-blank (FitB) models. | ||
| 2021.emnlp-main.249 In this paper, we explore five simple pretraining objectives based on token-level classification tasks as replacements of ***** MLM ***** | ||
| usability | 51 | |
| 2020.lrec-1.825 We additionally train our model on the SwissText dataset to demonstrate ***** usability ***** on German. | ||
| 2020.emnlp-main.115 Finally, we outline a protocol to evaluate model ***** usability ***** in a clinical decision support context. | ||
| 2021.acl-demo.36 However, existing Text-to-SQL semantic parsers cannot achieve high enough accuracy in the cross-database setting to allow good ***** usability ***** in practice. | ||
| L06-1008 This paper presents the results of the ***** usability ***** evaluations that were conducted within TransType2, an international R&D project the goal of which was to develop a novel approach to interactive machine translation. | ||
| 2001.mtsummit-eval.12 This is a comparative ***** usability *****- and adequacy-oriented evaluation in that it attempts to help such organisations decide which system produces the most adequate output for certain well-defined user types | ||
| handcrafted | 51 | |
| 2020.coling-main.535 Furthermore, hybrid methods that integrate ***** handcrafted ***** features in a DNN-AES model have been recently developed and have achieved state-of-the-art accuracy. | ||
| 2021.acl-short.3 We introduce a lightweight end-to-end coreference model that removes the dependency on span representations, ***** handcrafted ***** features, and heuristics. | ||
| P19-1141 Many existing approaches to incorporating gazetteers into machine learning based NER systems rely on manually defined selection strategies or ***** handcrafted ***** templates, which may not always lead to optimal effectiveness, especially when multiple gazetteers are involved. | ||
| 2020.acl-main.319 Instead of collecting and analyzing bad cases using limited ***** handcrafted ***** error features, here we investigate this issue by generating adversarial examples via a new paradigm based on reinforcement learning. | ||
| P18-1149 Prior document dating systems have largely relied on ***** handcrafted ***** features while ignoring such document-internal structures | ||
| plWordNet | 51 | |
| L12-1555 It was built on the basis of machine learning in a way following the bootstrapping approach: a limited set of derivational pairs described manually by linguists in ***** plWordNet ***** is used to train \emphDerivator. | ||
| 2018.gwc-1.24 The paper presents a feature-based model of equivalence targeted at (manual) sense linking between Princeton WordNet and ***** plWordNet *****. | ||
| 2019.gwc-1.45 The most significant developments since 3.0 version include new relations for nouns and verbs, mapping semantic role-relations from the valency lexicon Walenty onto the ***** plWordNet ***** structure and sense-level inter-lingual mapping. | ||
| 2016.gwc-1.14 In this paper, we present methods of extraction of multi-word lexical units (MWLUs) from large text corpora and their description in *****plWordNet***** 3.0. | ||
| C16-1213 We have released *****plWordNet***** 3.0, a very large wordnet for Polish. | ||
| colloquial | 51 | |
| W16-4315 We use parallel corpora of movie subtitles as a proxy for ***** colloquial ***** language in social media channels and a multilingual emotion lexicon for fine-grained sentiment analyses. | ||
| W16-3920 The main challenge that we aim to tackle in our participation is the short, noisy and ***** colloquial ***** nature of tweets, which makes named entity recognition in Twitter message a challenging task. | ||
| W18-1208 Our customers use ***** colloquial ***** language, non-standard acronyms and sometimes mis-spell words when they use our Search portal or interact over other channels. | ||
| W17-2504 With the advent of informal electronic communications such as social media, ***** colloquial ***** languages that were historically unwritten are being written for the first time in heavily code-switched environments. | ||
| 2020.lrec-1.328 We present in this paper our work on Algerian language, an under-resourced North African ***** colloquial ***** Arabic variety, for which we built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments | ||
| sparse | 51 | |
| 2021.blackboxnlp-1.12 We construct these embeddings through ***** sparse ***** coding, where each vector in the basis set is itself a word embedding. | ||
| 2021.acl-short.77 We show theoretically and empirically that the performance for dense representations decreases quicker than ***** sparse ***** representations for increasing index sizes. | ||
| W18-5422 Previous research on word embeddings has shown that ***** sparse ***** representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased interpretability properties: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data. | ||
| W19-0603 We show that standard evaluation measures do not take into account the semantic richness of a description, and give the impression that ***** sparse ***** machine descriptions outperform rich human descriptions. | ||
| N18-4007 This research proposal describes two algorithms that are aimed at learning word embeddings for data ***** sparse ***** and sentiment rich data sets | ||
| syntactic dependencies | 51 | |
| 2020.udw-1.20 In this paper, we introduce the first Universal Dependencies (UD) treebank for standard Albanian, consisting of 60 sentences collected from the Albanian Wikipedia, annotated with lemmas, universal part-of-speech tags, morphological features and ***** syntactic dependencies *****. | ||
| W17-1901 Word vectors are compositionally combined by ***** syntactic dependencies *****. | ||
| L10-1256 We also use a pattern knowledge base over the ***** syntactic dependencies ***** to extract flat predicative logical representations. | ||
| L06-1326 MW expressions considered in the database include named entities and lexical associations with different degrees of cohesion, ranging from frozen groups, which undergo little or no variation, to lexical collocations composed of words that tend to occur together and that constitute ***** syntactic dependencies *****, although with a low degree of fixedness | ||
| Q15-1038 Given a large corpus of definitions we leverage ***** syntactic dependencies ***** to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. | ||
| Visual | 51 | |
| P19-1351 Hence, we propose a combined ***** Visual ***** and Textual Question Answering (VTQA) model which takes as input a paragraph caption as well as the corresponding image, and answers the given question based on both inputs. | ||
| 2021.emnlp-main.390 *****Visual***** Dialog is assumed to require the dialog history to generate correct responses during a dialog. | ||
| W19-1805 *****Visual***** storytelling is an intriguing and complex task that only recently entered the research arena. | ||
| D18-1118 *****Visual***** reasoning is a special visual question answering problem that is multi-step and compositional by nature, and also requires intensive text-vision interactions. | ||
| 2021.eacl-main.290 *****Visual***** dialog is a vision-language task where an agent needs to answer a series of questions grounded in an image based on the understanding of the dialog history and the image. | ||
| recognizing textual entailment | 51 | |
| L10-1469 Many natural language processing tasks, including information extraction, question answering and ***** recognizing textual entailment *****, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-argument structure analysis. | ||
| D19-1340 Here, we investigate the importance that a model assigns to various aspects of data while learning and making predictions, specifically, in a ***** recognizing textual entailment ***** (RTE) task. | ||
| 2014.lilt-9.3 From a purely theoretical point of view, it makes sense to approach ***** recognizing textual entailment ***** (RTE) with the help of logic. | ||
| N18-1101 ***** recognizing textual entailment *****), improving upon available resources in both its coverage and difficulty. | ||
| Q17-1027 We propose an evaluation of automated common-sense inference based on an extension of ***** recognizing textual entailment *****: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. | ||
| literature | 51 | |
| W19-2912 Inspired by the ***** literature ***** on multisensory integration, we develop a computational model to ground quantifiers in perception. | ||
| 2020.acl-main.22 Our work, inspired by pre-ordering ***** literature ***** in machine translation, uses syntactic transformations to softly “reorder” the source sentence and guide our neural paraphrasing model. | ||
| 2020.lrec-1.468 This raises the question whether such systems are able to produce high-quality translations for more creative text types such as ***** literature ***** and whether they are able to generate coherent translations on document level. | ||
| D19-6203 Unfortunately, the models in the ***** literature ***** tend to employ different strategies to perform pooling for RE, leading to the challenge to determine the best pooling mechanism for this problem, especially in the biomedical domain. | ||
| 2021.sigmorphon-1.20 Suggested by previous ***** literature *****, this class of languages should approach the characterization of natural language word sets. | ||
| unsupervised learning | 51 | |
| W19-2511 We have created two sets of labels for Hafez (1315-1390) poems, using ***** unsupervised learning *****. | ||
| S17-2110 We propose two Arabic sentiment classification models implemented using supervised and ***** unsupervised learning ***** strategies. | ||
| 2020.fnp-1.28 First, we identify text blocks as candidates for titles using ***** unsupervised learning ***** based on character-level information of each document. | ||
| W19-4321 Therefore, we explore and evaluate several sub-word unit based embedding strategies – character n-grams, lemmatization provided by an NLP-pipeline, and segments obtained in ***** unsupervised learning ***** (morfessor) – to boost semantic consistency in Hungarian word vectors. | ||
| W18-6428 These systems were used to participate in the WMT18 news translation shared task and more specifically, for the ***** unsupervised learning ***** sub-track. | ||
| bilingual lexicon | 51 | |
| L16-1524 Based on the assumption, we propose a constraint-based ***** bilingual lexicon ***** induction for closely related languages by extending constraints and translation pair candidates from recent pivot language approach. | ||
| P18-1075 Bilingual tasks, such as ***** bilingual lexicon ***** induction and cross-lingual classification, are crucial for overcoming data sparsity in the target language. | ||
| L12-1400 We illustrate the use of such a trilingual resource for automatic induction of ***** bilingual lexicon *****s, which is a real challenge for under-represented languages. | ||
| 1998.amta-papers.1 We present two problems for statistically extracting ***** bilingual lexicon *****: (1) How can noisy parallel corpora be used? | ||
| 2001.mtsummit-papers.36 In this paper, we present a way to integrate ***** bilingual lexicon *****s into an operational probabilistic translation assistant (TransType). | ||
| hierarchical attention network | 51 | |
| 2021.acl-srw.9 For spatial features, we propose a ***** hierarchical attention network ***** to model the spatial information from object-level to video-level. | ||
| S18-1042 Our system consists of three main modules: preprocessing module, stacking module to solve the intensity prediction of emotion and sentiment, LSTM network module to solve multi-label classification, and the ***** hierarchical attention network ***** module for solving emotion and sentiment classification problem. | ||
| I17-1102 To this end, we propose multilingual ***** hierarchical attention network *****s for learning document structures, with shared encoders and/or shared attention mechanisms across languages, using multi-task learning and an aligned semantic space as input. | ||
| 2020.findings-emnlp.220 Our proposed pipeline includes (1) a combined event extraction method that utilizes Open Information Extraction and neural co-reference resolution, (2) a BERT/ALBERT enhanced representation of events, and (3) an extended ***** hierarchical attention network ***** that includes attentions on event, news and temporal levels. | ||
| 2020.autosimtrans-1.5 Our encoder is based on a ***** hierarchical attention network ***** (HAN) (Miculicich et al., 2018). | ||
| offensive tweet | 51 | |
| 2020.semeval-1.248 Then, for *****offensive tweets*****, sub-task B requires determining whether the toxicity is targeted. | ||
| S19-2097 *****Offensive tweets***** have to be identified, captured and processed further, for a variety of reasons, which include i) identifying offensive tweets in order to prevent violent/abusive behavior in Twitter (or any social media for that matter), ii) creating and maintaining a history of offensive tweets for individual users (would be helpful in creating meta-data for user profile), iii) inferring the sentiment of the users on particular event/issue/topic. | ||
| S19-2123 This paper examines different approaches and models towards *****offensive tweet***** classification which were used as a part of the OffensEval 2019 competition. | ||
| W18-3504 The paper focuses on the classification of *****offensive tweets***** written in Hinglish language, which is a portmanteau of the Indic language Hindi with the Roman script. | ||
| 2020.semeval-1.281 I present the system based on the architecture of bidirectional long short-term memory networks (BiLSTM) concatenated with lexicon-based features and a social-network specific feature and then followed by two fully connected dense layers for detecting Turkish *****offensive tweets*****. | ||
| natural language processing (NLP) | 51 | |
| 2016.gwc-1.38 Wordnets play an important role not only in linguistics but also in *****natural language processing (NLP*****). | ||
| 2020.emnlp-main.318 Word-level information is important in *****natural language processing (NLP*****), especially for the Chinese language due to its high linguistic complexity. | ||
| 2020.emnlp-main.155 Machine learning techniques have been widely used in *****natural language processing (NLP*****). | ||
| 2021.naacl-main.213 Source code processing heavily relies on the methods widely used in *****natural language processing (NLP*****), but involves specifics that need to be taken into account to achieve higher quality. | ||
| 2021.teachingnlp-1.16 Introducing biomedical informatics (BMI) students to *****natural language processing (NLP*****) requires balancing technical depth with practical know-how to address application-focused needs. | ||
| adaptive | 50 | |
| 2020.starsem-1.1 To remedy this, we devise a knowledge ***** adaptive ***** approach for medical NLI that encodes the premise/hypothesis texts by leveraging supplementary external knowledge, alongside the UMLS, based on the word contexts. | ||
| L16-1500 The project aims to develop a dialogue system with flexible dialogue management to enable the system's ***** adaptive *****, reactive, interactive and proactive dialogue behavior in setting goals, choosing appropriate strategies and monitoring numerous parallel interpretation and management processes. | ||
| L08-1235 Our approach is based on the ***** adaptive ***** selection of candidate interactions sentences, which are then parsed using our own dependency parser. | ||
| 2020.acl-srw.1 In this work, we extend ***** adaptive ***** approaches to learn more about model interpretability and computational efficiency. | ||
| P18-1095 In this paper, we propose ***** adaptive ***** scaling, an algorithm which can handle the positive sparsity problem and directly optimize over F-measure via dynamic cost-sensitive learning | ||
| RDF | 50 | |
| L12-1427 The resources are annotated for macro-area, content language, and document type and are available in XHTML and ***** RDF *****. | ||
| 2020.lrec-1.401 We describe the mapping and harmonization of the underlying data structures into a unified representation, its serialization in ***** RDF ***** and TSV, and the release of a massive and coherent amount of lexical data under open licenses. | ||
| L16-1143 As a result, there are hardly any language resources of morphemic data available in ***** RDF ***** to date. | ||
| 2020.lrec-1.889 Unfortunately, generic ontology and ***** RDF ***** editors were considered inconvenient to use with OntoLex-Lemon because of its complex design patterns and other peculiarities, including indirection, reification and subtle integrity constraints. | ||
| 2016.gwc-1.44 Our wordnet is distributed in Resource Description Framework (***** RDF *****) and we want to guarantee not only the syntax correctness but also its semantic soundness | ||
| specificity | 50 | |
| L08-1352 The results show that even if each resource has a useful ***** specificity *****, the global recall is low. | ||
| W17-5006 We propose several methods and feature sets capable of outperforming the state of the art in ***** specificity ***** prediction. | ||
| 2021.gwc-1.13 The data show that for each semantic relation an affix prevails in creating new words, although we cannot talk about their ***** specificity ***** with respect to such a relation. | ||
| D19-1182 We propose deep ordinal regression approaches for ***** specificity ***** prediction, under both supervised and semi-supervised settings, and provide empirical results demonstrating the effectiveness of the proposed techniques over several baseline approaches. | ||
| L16-1620 We introduce improved guidelines for annotation of sentence ***** specificity *****, addressing the issues encountered in prior work | ||
| bridging | 50 | |
| 2021.emnlp-main.494 Different types of reasoning are simulated, including intersecting multiple pieces of evidence, ***** bridging ***** from one piece of evidence to another, and detecting unanswerable cases. | ||
| L14-1427 The paper includes also a frame-semantic parsing use-case for extracting structured information from unstructured newswire texts, sometimes referred to as ***** bridging ***** of the semantic gap. | ||
| P18-1164 We experiment with three strategies: (1) a source-side ***** bridging ***** model, where source word embeddings are moved one step closer to the output target sequence; (2) a target-side ***** bridging ***** model, which explores the more relevant source word embeddings for the prediction of the target sequence; and (3) a direct ***** bridging ***** model, which directly connects source and target word embeddings seeking to minimize errors in the translation of ones by the others. | ||
| 2021.codi-sharedtask.2 We describe the systems that we developed for the three tracks of the CODI-CRAC 2021 shared task, namely entity coreference resolution, ***** bridging ***** resolution, and discourse deixis resolution. | ||
| 2021.codi-sharedtask.8 The CODI-CRAC 2021 shared task is the first shared task that focuses exclusively on anaphora resolution in dialogue and provides three tracks, namely entity coreference resolution, ***** bridging ***** resolution, and discourse deixis resolution | ||
| language learners | 50 | |
| P19-3034 We introduce a system aimed at improving and expanding second ***** language learners *****' English vocabulary. | ||
| 2020.lrec-1.34 Accordingly, we also report on on-going proof-of-concept efforts aiming at developing the first prototypical implementation of the approach in order to correct and extend an LR called ConceptNet based on the input crowdsourced from ***** language learners *****. | ||
| 2020.emnlp-main.312 Many English-as-a-second ***** language learners ***** have trouble using near-synonym words (e.g., small vs. little; briefly vs. shortly) correctly, and often look for example sentences to learn how two nearly synonymous terms differ. | ||
| W19-4407 To address this, we present and release an annotated data set of 6,121 spelling errors in context, based on a corpus of essays written by English ***** language learners *****. | ||
| 2020.inlg-1.31 Japanese sentence-ending predicates intricately combine content words and functional elements, such as aspect, modality, and honorifics; this can often hinder the understanding of ***** language learners ***** and children. | ||
| articles | 50 | |
| P17-1028 We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news ***** articles ***** and Information Extraction from biomedical abstracts. | ||
| 2020.lrec-1.641 The texts come from different sources: daily newspaper ***** articles *****, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-standard language segments typed into a web translator. | ||
| L16-1453 In the automatic alignments of parallel corpora, most of the p*****articles***** align to NULL. | ||
| D19-1664 We first produce a new dataset, BASIL, of 300 news ***** articles ***** annotated with 1,727 bias spans and find evidence that informational bias appears in news ***** articles ***** more frequently than lexical bias. | ||
| 2021.bionlp-1.16 BioELECTRA pretrained on PubMed and PMC full text ***** articles ***** performs very well on Clinical datasets as well. | ||
| sentence compression | 50 | |
| 2020.acl-srw.13 To overcome this limitation, we present a novel unsupervised deep learning framework (SCAR) for deletion-based ***** sentence compression *****. | ||
| P19-1609 We conduct experiments across various seq2seq text generation tasks including machine translation, formality style transfer, ***** sentence compression ***** and simplification. | ||
| K18-1040 In ***** sentence compression *****, the task of shortening sentences while retaining the original meaning, models tend to be trained on large corpora containing pairs of verbose and compressed sentences. | ||
| L10-1626 This paper presents two corpora produced within the RPM2 project: a multi-document summarization corpus and a *****sentence compression***** corpus. | ||
| D18-1267 In this paper we advocate the use of bilingual corpora which are abundantly available for training *****sentence compression***** models. | ||
| chinese spelling check | 50 | |
| 2021.acl-long.464 *****Chinese Spelling Check***** (CSC) is a challenging task due to the complex characteristics of Chinese characters. | ||
| 2021.emnlp-main.281 In our experiments on the SIGHAN 2015 *****Chinese spelling check***** task, we show that SSCL is superior to previous norm-based and uncertainty-aware approaches, and establish a new state of the art (74.38% F1). | ||
| 2020.acl-main.81 *****Chinese Spelling Check***** (CSC) is a task to detect and correct spelling errors in Chinese natural language. | ||
| 2020.findings-emnlp.184 *****Chinese spelling check***** is a challenging task due to the characteristics of the Chinese language, such as the large character set, no word boundary, and short word length. | ||
| 2021.emnlp-main.287 *****Chinese Spelling Check***** (CSC) is to detect and correct Chinese spelling errors. | ||
| world | 50 | |
| 2021.acl-long.320 Open pit mines left many regions *****world*****wide inhospitable or uninhabitable. | ||
| L12-1351 In addition to being a rich source of language, direct quotations from business leaders can have “real ***** world *****” consequences. | ||
| 2021.emnlp-main.292 We avoid crucial assumptions of previous work that do not transfer well to real-***** world ***** settings, including exploiting knowledge of the fixed number of retrieval steps required to answer each question or using structured metadata like knowledge bases or web links that have limited availability. | ||
| 2020.codi-1.1 With their huge speaking populations in the ***** world *****, Spanish and Chinese occupy important positions in linguistic studies. | ||
| R17-1079 It can successfully be applied to a different real-***** world ***** dataset without requiring additional modifications. | ||
| based | 50 | |
| 2010.amta-papers.6 In this paper, we present the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a standard phrase-***** based ***** SMT system. | ||
| 2021.acl-demo.41 To guarantee acceptability, all the text transformations are linguistically ***** based ***** and all the transformed data selected (up to 100,000 texts) scored highly under human evaluation. | ||
| I17-4035 In this paper, we propose the use of an attention-***** based ***** LSTM (AT-LSTM) model for these tasks. | ||
| L14-1009 Our formalization is ***** based ***** on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for subjective information extraction. | ||
| 2020.tacl-1.15 We use Viterbi EM with a margin-based criterion to train a span-*****based***** discourse parser in an unsupervised manner. | ||
| simultaneous machine translation | 50 | |
| 2021.wmt-1.119 Recent work in ***** simultaneous machine translation ***** is often trained with conventional full sentence translation corpora, leading to either excessive latency or necessity to anticipate as-yet-unarrived words, when dealing with a language pair whose word orders significantly differ. | ||
| 2021.emnlp-main.536 We propose a generative framework for ***** simultaneous machine translation *****. | ||
| 2021.eacl-main.281 This paper addresses the problem of ***** simultaneous machine translation ***** (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced. | ||
| W19-3648 We describe work in progress for evaluating performance of sequence-to-sequence neural networks on the task of syntax-based reordering for rules applicable to ***** simultaneous machine translation *****. | ||
| 2020.iwslt-1.29 In *****simultaneous machine translation*****, the objective is to determine when to produce a partial translation given a continuous stream of source words, with a trade-off between latency and quality. | ||
| open multilingual wordnet | 50 | |
| 2019.gwc-1.49 Wordnets can be built for any language in GeoNames, we give results for those wordnets in the *****Open Multilingual Wordnet*****. | ||
| W17-7908 Then we connect it to the *****Open Multilingual WordNet***** (OMW) through two attempts, and use confidence scores to quantify accuracy. | ||
| 2015.jeptalnrecital-invite.1 I will start by presenting BabelNet 3.0, available at http://babelnet.org, a very large multilingual encyclopedic dictionary and semantic network, which covers 271 languages and provides both lexicographic and encyclopedic knowledge for all the open-class parts of speech, thanks to the seamless integration of WordNet, Wikipedia, Wiktionary, OmegaWiki, Wikidata and the *****Open Multilingual WordNet*****. | ||
| 2020.lrec-1.390 To this end, we introduce a new version of the *****Open Multilingual Wordnet***** (Bond and Foster, 2013), that integrates a new set of tools that tests the extensions introduced by this new format, while also ensuring the integrity of the Collaborative Interlingual Index (CILI: | ||
| 2019.gwc-1.50 This paper aims to study auto-hyponymy and auto-troponymy relations (or vertical polysemy) in 11 wordnets uploaded into the new *****Open Multilingual Wordnet***** (OMW) webpage. | ||
| visual reasoning | 50 | |
| D19-6403 In this paper, we experiment with a recently proposed ***** visual reasoning ***** task dealing with quantities – modeling the multimodal, contextually-dependent meaning of size adjectives (`big', `small') – and explore the impact of varying the training data on the learning behavior of a state-of-art system. | ||
| P18-1242 In this paper, we study the problem of geometric reasoning (a form of ***** visual reasoning *****) in the context of question-answering. | ||
| 2020.findings-emnlp.253 We present the first study focused on generating natural language rationales across several complex ***** visual reasoning ***** tasks: visual commonsense reasoning, visual-textual entailment, and visual question answering. | ||
| P17-2034 We present a new ***** visual reasoning ***** language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images with 3,962 unique sentences. | ||
| D18-1118 Experiments show that CMM significantly outperforms most related models, and reach state-of-the-arts on two ***** visual reasoning ***** benchmarks: CLEVR and NLVR, collected from both synthetic and natural languages. | ||
| europarl corpus | 50 | |
| L12-1480 Three different corpora are used: two subsets of the *****Europarl corpus***** and a third corpus built using newspaper articles. | ||
| 2006.amta-papers.15 We introduce a particular formalism, probabilistic synchronous tree-insertion grammar (PSTIG) that we argue satisfies the desiderata optimally within the class of formalisms that can be parsed no less efficiently than context-free grammars and demonstrate that it outperforms state-of-the-art word-based and phrase-based finite-state translation models on training and test data taken from the *****EuroParl corpus***** (Koehn, 2005). | ||
| L16-1483 The corpora comprise both the well-known *****Europarl corpus***** and a domain-specific question-answer troubleshooting corpus on the IT domain. | ||
| L06-1386 We propose a bootstrapping approach to creating a phrase-level alignment over a sentence-aligned parallel corpus, reporting concrete treebank annotation work performed on a sample of sentence tuples from the *****Europarl corpus*****, currently for English, French, German, and Spanish. | ||
| L08-1131 We conducted experiments using *****Europarl corpus***** to evaluate our approach. | ||
| interactively | 49 | |
| P17-1124 Our method ***** interactively ***** obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. | ||
| 2021.dash-1.5 Finally, we extract utterance embeddings from the clustering model and plot the data to ***** interactively ***** bulk label the samples, reducing the time and effort for data labeling of the whole dataset significantly. | ||
| D18-1095 We run a small user study that demonstrates that untrained users can ***** interactively ***** update topics in order to improve classification accuracy. | ||
| 2020.lrec-1.44 We present a concrete realization: an executable application for mobile devices with which users can explore their environment ***** interactively ***** in different languages. | ||
| P18-1062 Code and data are publicly available, and our system can be ***** interactively ***** tested | ||
| gradients | 49 | |
| P19-1556 We evaluate reversing ***** gradients ***** for adversarial adaptation on multiple domains, and demonstrate that it significantly outperforms other methods on question deduplication as well as on recognizing textual entailment (RTE) tasks, achieving up to 7% absolute boost in base model accuracy on some datasets. | ||
| 2021.mtsummit-research.10 Finally, a study of softmax entropies and ***** gradients ***** reveals the impact of our method on the internal behavior of our NMT models. | ||
| 2020.acl-main.323 We propose to automatically and dynamically determine batch sizes by accumulating ***** gradients ***** of mini-batches and performing an optimization step at just the time when the direction of ***** gradients ***** starts to fluctuate. | ||
| 2021.acl-long.152 Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using ***** gradients ***** and identified with a few trial experiments. | ||
| P19-1559 In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of ***** gradients ***** | ||
| parameters | 49 | |
| L12-1644 In addition to the classical collaborative filtering and content based approaches, taking into account ratings, preferences and demographic characteristics of the users, a new type of Recommender System, based on personality ***** parameters *****, has been emerging recently. | ||
| 2020.lrec-1.682 We also performed an experimental analysis of the impact of the method's ***** parameters ***** on the final result. | ||
| 2014.iwslt-papers.10 Furthermore, the models for different target words can share ***** parameters ***** and therefore data sparsity problems are effectively reduced. | ||
| D18-1543 Based on this result, we propose an architecture where the transition classifier is shared, and the sharing of word and character ***** parameters ***** is controlled by a parameter that can be tuned on validation data. | ||
| D17-1154 Our final model achieves 10 times speedup, 17 times ***** parameters ***** reduction, less than 35MB storage size and comparable performance compared to the baseline model | ||
| labelling | 49 | |
| P18-1030 Results on various classification and sequence ***** labelling ***** benchmarks show that the proposed model has strong representation power, giving highly competitive performances compared to stacked BiLSTM models with similar parameter numbers. | ||
| 2021.naacl-main.27 We present a fast and scalable architecture called Explicit Modular Decomposition (EMD), in which we incorporate both classification-based and extraction-based methods and design four modules (for clas- sification and sequence ***** labelling *****) to jointly extract dialogue states. | ||
| D19-3042 Specifically, we apply semantic role ***** labelling ***** to understand relationships between key roles in the news. | ||
| 2021.emnlp-main.686 Prior work on outcome detection has modelled this task as either (a) a sequence ***** labelling ***** task, where the goal is to detect which text spans describe health outcomes, or (b) a classification task, where the goal is to classify a text into a predefined set of categories depending on an outcome that is mentioned somewhere in that text. | ||
| W17-6318 To improve grammatical function labelling for German, we augment the *****labelling***** component of a neural dependency parser with a decision history. | ||
| correlates | 49 | |
| N19-1008 Integrating prosodic cues has proved difficult because of the many sources of variability affecting the acoustic ***** correlates *****. | ||
| W18-5405 Additionally, we show that the quantitative analysis technique ***** correlates ***** with the judgment of a human expert evaluator in terms of alignment. | ||
| 2020.findings-emnlp.9 Experiments on seven datasets over four language generation tasks show that the proposed metric ***** correlates ***** highly with human judgments. | ||
| L08-1167 Second, we find that on average, learning performance for a given functional semantic category ***** correlates ***** with the overall agreement among the seven annotators for that category. | ||
| 2021.acl-demo.33 The quantitative evaluation demonstrates that our backbone translation models achieve state-of-the-art translation performance and our quality estimation well ***** correlates ***** with both BLEU and human judgment | ||
| bidirectional LSTM | 49 | |
| N18-2019 Experimentally, we show that leveraging these two representations can significantly improve the f-score of a strong ***** bidirectional LSTM ***** baseline model by 10.1%. | ||
| N19-1345 The neural network has a standard ***** bidirectional LSTM ***** at its core. | ||
| K17-3016 The ***** bidirectional LSTM ***** approach by Kiperwasser and Goldberg (2016) is used to train a greedy parser with a dynamic oracle to mitigate error propagation. | ||
| 2020.nlptea-1.2 Accordingly, we made use of a ***** bidirectional LSTM ***** language model (LM) for our context-sensitive spelling detection and correction model which is shown to have much control over the correction process. | ||
| P17-2064 By combining CWINDOW word embedding features and POS information, the best ***** bidirectional LSTM ***** model achieves accuracy 0.5138 and MRR 0.6789 on the HSK dataset | ||
| correlate | 49 | |
| 2021.mrqa-1.15 We find that semantic similarity metrics based on recent transformer models ***** correlate ***** much better with human judgment than traditional lexical similarity metrics on our two newly created datasets and one dataset from related work. | ||
| 2021.cmcl-1.23 We carry out an in-depth analysis to detect which linguistic features ***** correlate ***** more with complexity judgments and with the degree of agreement among annotators. | ||
| N18-2002 With these “Winogender schemas,” we evaluate and confirm systematic gender bias in three publicly-available coreference resolution systems, and ***** correlate ***** this bias with real-world and textual gender statistics. | ||
| L08-1519 The perceptual test reveals that even though automatic and perceptual errors ***** correlate ***** positively, human listeners deal with local ambiguity more efficiently than the ASR system in conditions which attempt to approximate the information available for decision for a 4-gram language model. | ||
| 2020.nlpcovid19-acl.14 We see, for example, that lockdown announcements ***** correlate ***** with a deterioration of mood in almost all surveyed countries, which recovers within a short time span | ||
| runtime | 49 | |
| Q17-1019 We find that optimizing end-to-end performance in this way leads to a better Pareto frontier—i.e., parsers which are more accurate for a given ***** runtime *****. | ||
| 2020.acl-main.325 Our method does not require any modification to the training procedure and can be easily applied at ***** runtime ***** with custom dictionaries. | ||
| D18-1015 We further show the consistent effectiveness and efficiency of TGN through an ablation study and a ***** runtime ***** test. | ||
| L12-1036 It utilises a set of ontologies used as dialogue models that can be combined dynamically during ***** runtime *****. | ||
| 2020.lrec-1.267 Indexing minimizes the amount of expensive pattern matching that must take place at ***** runtime ***** | ||
| LDA | 49 | |
| C18-1212 Extensive experiments on social media data and news articles show the benefits of on-line ***** LDA ***** versus standard ***** LDA *****, and of on-line change point detection compared to off-line algorithms. | ||
| S19-1011 We find that qualitative judgments significantly favor our approach, the method outperforms ***** LDA ***** on topic coherence, and is comparable to ***** LDA ***** on document classification tasks. | ||
| N18-1034 Finally, we find that supervision leads to faster convergence as compared to an ***** LDA ***** baseline and that dDMR's model fit is less sensitive to training parameters than DMR. | ||
| 2020.nlposs-1.19 From ***** LDA ***** to neural models, different topic modeling approaches have been proposed in the literature. | ||
| 2020.lrec-1.348 Active Learning Strategies have shown promising results in the Persian language, and ***** LDA ***** sampling showed a competitive performance compared to other approaches | ||
| glosses | 49 | |
| W16-4020 We present work on morphosyntactic taggers trained on transcribed and linguistically analyzed recordings and dependency parsers using English ***** glosses ***** to project annotation for creating synthetic treebanks. | ||
| 2021.isa-1.6 This paper presents work carried out to transform ***** glosses ***** of a fable in Italian Sign Language (LIS) into a text which is then read by a TTS synthesizer from an SSML modified version of the same text. | ||
| 2021.emnlp-main.610 We then train a model to identify semantic equivalence between a target word in context and one of its ***** glosses ***** using these aligned inventories, which exhibits strong transfer capability to many WSD tasks. | ||
| L14-1671 The proposed approach simplifies the machine translation of the ***** glosses *****. | ||
| P18-1230 Therefore, we propose GAS: a gloss-augmented WSD neural network which jointly encodes the context and ***** glosses ***** of the target word | ||
| CL | 49 | |
| 2021.naacl-main.378 Although some ***** CL ***** techniques have been proposed for document sentiment classification, we are not aware of any ***** CL ***** work on ASC. | ||
| L14-1509 This paper presents an overview of the findings from an exploratory study carried out to investigate if the appropriateness level of text alternatives for images in French can be improved when applying controlled language (***** CL *****) rules. | ||
| 2021.wassa-1.13 Our axes of analysis include Task difficulty on ***** CL *****, comparing ***** CL ***** pacing techniques, and qualitative analysis by visualizing the movement of attention scores in the model as curriculum phases progress. | ||
| 2021.emnlp-main.550 This paper studies continual learning (***** CL *****) of a sequence of aspect sentiment classification (ASC) tasks in a particular ***** CL ***** setting called domain incremental learning (DIL). | ||
| 2021.acl-short.111 Specifically, we propose a model-agnostic framework called Schema-aware Curriculum Learning for Dialog State Tracking (Sa*****CL*****og), which consists of a preview module that pre-trains a DST model with schema information, a curriculum module that optimizes the model with ***** CL *****, and a review module that augments mispredicted data to reinforce the ***** CL ***** training | ||
| RC | 49 | |
| P19-1260 Multi-hop reading comprehension (***** RC *****) across documents poses new challenge over single-document ***** RC ***** because it requires reasoning over multiple documents to reach the final answer. | ||
| 2021.emnlp-main.447 We find that pairwise attributions are better suited to ***** RC ***** than token-level attributions across these different ***** RC ***** settings, with our best performance coming from a modification that we propose to an existing pairwise attribution method. | ||
| 2020.emnlp-main.86 However, most existing reading comprehension (***** RC *****) tasks only focus on questions for which the contexts provide all the information required to answer them, thus not evaluating a system's performance at identifying a potential lack of sufficient information and locating sources for that information. | ||
| P19-1225 Question answering (QA) using textual sources for purposes such as reading comprehension (***** RC *****) has attracted much attention. | ||
| P19-1220 First, unlike most studies on ***** RC ***** that have focused on extracting an answer span from the provided passages, our model instead focuses on generating a summary from the question and multiple passages | ||
| Grammar | 49 | |
| L06-1429 This paper describes an ongoing Portuguese Language grammar checker project, called CoGrOO1 - Corretor Gramatical para OpenOffice (*****Grammar***** Checker for OpenOffice), based on CETENFOLHA, a Brazilian Portuguese morphosyntactic annotated Corpus. | ||
| C16-1003 *****Grammar***** induction is the task of learning syntactic structure in a setting where that structure is hidden. | ||
| W18-5452 *****Grammar***** induction is the task of learning syntactic structure without the expert-labeled treebanks (Charniak and Carroll, 1992; Klein and Manning, 2002). | ||
| D19-1148 *****Grammar***** induction aims to discover syntactic structures from unannotated sentences. | ||
| L14-1227 *****Grammar***** models conceived for parsing purposes are often poorer than models that are motivated linguistically. | ||
| factoid | 49 | |
| K17-1029 For example, the BioASQ dataset for biomedical QA comprises less than 900 ***** factoid ***** (single answer) and list (multiple answers) QA instances. | ||
| 2021.iwcs-1.13 Research in NLP has mainly focused on ***** factoid ***** questions, with the goal of finding quick and reliable ways of matching a query to an answer. | ||
| S17-1020 However, datasets for semantic parsing contain many ***** factoid ***** questions that can be answered from a single web document. | ||
| D18-1259 We introduce HotpotQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and explain the predictions; (4) we offer a new type of ***** factoid ***** comparison questions to test QA systems' ability to extract relevant facts and perform necessary comparison | ||
| E17-1036 In recent years, knowledge graphs such as Freebase that capture facts about entities and relationships between them have been used actively for answering *****factoid***** questions. | ||
| TED | 49 | |
| 2020.iwslt-1.9 We took part in the offline End-to-End English to German ***** TED ***** lectures translation task. | ||
| 2021.acl-long.571 Empirical results show that VOLT beats widely-used vocabularies in diverse scenarios, including WMT-14 English-German translation, ***** TED ***** bilingual translation, and ***** TED ***** multilingual translation. | ||
| 2021.iwslt-1.11 The task consists of building a system capable of translating English audio recordings extracted from ***** TED ***** talks into German text. | ||
| 2011.iwslt-papers.5 We experiment with two large-scale translation tasks, the Arabic-to-English and English-to-French IWSLT 2011 ***** TED ***** Talks MT tasks. | ||
| N19-1388 We report results on the publicly available ***** TED ***** talks multilingual corpus where we show that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages in 116 translation directions in a single model | ||
| ACL | 49 | |
| L14-1155 It follows similar exercises that have been conducted, such as the survey on the IEEE ICASSP conference series from 1976 to 1990, which served in the launching of the ESCA Eurospeech conference, a survey of the Association of Computational Linguistics (***** ACL *****) over 50 years of existence, which was presented at the ***** ACL ***** conference in 2012, or a survey over the 25 years (1987-2012) of the conferences contained in the ISCA Archive, presented at Interspeech 2013. | ||
| J78-3001 ***** ACL *****: Minutes Of the 16th Annual Business Meeting; ***** ACL ***** Secretary-Treasurer's Report; ***** ACL ***** Officers For 1979; ***** ACL ***** Officers 1963-1979; NSF: Support for Computational Linguistics (Paul G. Chapin); News: Short Notes; News: ARIST Reprint Request (Martha E. Williams); News: Summer Linguistics at Texas; PhD Programs in Computational Linguistics; Journal: Computational Linguistics and Computer Languages (T. Frey; T. Vamos); Journal: Discourse Processes (Roy D. Freedle); Book Notices (Mel'cuk R. Ravic); Yale AI Project Research Reports Available; Summary of Research on Computational Aspects of Evolution Theories (Raymond D. Gumb); | ||
| 2020.figlang-1.37 This research was performed in conjunction with the sarcasm detection shared task section in the Second Workshop on Figurative Language Processing, co-located with ***** ACL ***** 2020 | ||
| L08-1251 We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the *****ACL***** Anthology. | ||
| L14-1697 In this paper we present a comparative analysis of two series of conferences in the field of Computational Linguistics, the LREC conference and the *****ACL***** conference. | ||
| pivot | 49 | |
| P19-1591 Our algorithms iteratively train the PBLM model, gradually increasing the information exposed about each ***** pivot *****. | ||
| 2012.iwslt-papers.12 The main idea is to use an automatic translation as ***** pivot ***** to infer alignments between the source sentence and the reference translation, or user correction. | ||
| W19-8604 Based on these ***** pivot ***** words, we propose a lexical analysis framework, the Pivot Analysis, to quantitatively analyze the effects of these words in text attribute classification and transfer | ||
| 1999.mtsummit-1.19 This goal could be attained by: (1) giving users, free of charge, TA client tools and server resources in exchange for the permission to store and refine on the server linguistic resources produced while using TA; (2) establishing a synergy between MT and TA, in particular by using them jointly in translation projects where translators codevelop the lexical resources specific to MT; (3) renouncing the illusion of fully automatic general purpose high quality MT (FAHQMT) and go for semi-automaticity (SAHQMT), where user participation, made possible by recent technical network-oriented advances, is used to solve ambiguities otherwise computationally unsolvable due to the impossibility, intractability or cost of accessing the necessary knowledge; (4) adopting a hybrid (symbolic & numerical) and “pivot” approach for MT, where *****pivot***** lexemes are UNL or UNL inspired English-oriented denotations of (sets of) interlingual acceptions or word/term senses, and the rest of the representation of utterances is either fully abstract and interlingual as in UNL, or, less ambitiously but more realistically, obtained by adding to an abstract English multilevel structure features underspecified in English but essential for other languages, including minority languages. | ||
| L16-1439 We show that linear translation really provides a more reliable method for triangle scoring than *****pivot***** count. | ||
| Stance | 49 | |
| W19-6122 ***** Stance ***** labels are then used to predict veracity across platforms and also across languages, training on conversations held in one language and using the model on conversations held in another. | ||
| 2020.aacl-main.92 ***** Stance ***** classification can be a powerful tool for understanding whether and which users believe in online rumours. | ||
| D19-1665 ***** Stance ***** detection in social media is a well-studied task in a variety of domains. | ||
| S17-2084 ***** Stance ***** classification is interesting since it can provide a basis for rumour veracity assessment | ||
| N19-1185 *****Stance***** detection in twitter aims at mining user stances expressed in a tweet towards a single or multiple target entities. | ||
| Named Entity | 49 | |
| L08-1252 The approach consists in applying grammar induced extraction patterns on a large corpus - Wikipedia - for the extraction of relations between a given ***** Named Entity ***** and other Named Entities. | ||
| P19-1484 To generate such triples, we first sample random context paragraphs from a large corpus of documents and then random noun phrases or ***** Named Entity ***** mentions from these paragraphs as answers. | ||
| 2020.lrec-1.834 This tool can find superconductivity terms relevant to a query term within a specified ***** Named Entity ***** category, which demonstrates the power of our SC-CoMIcs, efficiently providing knowledge for Materials Informatics applications from rapidly expanding publications. | ||
| 2020.lrec-1.546 We present RONEC - the *****Named Entity***** Corpus for the Romanian language. | ||
| W16-3920 In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the *****Named Entity***** Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated text (WNUT). | ||
| lexical entailment | 49 | |
| 2020.semeval-1.31 As important branches of ***** lexical entailment *****, predicting multilingual and cross-lingual ***** lexical entailment ***** (LE) are two subtasks of SemEval2020 Task2. | ||
| 2020.semeval-1.15 In this paper we present a novel rule-based, language independent method for determining ***** lexical entailment ***** relations using semantic representations built from Wiktionary definitions. | ||
| C18-1023 We present a standard neural network model and a novel set-theoretic model to learn these entailment vectors from word pairs with known ***** lexical entailment ***** relations derived from WordNet. | ||
| 2020.blackboxnlp-1.16 We address whether neural models for Natural Language Inference (NLI) can learn the compositional interactions between ***** lexical entailment ***** and negation, using four methods: the behavioral evaluation methods of (1) challenge test sets and (2) systematic generalization tasks, and the structural evaluation methods of (3) probes and (4) interventions. | ||
| L10-1291 Evaluation results show that Wikipedia can be effectively used as a source of ***** lexical entailment ***** rules, featuring both higher coverage and context sensitivity with respect to other resources. | ||
| concept | 49 | |
| L10-1360 This study showed that it was possible to connect brain activities to the semantic relation among ***** concept *****s, and that it would improve the method for ***** concept ***** distance calculation in order to build a more human-like ontology model. | ||
| 2021.rocling-1.29 In this paper, we propose to predict topical stances from social media by ***** concept ***** expansion, sentiment classification, and stance aggregation based on word embeddings. | ||
| L10-1586 The lexical unit candidates for ***** concept ***** mapping have been selected from two large and well-developed lexical resources for Bulgarian - a machine readable explanatory dictionary and a morphological lexicon. | ||
| D19-5708 We present a neural pipeline approach that performs named entity recognition (NER) and ***** concept ***** indexing (CI), which links them to ***** concept ***** unique identifiers (CUIs) in a knowledge base, for the PharmaCoNER shared task on pharmaceutical drugs and chemical entities. | ||
| 2020.acl-main.760 We propose an approach to ***** concept ***** linking that leverages recent work in contextualized neural models, such as ELMo (Peters et al. 2018), which create a token representation that integrates the surrounding context of the mention and ***** concept ***** name | ||
| contextual embedding | 49 | |
| 2021.eacl-main.215 To understand if and how morphosyntactic alignment affects ***** contextual embedding ***** spaces, we train classifiers to recover the subjecthood of mBERT embeddings in transitive sentences (which do not contain overt information about morphosyntactic alignment) and then evaluate them zero-shot on intransitive sentences (where subjecthood classification depends on alignment), within and across languages. | ||
| 2020.wmt-1.99 Although the recently proposed ***** contextual embedding ***** based metrics, YiSi-1, significantly outperform BLEU and other metrics in correlating with human judgment on translation quality, we have yet to understand the full strength of using pretrained language models for machine translation evaluation. | ||
| 2021.acl-short.115 BERTScore is a scoring function based on ***** contextual embedding *****s that overcomes the typical limitations of n-gram-based metrics (e.g. synonyms, paraphrases), allowing translations that are different from the references, yet close in the ***** contextual embedding ***** space, to be treated as substantially correct. | ||
| 2021.eacl-main.297 We propose a novel method to estimate polysemy based on simple geometry in the ***** contextual embedding ***** space | ||
| 2020.semeval-1.219 We approach this emphasis selection problem as a sequence labeling task where we represent the underlying text with various *****contextual embedding***** models. | ||
| label | 49 | |
| P19-1521 Furthermore, we also propose two estimators which can effectively measure such ***** label ***** confusion based on instance-level or population-level statistics. | ||
| C16-1091 Using Wikipedia document titles as ***** label ***** candidates, we compute neural embeddings for documents and words to select the most relevant ***** label *****s for topics. | ||
| 2020.emnlp-main.280 Also, we find that our model can be trained to generate an adequate knowledge path even when the paths are not available and only the destination nodes are given as ***** label *****, making it more applicable to real-world dialogue systems. | ||
| 2021.emnlp-main.643 Resampling and re-weighting are common approaches used for addressing the class imbalance problem, however, they are not effective when there is ***** label ***** dependency besides class imbalance because they result in oversampling of common ***** label *****s. | ||
| 2021.acl-short.51 This puts our approach atop the competitive FEVER leaderboard at the time of our work, scoring higher than the second place submission by almost two points in ***** label ***** accuracy and over one point in FEVER score | ||
| database | 49 | |
| L10-1502 Our promising results suggest that our corpora can be effectively used to carry out research in the field of natural language interface to ***** database *****. | ||
| 2020.coling-main.34 We highlight the three most important directions, namely linking question tokens to ***** database ***** schema elements (schema linking), better architectures for encoding the textual query taking into account the schema (schema encoding), and improved generation of structured queries using autoregressive neural models (grammar-based decoders). | ||
| L08-1511 This paper presents the results of the NEOLOGOS project: a children ***** database ***** and an optimized adult ***** database ***** for the French language. | ||
| 2020.coling-main.465 In this paper, we evaluate the progress of our field toward solving simple factoid questions over a knowledge base, a practically important problem in natural language interface to ***** database *****. | ||
| L12-1124 We introduce two ***** database ***** resources | ||
| electronic health records | 49 | |
| 2020.lrec-1.547 Multiple efforts have been done to protect the integrity of patients while making ***** electronic health records ***** usable for research by removing personally identifiable information in patient records. | ||
| W19-1915 Recently natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in ***** electronic health records ***** (EHRs). | ||
| W19-5003 This paper proposes a dataset and method for automatically generating paraphrases for clinical questions relating to patient-specific information in ***** electronic health records ***** (EHRs). | ||
| 2021.naacl-main.318 Given the clinical notes written in ***** electronic health records ***** (EHRs), it is challenging to predict the diagnostic codes which is formulated as a multi-label classification task. | ||
| L16-1598 This paper discusses the creation of a semantically annotated corpus of questions about patient data in ***** electronic health records ***** (EHRs). | ||
| semantic web | 49 | |
| L14-1185 We present an approach that combines the state-of-the art from named entity recognition in the natural language processing domain and named entity linking from the ***** semantic web ***** community. | ||
| L10-1073 We propose a solution which anchors on using controlled languages as interfaces to ***** semantic web ***** applications. | ||
| 2020.ai4hi-1.1 An ongoing challenge is how to increase discovery and access through structured data and the ***** semantic web *****. | ||
| 2021.emnlp-main.708 Our pipeline consists of two parts: a neural semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language (a standard language for accessing knowledge graphs and ***** semantic web *****). | ||
| 2020.lrec-1.596 Building ontologies is a crucial part of the ***** semantic web ***** endeavour. | ||
| native language | 49 | |
| P17-2086 In this paper, we explore spelling errors as a source of information for detecting the ***** native language ***** of a writer, a previously under-explored area. | ||
| W17-5045 The system was submitted to the NLI Shared Task 2017 fusion track which featured students essays and spoken responses in form of audio transcriptions and iVectors by non-native English speakers of eleven ***** native language *****s. | ||
| 2003.mtsummit-systems.10 In response to growing needs for cross-lingual patent retrieval, we propose PRIME (Patent Retrieval In Multilingual Environment system), in which users can retrieve and browse patents in foreign languages only by their ***** native language *****. | ||
| N18-4009 The outcome of the computational task is connected to a position in second language acquisition research that holds all learners acquire English grammatical morphemes in the same order, regardless of ***** native language ***** background. | ||
| P17-1050 We provide analysis of classifier uncertainty and learned features, which indicates that differences in English reading are likely to be rooted in linguistic divergences across ***** native language *****s. | ||
| Natural Language Processing | 49 | |
| 2021.teachingnlp-1.23 We present a scaffolded discovery learning approach to introducing concepts in a *****Natural Language Processing***** course aimed at computer science students at liberal arts institutions. | ||
| 2021.naacl-main.31 While cross-lingual techniques are finding increasing success in a wide range of *****Natural Language Processing***** tasks, their application to Semantic Role Labeling (SRL) has been strongly limited by the fact that each language adopts its own linguistic formalism, from PropBank for English to AnCora for Spanish and PDT-Vallex for Czech, inter alia. | ||
| 2020.lrec-1.37 Named Entity Recognition (NER) is an essential component of many *****Natural Language Processing***** pipelines. | ||
| C18-1215 Question-Answer (QA) matching is a fundamental task in the *****Natural Language Processing***** community. | ||
| N19-1098 Pre-trained word vectors are ubiquitous in *****Natural Language Processing***** applications. | ||
| Information | 49 | |
| W19-4613 Segmentation serves as an integral part in many NLP applications including Machine Translation, Parsing, and *****Information***** Retrieval. | ||
| 2021.acl-short.77 *****Information***** Retrieval using dense low-dimensional representations recently became popular and showed out-performance to traditional sparse-representations like BM25. | ||
| 2020.lrec-1.27 *****Information***** extraction from unstructured texts plays a vital role in the field of natural language processing. | ||
| 2021.emnlp-main.79 With the advent of contextualized embeddings, attention towards neural ranking approaches for *****Information***** Retrieval increased considerably. | ||
| 2021.emnlp-main.439 The task of Event Detection (ED) in *****Information***** Extraction aims to recognize and classify trigger words of events in text. | ||
| reproducible | 48 | |
| D18-1157 We have made RESIDE's source code available to encourage ***** reproducible ***** research. | ||
| 2020.eval4nlp-1.2 Our goal is to measure the functional performance of a summary with an objective, ***** reproducible *****, and fully automated method. | ||
| 2021.latechclfl-1.18 Finally, because of the full WMD model's high time-complexity, we additionally suggest a method of sampling document pairs from large datasets in a ***** reproducible ***** way, with tight bounds that prevent extrapolation of unreliable results due to poor sampling practices. | ||
| C18-1217 We have made our code and corpus publicly available to make our results ***** reproducible *****. | ||
| 2020.lrec-1.428 Our code and corpus are publicly available to make our results ***** reproducible ***** | ||
| ambiguities | 48 | |
| L10-1527 Furthermore, we provide a novel strategy to handle origin ***** ambiguities ***** or multiple origins in a name. | ||
| 1999.mtsummit-1.19 This goal could be attained by: (1) giving users, free of charge, TA client tools and server resources in exchange for the permission to store and refine on the server linguistic resources produced while using TA; (2) establishing a synergy between MT and TA, in particular by using them jointly in translation projects where translators codevelop the lexical resources specific to MT; (3) renouncing the illusion of fully automatic general purpose high quality MT (FAHQMT) and go for semi-automaticity (SAHQMT), where user participation, made possible by recent technical network-oriented advances, is used to solve ***** ambiguities ***** otherwise computationally unsolvable due to the impossibility, intractability or cost of accessing the necessary knowledge; (4) adopting a hybrid (symbolic & numerical) and “pivot” approach for MT, where pivot lexemes are UNL or UNL inspired English-oriented denotations of (sets of) interlingual acceptions or word/term senses, and the rest of the representation of utterances is either fully abstract and interlingual as in UNL, or, less ambitiously but more realistically, obtained by adding to an abstract English multilevel structure features underspecified in English but essential for other languages, including minority languages. | ||
| 1999.mtsummit-1.64 Most of them enumerated a following list of the problems that had not seemed to be easy to solve in the near future: 1) processing of non-continuous idiomatic expressions 2) reduction of too many POS or structural ***** ambiguities ***** 3) robust processing for long sentence and parsing failure 4) selecting correct word correspondence between several alternatives. | ||
| 2020.wanlp-1.8 We train a model that has the capability to memorize words in the output language, and that also utilizes context for distinguishing ***** ambiguities ***** in the transliteration. | ||
| D19-1096 Based on the multiple graph-based interactions among characters, potential words, and the whole-sentence semantics, word ***** ambiguities ***** can be effectively tackled | ||
| subcategorization | 48 | |
| L06-1052 These results show that, contra (Korhonen et al. 2000), binomial hypothesis testing can be robust for determining ***** subcategorization ***** frames given corpus data. | ||
| L12-1349 Moreover, data-driven automatic acquisition naturally associates probabilistic information with ***** subcategorization ***** frames and LDD paths. | ||
| 2003.mtsummit-papers.3 Information on ***** subcategorization ***** and selectional restrictions is important for natural language processing tasks such as deep parsing, rule-based machine translation and automatic summarization. | ||
| L08-1210 The lexicon includes ***** subcategorization ***** frame and frequency information for 3297 French verbs. | ||
| 2002.amta-papers.9 This non-interlingual non-transfer approach is accomplished by using target-language lexical semantics, categorial variations and ***** subcategorization ***** frames to overgenerate multiple lexico-structural variations from a target-glossed syntactic dependency of the source-language sentence | ||
| phylogenetic | 48 | |
| D17-1268 Experiments show that the proposed method is able to infer not only syntactic, but also phonological and phonetic inventory features, and improves over a baseline that has access to information about the language's geographic and ***** phylogenetic ***** neighbors. | ||
| N19-1017 Their evolutionary history is often depicted in the shape of a ***** phylogenetic ***** tree. | ||
| R19-1040 We obtain ***** phylogenetic ***** trees which sometimes outperform the ones obtained by Atkinson and Gray. | ||
| D18-1468 Here we propose latent representation-based analysis in which (1) a sequence of discrete surface features is projected to a sequence of independent binary variables and (2) ***** phylogenetic ***** inference is performed on the latent space | ||
| N18-2063 We evaluate the performance of state-of-the-art algorithms for automatic cognate detection by comparing how useful automatically inferred cognates are for the task of *****phylogenetic***** inference compared to classical manually annotated cognate sets. | ||
| ungrammatical | 48 | |
| D18-1151 We expect a language model to assign a higher probability to the grammatical sentence than the ***** ungrammatical ***** one. | ||
| W16-4914 This is a necessary step to provide accurate coaching on how to correct ***** ungrammatical ***** input, and it will allow us to overcome a current bottleneck in the field — an exponential burst of ambiguity caused by ambiguous lexical items (Flickinger, 2010). | ||
| 2021.mtsummit-research.20 The nature of reviews provided by customers in any multilingual country poses unique challenges for machine translation such as code-mixing and ***** ungrammatical ***** sentences and presence of colloquial terms and lack of e-commerce parallel corpus etc. | ||
| 2021.emnlp-main.611 We apply this LM-Critic and BIFI along with a large set of unlabeled sentences to bootstrap realistic ***** ungrammatical ***** / grammatical pairs for training a corrector. | ||
| W19-4821 The results show that both in the difficult and highly symmetrical task of detecting subject islands and in the more open CoLA dataset, grammatical sentences give rise to better scores than ***** ungrammatical ***** ones, possibly because they can be better integrated within the body of linguistic structural knowledge that the language model has accumulated | ||
| probability | 48 | |
| 2020.semeval-1.216 First, we propose fine-tuning many pre-trained language models, predicting an emphasis ***** probability ***** distribution over tokens. | ||
| W17-5110 Using simple and intuitive empirical observations, we derive a claim sentence query by which we are able to directly retrieve sentences in which the prior ***** probability ***** to include topic-relevant claims is greatly enhanced. | ||
| 2020.acl-main.501 We compare previously used ***** probability ***** space and distant supervision assumptions (assumptions on the correspondence between the weak answer string labels and possible answer mention spans). | ||
| 2021.rocling-1.40 By grouping collocates of 臺灣 `Taiwan' into clusters of topics via either word embeddings clustering or Latent Dirichlet allocation, lists of collocates can be converted to ***** probability ***** distributions such that distances and similarities can be defined and computed. | ||
| 1997.iwpt-1.6 We also propose three orthogonal approaches for backing off ***** probability ***** estimates to cope with the large number of parameters involved | ||
| validation | 48 | |
| W19-3802 By filtering out irrelevant sentences, the remaining pool of candidate sentences are sent for human ***** validation *****. | ||
| 2020.semeval-1.50 The ability of common sense ***** validation ***** and explanation is very important for most models. | ||
| I17-3015 The proposed framework provides a pioneering example of on-demand knowledge ***** validation ***** in dialog environment to address such needs in AI agents/chatbots. | ||
| P18-1127 We propose MAEGE, an automatic methodology for GEC metric ***** validation *****, that overcomes many of the difficulties in the existing methodology. | ||
| L06-1039 In this paper we describe the properties, ***** validation *****, and availability of these resources | ||
| aspect based sentiment | 48 | |
| 2020.acl-main.192 We perform extensive experiments to test this insight on 10 disparate tasks spanning dependency parsing (syntax), semantic role labeling (semantics), relation extraction (information content), ***** aspect based sentiment ***** analysis (sentiment), and many others, achieving performance comparable to state-of-the-art specialized models. | ||
| 2020.coling-main.72 Most of the ***** aspect based sentiment ***** analysis research aims at identifying the sentiment polarities toward some explicit aspect terms while ignores implicit aspects in text. | ||
| 2020.emnlp-main.719 Targeted opinion word extraction (TOWE) is a sub-task of ***** aspect based sentiment ***** analysis (ABSA) which aims to find the opinion words for a given aspect-term in a sentence. | ||
| D19-1663 Identification of sarcasm targets can help in many core natural language processing tasks such as ***** aspect based sentiment ***** analysis, opinion mining etc. | ||
| 2020.lrec-1.617 In this paper, we create a reliable resource for ***** aspect based sentiment ***** analysis in Telugu. | ||
| conditional random field | 48 | |
| 2020.lrec-1.361 We implement several baseline approaches of ***** conditional random field ***** (CRF) and recent popular state-of-the-art bi-directional long-short term memory (Bi-LSTM) models. | ||
| L08-1291 At the core of ParsCit is a trained ***** conditional random field ***** (CRF) model used to label the token sequences in the reference string. | ||
| L06-1069 This paper presents a framework for Thai morphological analysis based on the theoretical background of ***** conditional random field *****s. | ||
| L16-1178 We provide a strong baseline with a linear-chain ***** conditional random field ***** and word-embedding features with a performance of 0.62 for aspect detection and 0.63 for the extraction of subjective phrases. | ||
| 2020.figlang-1.27 In this paper we present a novel resource-inexpensive architecture for metaphor detection based on a residual bidirectional long short-term memory and ***** conditional random field *****s. | ||
| networks | 48 | |
| S19-1018 In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as ***** networks ***** rather than functions, and information state update rules for conditionals. | ||
| W19-5409 More specifically, one of the proposed approaches employs the translation knowledge between the two languages from two different translation directions; while the other one employs extra monolingual knowledge from both source and target sides, obtained by pre-training deep self-attention ***** networks *****. | ||
| W18-6230 This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural ***** networks *****. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph neural ***** networks ***** to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on deep neural ***** networks *****, which makes decisions about form and content in one go without explicit feature extraction. | ||
| lexical semantic change | 48 | |
| 2020.semeval-1.29 This paper presents an approach to ***** lexical semantic change ***** detection based on Bayesian word sense induction suitable for novel word sense identification. | ||
| 2020.semeval-1.9 In this paper, we present our contribution in SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection, where we systematically combine existing models for unsupervised capturing of ***** lexical semantic change ***** across time in text corpora of German, English, Latin and Swedish. | ||
| 2021.eacl-main.10 Our results provide a guide for the application and optimization of ***** lexical semantic change ***** detection models across various learning scenarios. | ||
| 2021.starsem-1.3 We use distributional methods to quantify ***** lexical semantic change ***** and induce a social network on communities, based on interactions between members. | ||
| N18-2027 We propose a framework that extends synchronic polysemy annotation to diachronic changes in lexical meaning, to counteract the lack of resources for evaluating computational models of ***** lexical semantic change *****. | ||
| topic modeling | 48 | |
| E17-1033 Our results show that the ***** topic modeling ***** experts reach substantial improvements when compared to the general versions. | ||
| D19-1513 This model captures structural features by a sequential variational autoencoder component and leverages a ***** topic modeling ***** component based on Gaussian distribution to enhance the recognition of text semantics. | ||
| Q17-1037 We introduce Correlation Explanation (CorEx), an alternative approach to ***** topic modeling ***** that does not assume an underlying generative model, and instead learns maximally informative topics through an information-theoretic framework. | ||
| 2021.naacl-main.332 Within neural ***** topic modeling *****, we quantify the quality of topics and document representations via generalization (perplexity), interpretability (topic coherence) and information retrieval (IR) using short-text, long-text, small and large document collections from news and medical domains. | ||
| D17-1249 In this paper, we present a method combining standard ***** topic modeling ***** with signature mining for analyzing topic recurrence in speeches of Clinton and Trump during the 2016 American presidential campaign. | ||
| biomedical text | 48 | |
| L10-1229 Although several studies have focused on processing negation in ***** biomedical text *****s, we are not aware of publicly available resources that describe the scope of negation cues in detail. | ||
| 2019.icon-1.26 The proposed approach is applied over a ***** biomedical text ***** corpus to learn word representation and compared with GloVe, which is one of the most popular word embedding approaches. | ||
| E17-1109 Entity extraction is one of the fundamental components for ***** biomedical text ***** mining. | ||
| 2020.louhi-1.2 Recognising and linking entities is a crucial first step to many tasks in *****biomedical text***** analysis, such as relation extraction and target identification. | ||
| W18-2311 Event and relation extraction are central tasks in *****biomedical text***** mining. | ||
| multiple | 48 | |
| 2020.emnlp-main.162 Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alternatives on ***** multiple ***** pure-language tasks such as GLUE, SQuAD, and SWAG. | ||
| 2021.acl-long.96 We also carry out ***** multiple ***** experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. | ||
| 2020.wanlp-1.32 In this paper, several techniques with ***** multiple ***** algorithms are applied for Arabic dialects identification starting from removing noise till classification task using all Arabic countries as 21 classes. | ||
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for ***** multiple ***** languages, improving upon individually trained models for each language. | ||
| 2021.inlg-1.11 Because existing datasets do not have such alignments of data in ***** multiple ***** modalities, this setting has not been explored in depth. | ||
| memory | 48 | |
| 2008.amta-srw.4 The system attempts this by assembling a combination of terms from its terminology database, translations from its ***** memory *****, and even portions of them. | ||
| 2003.mtsummit-papers.40 The goal of the AMETRA project is to make a computer-assisted translation tool from the Spanish language to the Basque language under the ***** memory *****-based translation framework. | ||
| 2021.ltedi-1.22 This paper proposes a bidirectional long short-term ***** memory ***** (BiLSTM) with the attention-based approach, in solving the hope speech detection problem. | ||
| 2020.acl-main.53 This mechanism consists of two steps: (1) predicting state operation on each of the ***** memory ***** slots, and (2) overwriting the ***** memory ***** with new values, of which only a few are generated according to the predicted state operations. | ||
| 2021.naacl-main.288 Each element of the fact ***** memory ***** is formed from a triple of vectors, where each vector corresponds to a KB entity or relation. | ||
| video | 48 | |
| L06-1485 Its features include synchronized multi-channel audio and ***** video ***** playback, compatibility with several corpora, platform independence, and mixed display of capabilities and a well-defined method for layering datasets. | ||
| W17-1606 Speakers' dialect and gender was controlled for by using ***** video *****s uploaded as part of the “accent tag challenge”, where speakers explicitly identify their language background. | ||
| 2020.findings-emnlp.98 In the proposed study, we make the first attempt to train the ***** video ***** captioning model on labeled data and unlabeled data jointly, in a semi-supervised learning manner. | ||
| W18-3303 Most of the current multimodal research in this area deals with various techniques to fuse the modalities, and mostly treat the segments of a ***** video ***** independently. | ||
| 2021.naacl-main.193 Comprehensive experiments on three ***** video *****-and-language tasks (text-to-***** video ***** retrieval, ***** video ***** captioning, and ***** video ***** question answering) across five datasets demonstrate that our approach outperforms previous state-of-the-art methods. | ||
| pronoun translation | 48 | |
| D19-1294 We further propose an evaluation measure to differentiate good and bad ***** pronoun translation *****s. | ||
| D19-6501 We define an error typology that aims to go further than ***** pronoun translation ***** adequacy and includes types such as incorrect word selection or missing words. | ||
| 2021.wat-1.11 We show that the proposed method significantly improves the accuracy of zero ***** pronoun translation ***** with machine translation experiments in the conversational domain. | ||
| 2020.coling-main.417 ContraPro is a notable example of a contrastive challenge set for English→German ***** pronoun translation *****. | ||
| W17-1505 The experimental results for Spanish to English MT on the AnCora-ES corpus show that the second approach yields a substantial increase in the accuracy of ***** pronoun translation *****, with BLEU scores remaining constant. | ||
| squad dataset | 48 | |
| 2020.findings-emnlp.145 On the *****SQuAD dataset*****, our proposed method achieves 70.14% F1 score with supervision from 26 explanations, comparable to plain supervised learning using 1,100 labeled instances, yielding a 12x speed up. | ||
| N19-1362 Our representations also aid in better generalization with gains of around 6-7% on adversarial *****SQuAD datasets*****, and 8.8% on the adversarial entailment test set by Glockner et al. | ||
| D17-1085 We evaluate our approach using a state-of-the-art neural attention model on the *****SQuAD dataset*****. | ||
| 2020.emnlp-main.84 Among them, we found that using the prior distribution of answer positions as a bias model is very effective at reducing position bias, recovering the performance of BERT from 37.48% to 81.64% when trained on a biased *****SQuAD dataset*****. | ||
| P19-1604 We report improvements obtained over the state-of-the-art on the *****SQuAD dataset***** according to automated metrics (BLEU, ROUGE), as well as qualitative human assessments of the system outputs. | ||
| pre-trained language | 48 | |
| 2020.nuse-1.14 Here we experiment with the use of information retrieval as an augmentation for *****pre-trained language***** models. | ||
| 2020.emnlp-main.395 While a lot of analysis has been carried out to demonstrate linguistic knowledge captured by the representations learned within deep NLP models, very little attention has been paid towards individual neurons. We carry out a neuron-level analysis using core linguistic tasks of predicting morphology, syntax and semantics, on *****pre-trained language***** models, with questions like: i) do individual neurons in pre-trained models capture linguistic information? | ||
| 2021.emnlp-main.646 The primary paradigm for multi-task training in natural language processing is to represent the input with a shared *****pre-trained language***** model, and add a small, thin network (head) per task. | ||
| 2020.findings-emnlp.401 Many efforts have been devoted to extracting constituency trees from *****pre-trained language***** models, often proceeding in two stages: feature definition and parsing. | ||
| 2021.emnlp-main.836 While *****pre-trained language***** models have obtained state-of-the-art performance for several natural language understanding tasks, they are quite opaque in terms of their decision-making process. | ||
| taxonomies | 47 | |
| 2021.acl-long.545 Developers often include such knowledge, structured as ***** taxonomies *****, in the documentation of chatbots. | ||
| N18-1030 We find that while transitive algorithms out-perform their non-transitive counterparts, the top-performing transitive algorithm is prohibitively slow for ***** taxonomies ***** with as few as 50 entities. | ||
| 2020.acl-main.199 Specifically, our proposed Graph2Taxo uses a noisy graph constructed from automatically extracted noisy hyponym hypernym candidate pairs, and a set of ***** taxonomies ***** for some known domains for training. | ||
| P18-1229 All components are trained in an end-to-end manner with cumulative rewards, measured by a holistic tree metric over the training ***** taxonomies *****. | ||
| D17-1123 While a large number of ***** taxonomies ***** have been constructed from human-compiled resources (e.g., Wikipedia), learning ***** taxonomies ***** from text corpora has received a growing interest and is essential for long-tailed and domain-specific knowledge acquisition. | ||
| workflows | 47 | |
| L16-1388 This article wants to help fill this gap by proposing an initial version of a generic Language Resource Life Cycle that can be used to inform, direct, control and evaluate LR research and development activities (including description, management, production, validation and evaluation ***** workflows *****). | ||
| 2020.eamt-1.44 A new machine translation paradigm, neural machine translation (NMT), is displacing its corpus-based predecessor, statistical machine translation (SMT), in the translation ***** workflows ***** currently implemented because it usually increases the fluency and accuracy of the MT output. | ||
| 2020.sigdial-1.32 This demo paper presents Emora STDM (State Transition Dialogue Manager), a dialogue system development framework that provides novel ***** workflows ***** for rapid prototyping of chat-based dialogue managers as well as collaborative development of complex interactions. | ||
| 2001.mtsummit-papers.8 Today, translation is not only important for reaching global audiences, it is becoming an indispensable component inside other systems and ***** workflows *****. | ||
| L14-1708 Users can combine those language services in service ***** workflows ***** to meet their requirements. | ||
| paragraphs | 47 | |
| D19-1300 We adopt a convolutional neural network to encode gist of ***** paragraphs ***** for rough reading, and a decision making policy with an adapted termination mechanism for careful reading. | ||
| 2020.coling-main.506 The former processes discourse units from the end to the beginning in a document to utilize the left-branching bias of discourse structure in Chinese, while the latter reverses the position of ***** paragraphs ***** in a discourse unit to enhance the differentiation of coherence between adjacent discourse units. | ||
| 2021.naacl-industry.21 As a result, we successfully stored emotion probabilities for 95 million ***** paragraphs ***** within 96 hours. | ||
| 2020.findings-emnlp.416 To the best of our knowledge, we are the first to tackle the challenge of multi-hop reasoning over ***** paragraphs ***** without any sentence-level information. | ||
| D19-3030 Generating syntactically and semantically valid and relevant questions from ***** paragraphs ***** is useful with many applications. | ||
| testbed | 47 | |
| P17-1147 Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging ***** testbed ***** that is worth significant future study. | ||
| 2020.lrec-1.257 Furthermore, we conduct experiments on entity salience detection; the results demonstrate that WN-Salience is a challenging ***** testbed ***** that is complementary to existing ones. | ||
| L14-1237 To test the hypothesis that this technology is close to applicability, and to provide a ***** testbed ***** for reducing any accuracy gaps, we have developed an evaluation paradigm for historical record handwriting recognition. | ||
| D18-1418 We also release permuted-bAbI dialog tasks, our proposed ***** testbed *****, to the community for evaluating dialog systems in a goal-oriented setting. | ||
| E17-2033 2 dataset - a popular belief tracking ***** testbed ***** with dialogs from restaurant information system | ||
| STS | 47 | |
| D19-1410 We evaluate SBERT and SRoBERTa on common ***** STS ***** tasks and transfer learning tasks, where it outperforms other state-of-the-art sentence embeddings methods. | ||
| 2020.signlang-1.29 In this paper we describe ***** STS *****-korpus, a web corpus tool for Swedish Sign Language (***** STS *****) which we have built during the past year, and which is now publicly available on the internet. | ||
| C16-1009 To determine the ***** STS ***** of two texts, hundreds of different ***** STS ***** systems exist, however, for an NLP system designer, it is hard to decide which system is the best one. | ||
| 2020.findings-emnlp.39 Motivated by this, we construct and release new datasets for Korean NLI and ***** STS *****, dubbed KorNLI and Kor***** STS *****, respectively. | ||
| S17-2030 We use referential translation machines for predicting the semantic similarity of text in all *****STS***** tasks which contain Arabic, English, Spanish, and Turkish this year. | ||
| pooling | 47 | |
| P19-1140 Each channel encodes KGs via different relation weighting schemes with respect to self-attention towards KG completion and cross-KG attention for pruning exclusive entities respectively, which are further combined via ***** pooling ***** techniques. | ||
| D19-6203 The experimental results suggest that dependency-based ***** pooling ***** is the best ***** pooling ***** strategy for RE in the biomedical domain, yielding the state-of-the-art performance on two benchmark datasets for this problem. | ||
| P18-1041 Based upon this understanding, we propose two additional ***** pooling ***** strategies over learned word embeddings: (i) a max-***** pooling ***** operation for improved interpretability; and (ii) a hierarchical ***** pooling ***** operation, which preserves spatial (n-gram) information within text sequences. | ||
| 2020.acl-main.267 However, their ***** pooling ***** norms are always fixed and may not be optimal for learning accurate text representations in different tasks. | ||
| 2021.emnlp-main.48 We further demonstrate how the selective ***** pooling ***** can add insights into the CT termination status prediction. | ||
| preposition | 47 | |
| 2021.emnlp-main.766 Furthermore, feedback comments can be made on other grammatical and writing items than ***** preposition ***** use, which is still unaddressed. | ||
| 2020.lrec-1.42 Then, we describe two corpora that we have manually annotated with feedback comments (approximately 50,000 general comments and 6,700 on ***** preposition ***** use). | ||
| C18-1155 We use these representations to decide the correct ***** preposition *****. | ||
| Q13-1019 Further, by jointly predicting the relation, arguments, and their types along with ***** preposition ***** sense, we show that we can not only improve the relation accuracy, but also significantly improve sense prediction accuracy. | ||
| D18-1180 The crucial abstraction of ***** preposition ***** senses as word representations permits their use in downstream applications – phrasal verb paraphrasing and ***** preposition ***** selection – with new state-of-the-art results. | ||
| unimodal | 47 | |
| 2021.emnlp-main.720 Specifically, to improve ***** unimodal ***** representations, a ***** unimodal ***** refinement module is designed to refine modality-specific learning via iteratively updating the distribution with transformer-based attention layers. | ||
| 2020.emnlp-main.62 This function projection modifies model predictions so that cross-modal interactions are eliminated, isolating the additive, ***** unimodal ***** structure. | ||
| 2021.emnlp-main.723 In this work, we propose a framework named MultiModal InfoMax (MMIM), which hierarchically maximizes the Mutual Information (MI) in ***** unimodal ***** input pairs (inter-modality) and between multimodal fusion result and ***** unimodal ***** input in order to maintain task-related information through multimodal fusion. | ||
| W18-3309 Our work also improves feature selection for ***** unimodal ***** sentiment analysis, while proposing a novel and effective multimodal fusion architecture for this task | ||
| N19-1197 We demonstrate the surprising strength of *****unimodal***** baselines in multimodal domains, and make concrete recommendations for best practices in future research. | ||
| Structured | 47 | |
| 2021.fever-1.1 The Fact Extraction and VERification Over Unstructured and ***** Structured ***** information (FEVEROUS) shared task, asks participating systems to determine whether human-authored claims are Supported or Refuted based on evidence retrieved from Wikipedia (or NotEnoughInfo if the claim cannot be verified). | ||
| 2021.acl-long.529 *****Structured***** information is an important knowledge source for automatic verification of factual claims. | ||
| 2020.findings-emnlp.87 *****Structured***** representations like graphs and parse trees play a crucial role in many Natural Language Processing systems. | ||
| D17-3006 *****Structured***** prediction is one of the most important topics in various fields, including machine learning, computer vision, natural language processing (NLP) and bioinformatics. | ||
| 2020.findings-emnlp.406 *****Structured***** prediction is often approached by training a locally normalized model with maximum likelihood and decoding approximately with beam search. | ||
| intermediate | 47 | |
| 2021.naacl-main.381 We argue that the structured ***** intermediate ***** representations enable the model to take better control of the contents (salient facts) and structures (the syntax that connects the facts) when generating the summary. | ||
| 2021.acl-long.322 However, we find that models trained to predict mood often also capture private user identities in their ***** intermediate ***** representations. | ||
| 2021.emnlp-main.603 To tackle the aforementioned problems all together, we propose Universal-KD to match ***** intermediate ***** layers of the teacher and the student in the output space (by adding pseudo classifiers on ***** intermediate ***** layers) via the attention-based layer projection. | ||
| 2019.iwslt-1.14 The task consists in the “direct” translation (i.e. without ***** intermediate ***** discrete representation) of English speech data derived from TED Talks or lectures into German texts. | ||
| 2021.emnlp-main.708 We chose SPARQL because its queries are structurally closer to our ***** intermediate ***** representations (compared to SQL). | ||
| Online | 47 | |
| P18-3008 While growing code-mixed content on ***** Online ***** Social Networks(OSN) provides a fertile ground for studying various aspects of code-mixing, the lack of automated text analysis tools render such studies challenging. | ||
| L16-1131 In this paper we present the OnForumS corpus developed for the shared task of the same name on ***** Online ***** Forum Summarisation (OnForumS at MultiLing'15). | ||
| W17-5217 Patients turn to ***** Online ***** Health Communities not only for information on specific conditions but also for emotional support. | ||
| 2021.nlp4posimpact-1.10 *****Online***** shopping is an ever more important part of the global consumer economy, not just in times of a pandemic. | ||
| S19-2223 In this paper we describe a deep-learning system that competed as SemEval 2019 Task 9 - SubTask A: Suggestion Mining from *****Online***** Reviews and Forums. | ||
| entity disambiguation | 47 | |
| 2021.acl-long.345 By covering the set of entities for polysemous names, AmbER sets act as a challenging test of ***** entity disambiguation *****. | ||
| R17-1038 Typically those studies focus on named entity recognition, entity linking, and ***** entity disambiguation ***** or clustering. | ||
| L14-1036 Within the project, OLD provides a named entity repository for ***** entity disambiguation *****. | ||
| L16-1088 We propose a novel approach that combines two state-of-the-art models ― for ***** entity disambiguation ***** and for paraphrase detection ― to overcome these challenges | ||
| D17-1277 We propose a novel deep learning model for joint document-level ***** entity disambiguation *****, which leverages learned neural representations. | ||
| approaches | 47 | |
| N19-1218 Among the tested models, an LSTM-based approach obtains the best performance for frequent actions and large scene descriptions, but ***** approaches ***** such as logistic regression behave well on infrequent actions. | ||
| N18-1056 We show that BiSparse-Dep can significantly improve performance on this task, compared to ***** approaches ***** based only on lexical context. | ||
| 2021.tacl-1.4 This work builds upon two lines of research: It combines the modeling flexibility of prior work on content-based sparse attention with the efficiency gains from ***** approaches ***** based on local, temporal sparse attention. | ||
| 2020.inlg-1.23 We use the annotations as a basis for examining information included in evaluation reports, and levels of consistency in ***** approaches *****, experimental design and terminology, focusing in particular on the 200+ different terms that have been used for evaluated aspects of quality. | ||
| D19-1540 Evaluations on multiple IR and NLP benchmarks demonstrate state-of-the-art effectiveness compared to ***** approaches ***** that do not exploit pretraining on external data. | ||
| dependency parsers | 47 | |
| 2020.lrec-1.633 A wide variety of transition-based algorithms are currently used for ***** dependency parsers *****. | ||
| L12-1415 We present MaltOptimizer, a freely available tool developed to facilitate parser optimization using the open-source system MaltParser, a data-driven parser-generator that can be used to train ***** dependency parsers ***** given treebank data. | ||
| N19-1253 To test our hypothesis, we train ***** dependency parsers ***** on an English corpus and evaluate their transfer performance on 30 other languages. | ||
| L12-1115 We highlight the use of this resource via three experiments, that (1) compare tagging accuracies across languages, (2) present an unsupervised grammar induction approach that does not use gold standard part-of-speech tags, and (3) use the universal tags to transfer ***** dependency parsers ***** between languages, achieving state-of-the-art results. | ||
| L10-1311 We finally propose a wrapper program that, as a proof of concept, converts output data from different ***** dependency parsers ***** in proprietary XML formats to the GrAF-compliant XML representation | ||
| complexity | 47 | |
| 2020.winlp-1.6 As a result, SIMPLEX-PB 2.0 features much more reliable and numerous candidate substitutions to complex words, as well as word ***** complexity ***** rankings produced by a group underprivileged children. | ||
| 2020.coling-demos.13 In response to the increasing prevalence of cyberbullying, online social networks have increased efforts to clamp down on online abuse but unfortunately, the nature, ***** complexity ***** and sheer volume of cyberbullying means that many cyberbullying incidents go undetected. | ||
| R19-1037 Entropy is used to quantify the information content in each gap, which can be used to estimate ***** complexity *****. | ||
| L06-1365 A terminological database was built by exploiting the computational tools of ItalWordNet (IWN) and its lexical-semantic model (EuroWordNet). This paper concerns the development of database structure and data coding, relevance of the concepts of term and domain, information potential of the terms, ***** complexity ***** of this domain and detailed ontology structuring recently undertaken and still in progress. | ||
| W17-5003 This paper is a preliminary report on using text ***** complexity ***** measurement in the service of a new educational application. | ||
| reference | 47 | |
| W18-6320 In GF, certain words are removed from ***** reference ***** translations and readers are asked to fill the gaps left using the machine-translated text as a hint. | ||
| 2013.iwslt-papers.5 We present a method to estimate the quality of automatic translations when ***** reference ***** translations are not available. | ||
| W17-3515 I present ongoing work on modeling ***** reference ***** with a distributed model aimed at capturing both aspects, and learns to refer directly from ***** reference ***** acts. | ||
| 2020.emnlp-main.111 Then we propose a novel reading comprehension model KMQA, which can fully exploit the structural medical knowledge (i.e., medical knowledge graph) and the ***** reference ***** medical plain text (i.e., text snippets retrieved from ***** reference ***** books). | ||
| 2012.amta-wptp.5 In this paper, we describe a probabilistic approach for learning reinsertion rules for specific languages and MT systems, as well as a method for synthesizing training data from ***** reference ***** translations | ||
| semantic dependency parsing | 47 | |
| S19-2014 The system is applied to the CONLLU format of the input data and is best suited for ***** semantic dependency parsing *****. | ||
| 2021.eacl-main.66 We illustrate MTI with a system that performs part-of-speech tagging, syntactic dependency parsing and ***** semantic dependency parsing *****. | ||
| P18-2106 Previous approaches to multilingual ***** semantic dependency parsing ***** treat languages independently, without exploiting the similarities between semantic structures across languages. | ||
| P18-1173 We show that training with SPIGOT leads to a larger improvement on the downstream task than a modularly-trained pipeline, the straight-through estimator, and structured attention, reaching a new state of the art on ***** semantic dependency parsing *****. | ||
| P17-1186 By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for ***** semantic dependency parsing *****, without using hand-engineered features or syntax. | ||
| syntactic structure | 47 | |
| D19-6123 Recently, neural network models which automatically infer ***** syntactic structure ***** from raw text have started to achieve promising results. | ||
| D18-1239 The model learns a variety of challenging semantic operators, such as quantifiers, disjunctions and composed relations, and infers latent ***** syntactic structure *****. | ||
| N19-1018 The parsing strategy is based on the assumption that most ***** syntactic structure *****s can be parsed incrementally and that the set – the memory of the parser – remains reasonably small on average. | ||
| L06-1471 The development of this corpus was motivated by the need to have both metadata and ***** syntactic structure ***** annotated in order to support synergistic work on speech parsing and structural event detection. | ||
| L12-1613 For each MWE its basic morphological form and the base forms of its constituents are specified but also each MWE is assigned to a class on the basis of its ***** syntactic structure *****. | ||
| semantic change detection | 47 | |
| 2020.emnlp-main.682 Through extensive experimentation under various settings with synthetic and real data we showcase the importance of sequential modelling of word vectors through time for ***** semantic change detection *****. | ||
| 2020.semeval-1.29 This paper presents an approach to lexical ***** semantic change detection ***** based on Bayesian word sense induction suitable for novel word sense identification. | ||
| 2021.eacl-main.10 Our results provide a guide for the application and optimization of lexical ***** semantic change detection ***** models across various learning scenarios. | ||
| 2021.hackashop-1.17 We conduct automatic sentiment and viewpoint analysis of the newly created Slovenian news corpus containing articles related to the topic of LGBTIQ+ by employing the state-of-the-art news sentiment classifier and a system for ***** semantic change detection *****. | ||
| 2021.emnlp-main.847 This study investigates the applicability of *****semantic change detection***** methods in descriptively oriented linguistic research. | ||
| data collection | 47 | |
| 2020.lrec-1.218 A weak TLS algorithm can even match a stronger one by employing a stronger IR method in the ***** data collection ***** phase. | ||
| 2020.acl-demos.5 We present a large improvement over classic search engine baseline on several standard QA datasets and provide the community a collaborative ***** data collection ***** tool to curate the first natural language processing research QA dataset via a community effort. | ||
| L14-1383 Nevertheless, interesting legal stumbling blocks exist, both with respect to the ***** data collection ***** and data sharing phases, due to the strict rules of copyright and database law. | ||
| 2020.trac-1.25 We describe the process of ***** data collection *****, the tagset used for annotation, and issues and challenges faced during the process of annotation. | ||
| D18-1547 Even though machine learning has become the major scene in dialogue research community, the real breakthrough has been blocked by the scale of data available. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. The contribution of this work apart from the open-sourced dataset is two-fold: firstly, a detailed description of the ***** data collection ***** procedure along with a summary of data structure and analysis is provided. | ||
| text understanding | 47 | |
| C18-1027 Although it is known that conceptual complexity plays a significant role in ***** text understanding *****, no attempts have been made at assessing it automatically. | ||
| D17-1021 Resolving abstract anaphora is an important, but difficult task for ***** text understanding *****. | ||
| 2021.naacl-main.362 This result indicates that our approach is effective for procedural ***** text understanding ***** in general. | ||
| 2020.acl-main.668 Machine reading is an ambitious goal in NLP that subsumes a wide range of ***** text understanding ***** capabilities. | ||
| 2021.acl-demo.1 This paper introduces TexSmart, a ***** text understanding ***** system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities. | ||
| applications | 47 | |
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct ***** applications *****: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing behaviors easily missed by human experts. | ||
| P17-1028 We evaluate a suite of methods across two different ***** applications ***** and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. | ||
| 1999.mtsummit-1.88 A multi-user, networkable application, Logos 8 allows Internet or Intranet use of its ***** applications ***** with client interfaces that communicate with dictionaries and translation servers through a common gateway. | ||
| P17-1115 Accurate identification and interpretation of metonymy can be directly beneficial to various NLP ***** applications *****, such as Named Entity Recognition and Geographical Parsing. | ||
| P19-1162 While these approaches offer great geometric insights into unintended biases in the embedding vector space, they fail to offer an interpretable meaning for how the embeddings could cause discrimination in downstream NLP ***** applications *****. | ||
| computer vision | 47 | |
| L16-1220 Recently, with the availability of inexpensive depth cameras, groups from the ***** computer vision ***** community have started collecting corpora with large number of repetitions for sign language recognition research. | ||
| W18-1503 These texts are made possible by advances and cross-disciplinary approaches in natural language processing, generation, and ***** computer vision *****. | ||
| P19-1555 While data augmentation is an important trick to boost the accuracy of deep learning methods in ***** computer vision ***** tasks, its study in natural language tasks is still very limited. | ||
| E17-1104 However, these architectures are rather shallow in comparison to the deep convolutional networks which have pushed the state-of-the-art in ***** computer vision *****. | ||
| 2020.lrec-1.185 With the tremendous success of deep learning models on *****computer vision***** tasks, there are various emerging works on the Natural Language Processing (NLP) task of Text Classification using parametric models. | ||
| answer sentence selection | 47 | |
| D18-1210 In our benchmarks on four different tasks, including ontology classification, sentiment analysis, ***** answer sentence selection *****, and paraphrase identification, our proposed model, a modified CNN with context-sensitive filters, consistently outperforms the standard CNN and attention-based CNN baselines. | ||
| D18-1109 We show that simple CNN architectures equipped with recurrent neural filters (RNFs) achieve results that are on par with the best published ones on the Stanford Sentiment Treebank and two ***** answer sentence selection ***** datasets. | ||
| N18-1058 We also achieved new state-of-the-art results on two competitive ***** answer sentence selection ***** tasks: WikiQA and TrecQA. | ||
| E17-1002 We implemented and evaluated a binary tree model of NTI, showing the model achieved the state-of-the-art performance on three different NLP tasks: natural language inference, ***** answer sentence selection *****, and sentence classification, outperforming state-of-the-art recurrent and recursive neural networks. | ||
| 2021.acl-long.252 This paper studies joint models for selecting correct answer sentences among the top k provided by ***** answer sentence selection ***** (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. | ||
| generative lexicon | 47 | |
| L06-1281 In this paper we describe the structure and development of the Brandeis Semantic Ontology (BSO), a large ***** generative lexicon ***** ontology and lexical database. | ||
| L16-1156 In this paper, corpus annotation for argument mining is first developed, then we show how the ***** generative lexicon ***** approach must be adapted and how it can be paired with language processing patterns to extract and specify the nature of arguments. | ||
| L12-1259 The goal of this paper is to provide an annotation scheme for compounds based on ***** generative lexicon ***** theory (GL, Pustejovsky, 1995; Bassac and Bouillon, 2001). | ||
| L10-1492 Our corpus consists of ca 4000 sentences from the PAROLE sottoinsieme corpus (Bindi et al. 2000) annotated with Selection and Coercion relations among verb-noun pairs formatted in XML according to the *****Generative Lexicon***** Mark-up Language (GLML) format (Pustejovsky et al., 2008). | ||
| W19-3318 These representations use a *****Generative Lexicon*****-inspired subevent structure to track attributes of event participants across time, highlighting oppositions and temporal and causal relations among the subevents. | ||
| digital | 47 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in ***** digital ***** humanities and computational social science. | ||
| N18-3019 Extensive experimentation over a dataset of 10 domains drawn from data relevant to our commercial personal ***** digital ***** assistant shows that our BoE models outperform the baseline models with a statistically significant average margin of 5.06% in absolute F1-score when training with 2000 instances per domain, and achieve an even higher improvement of 12.16% when only 25% of the training data is used. | ||
| C16-1320 As an additional objective, we discuss two novel use cases including automatically extracting links to public datasets from the proceedings, which would further accelerate the advancement in ***** digital ***** libraries. | ||
| W19-6117 The essential resources include a morphological analyzer, ***** digital ***** dictionaries, and corpora of Sakha texts. | ||
| L16-1300 Text analysis methods widely used in ***** digital ***** humanities often involve word co-occurrence, e.g. | ||
| hypernyms | 46 | |
| P17-1128 Finding the correct ***** hypernyms ***** for entities is essential for taxonomy learning, fine-grained entity categorization, query understanding, etc. | ||
| D17-2016 In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as ***** hypernyms *****, usage examples, and images. | ||
| L16-1722 The second possibility seems to be the most likely, even though ROOT9 can be trained on negative examples (i.e., switched ***** hypernyms *****) to drastically reduce this bias. | ||
| 2020.cogalex-1.6 Although the system reached rather weak results for the subcategories of synonyms, antonyms and ***** hypernyms *****, with some differences from one language to another, it is able to measure general semantic associations (as being random or not-random) with an F1 score close to 0.80. | ||
| 2021.blackboxnlp-1.20 We find that, in a setting where all ***** hypernyms ***** are guessable via prompting, BERT knows ***** hypernyms ***** with up to 57% accuracy | ||
| implementations | 46 | |
| 2021.emnlp-main.671 We present an analysis of possible ***** implementations ***** of dual decoding, and experiment with four applications. | ||
| L16-1239 Moreover, we contrast these with old(er) methods and ***** implementations ***** for POS tagging. | ||
| W16-4106 Our study addresses these concerns by comparing several ***** implementations ***** of prominent sentence processing theories on an exploratory corpus and evaluating the most successful of these on a confirmatory corpus, using a new self-paced reading corpus of seemingly natural narratives constructed to contain an unusually high proportion of memory-intensive constructions. | ||
| 2020.cl-2.6 The development of GF started in 1998, first as a tool for controlled language ***** implementations *****, where it has gained an established position in both academic and commercial projects. | ||
| W17-4804 We provide ***** implementations ***** of all the methods described and they are freely available as an open-source framework | ||
| Notably | 46 | |
| 2020.acl-main.45 ***** Notably *****, we are able to achieve SOTA results on CTB5, CTB6 and UD1.4 for the part of speech tagging task; SOTA results on CoNLL03, OntoNotes5.0, MSRA and OntoNotes4.0 for the named entity recognition task; along with competitive results on the tasks of machine reading comprehension and paraphrase identification. | ||
| 2020.acl-main.574 ***** Notably *****, our methods surpass the model fine-tuned on pre-trained language models without external resource. | ||
| 2020.acl-main.702 ***** Notably *****, this percentage has not improved since the mid 2000s. | ||
| 2020.wnut-1.16 ***** Notably ***** enough, on dialogue clarity and optimality, the two paraphrase sources' human-perceived quality does not differ significantly. | ||
| 2020.emnlp-main.408 ***** Notably *****, we improve the best known results on DSTC2 by up to 5 points for slot-carryover | ||
| Nowadays | 46 | |
| 2020.inlg-1.3 ***** Nowadays *****, they are used as summaries presenting all the steps of a judicial case. | ||
| N19-2009 ***** Nowadays *****, more and more customers browse and purchase products in favor of using mobile E-Commerce Apps such as Taobao and Amazon. | ||
| 2020.iwltp-1.11 ***** Nowadays ***** the scarcity and dispersion of open-source NLP resources and tools in and for African languages make it difficult for researchers to truly fit these languages into current algorithms of artificial intelligence, resulting in the stagnation of these numerous languages, as far as technological progress is concerned. | ||
| L08-1404 ***** Nowadays *****, there are hundreds of Natural Language Processing applications and resources for different languages that are developed and/or used, almost exclusively with a few but notable exceptions, by their creators. | ||
| 2021.triton-1.3 ***** Nowadays ***** there is a pressing need to develop interpreting-related technologies, with practitioners and other end-users increasingly calling for tools tailored to their needs and their new interpreting scenarios | ||
| masking | 46 | |
| 2020.wmt-1.84 First, a BERT-like cross-lingual language model is pre-trained by randomly ***** masking ***** target sentences alone. | ||
| 2020.emnlp-main.174 Extensive evaluations of ***** masking ***** BERT, RoBERTa, and DistilBERT on eleven diverse NLP tasks show that our ***** masking ***** scheme yields performance comparable to finetuning, yet has a much smaller memory footprint when several tasks need to be inferred. | ||
| 2020.emnlp-main.707 We introduce X-LXMERT, an extension to LXMERT with training refinements including: discretizing visual representations, using uniform ***** masking ***** with a large range of ***** masking ***** ratios and aligning the right pre-training datasets to the right objectives which enables it to paint. | ||
| 2020.emnlp-main.722 Building upon entity-level masked language models, our first contribution is an entity ***** masking ***** scheme that exploits relational knowledge underlying the text. | ||
| 2021.wnut-1.21 Based on these findings, we propose ***** masking ***** methods using Wikidata to mitigate the influence of person names and validate whether they make fake news detection models robust through experiments with in-domain and out-of-domain data | ||
| benchmarking | 46 | |
| 2020.blackboxnlp-1.30 Without ablation studies ***** benchmarking ***** the search algorithm change with the search space held constant, one cannot tell if an increase in attack success rate is a result of an improved search algorithm or a less restrictive search space. | ||
| E17-2038 We also present the first ***** benchmarking ***** results on translating to and from Arabic for 22 European languages. | ||
| 2020.alta-1.7 However, there is a lack of ***** benchmarking ***** platform to provide a unified environment under consistent evaluation criteria for ABSA, resulting in the difficulties for fair comparisons. | ||
| W17-5705 This paper describes the characteristics of the created datasets and reports on our ***** benchmarking ***** experiments on word-level QE, sentence-level QE, and APE conducted using the created datasets. | ||
| 2021.eacl-main.257 We use this dataset and other publicly available datasets to conduct a comprehensive ***** benchmarking ***** study on using various state-of-the-art multilingual pre-trained models for task-oriented semantic parsing | ||
| controllable | 46 | |
| 2020.findings-emnlp.190 In this work, we formulate high-fidelity NLG as generation from logical forms in order to obtain ***** controllable ***** and faithful generations. | ||
| 2021.acl-short.88 We explore targeted question generation as a ***** controllable ***** sequence generation task. | ||
| 2021.emnlp-main.335 We also consider a new headline generation strategy that takes advantage of the ***** controllable ***** generation order of Transformer. | ||
| 2021.naacl-demos.4 DiSCoL is an open-domain dialogue system that leverages conversational lines (briefly convlines) as ***** controllable ***** and informative content-planning elements to guide the generation model produce engaging and informative responses. | ||
| 2021.emnlp-main.617 Our approach combines the strengths of both classical slot filling approaches (that are generally ***** controllable *****) and modern neural NLG approaches (that are generally more natural and accurate) | ||
| obtained | 46 | |
| L14-1302 According to the ***** obtained ***** results, the developed technique allows to increase the emotion recognition performance by up to 26.08 relative improvement in accuracy. | ||
| 2020.lrec-1.499 Our experiments confirm that the ***** obtained ***** bilingual dictionaries outperform previously-available ones, and that word embeddings from a low-resource language can benefit from resource-rich closely-related languages when they are aligned together. | ||
| 2020.calcs-1.8 In this paper, we explore the methods of obtaining parse trees of code-mixed sentences and analyse the ***** obtained ***** trees. | ||
| R19-1124 Evaluation shows that about 75% of the ***** obtained ***** expressions are correct, actual errors are rare. | ||
| 2020.lrec-1.253 Based on the ***** obtained ***** results, the best recall was 1.000, best precision was 0.940, and best F-measure was 0.895 | ||
| Bengali | 46 | |
| 2005.mtsummit-posters.8 The present work describes a Phrasal Example Based Machine Translation system from English to ***** Bengali ***** that identifies the phrases in the input through a shallow analysis, retrieves the target phrases using a Phrasal Example base and finally combines the target language phrases employing some heuristics based on the phrase ordering rules for ***** Bengali *****. | ||
| 2019.icon-1.16 We also obtain equivalent datasets for ***** Bengali ***** and English from a collaboration. | ||
| P17-1136 To train the model on ***** Bengali *****, we develop a gold lemma annotated dataset (having 1,702 sentences with a total of 20,257 word tokens), which is an additional contribution of this work. | ||
| 2021.calcs-1.16 Abusive text detection in low-resource languages such as ***** Bengali ***** is a challenging task due to the inadequacy of resources and tools | ||
| 2018.gwc-1.1 Despite being a popular language in the world, the *****Bengali***** language lacks in having a good wordnet. | ||
| subjectivity | 46 | |
| L10-1250 So far, there has been little work examining the distinction between definite polar ***** subjectivity ***** and indefinite polar ***** subjectivity *****. | ||
| L16-1701 We find that the task can be annotated consistently over time, but that ***** subjectivity ***** issues impacts the quality of the annotation. | ||
| W17-1318 Our best models achieve significantly above the baselines, with 67.93% and 69.37% accuracies for ***** subjectivity ***** and sentiment classification respectively. | ||
| W19-1310 The experiments demonstrate that the proposed learning-to-rank method outperforms the baseline method in ranking documents based on their ***** subjectivity ***** degree. | ||
| L08-1086 This paper introduces a method for creating a *****subjectivity***** lexicon for languages with scarce resources. | ||
| disambiguating | 46 | |
| W17-4603 We recorded a native English speaker saying several of each type of sentence both with and without ***** disambiguating ***** contextual information. | ||
| N18-1071 We present a gradient-tree-boosting-based structured learning model for jointly ***** disambiguating ***** named entities in a document. | ||
| L06-1207 Thus, discovering and ***** disambiguating ***** acronyms and their expanded forms are essential aspects of text mining and terminology management. | ||
| 2021.emnlp-main.344 Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires ***** disambiguating ***** semantic, syntactic, and phonetic wordplays, as well as world knowledge | ||
| 1997.iwpt-1.5 In this paper, we propose a *****disambiguating***** technique called controlled disjunctions. | ||
| inputs | 46 | |
| P19-1351 Hence, we propose a combined Visual and Textual Question Answering (VTQA) model which takes as input a paragraph caption as well as the corresponding image, and answers the given question based on both ***** inputs *****. | ||
| D19-1132 Moreover, we also explore conditioning the controller on the source ***** inputs ***** of the target task, since certain strategies may not apply to ***** inputs ***** that do not contain that strategy's required linguistic features. | ||
| 2021.fever-1.4 The first one allows having a robust retrieval of full evidence sets, while the second one enables entailment to take full advantage of noisy evidence ***** inputs *****. | ||
| 2010.amta-papers.11 Finally, the PBSMT system is tuned and tested on the generated word lattices to show the benefits of adding potential source-side reorderings in the ***** inputs *****. | ||
| 2020.acl-main.603 In this paper, we study machine reading comprehension (MRC) on long texts: where a model takes as ***** inputs ***** a lengthy document and a query, extracts a text span from the document as an answer | ||
| interlingual | 46 | |
| 1997.mtsummit-workshop.3 We take the view that the two types of information—shallower, transfer-like knowledge as well as deeper, compositional knowledge—can be reconciled in ***** interlingual ***** machine translation, the former for overcoming the intractability of LCS-based lexical selec- tion, and the latter for relating the underlying semantics of two words cross-linguistically. | ||
| L06-1396 The main target is to ease the application of algorithms for monolingual and ***** interlingual ***** studies. | ||
| 2020.emnlp-main.476 Furthermore, the ***** interlingual ***** space of the M2 allows convenient modification of the model. | ||
| 1991.mtsummit-papers.3 ULTRA (Universal Language TRAnslator) is a multilingual, ***** interlingual ***** machine translation system currently under development at the Computing Research Laboratory at New Mexico State University | ||
| 1997.mtsummit-workshop.11 This paper describes the outline of the EDR Concept Dictionary and gives some examples of *****interlingual***** representations as the semantic representations for an input sentence. | ||
| Romance | 46 | |
| 2020.lrec-1.394 In this paper, we apply a method for producing related words based on sequence labeling, aiming to fill in the gaps in incomplete cognate sets in ***** Romance ***** languages with Latin etymology (producing Romanian cognates that are missing) and to reconstruct uncertified Latin words. | ||
| 2021.sigmorphon-1.18 Accuracy for one-shot transfer can be unexpectedly high for some target languages (88% in Shona) and language families (53% across ***** Romance *****). | ||
| 2020.lrec-1.120 The final outcome of the project Open Access Database: Adjective-Adverb Interfaces in ***** Romance ***** is an annotated and lemmatised corpus of various linguistic phenomena related to ***** Romance ***** adjectives with adverbial functions. | ||
| 2021.wmt-1.43 The system aims to solve the Subtask 2: Wikipedia cultural heritage articles, which involves translation in four ***** Romance ***** languages: Catalan, Italian, Occitan and Romanian | ||
| 2020.vardial-1.13 Occitan is a *****Romance***** language spoken mainly in the south of France. | ||
| caption | 46 | |
| P19-1351 Hence, we propose a combined Visual and Textual Question Answering (VTQA) model which takes as input a paragraph ***** caption ***** as well as the corresponding image, and answers the given question based on both inputs. | ||
| 2021.maiworkshop-1.6 In this paper, we proposed two training technique for making effective use of multiple reference ***** caption *****s: 1) validity-based ***** caption ***** sampling (VBCS), which prioritizes the use of ***** caption *****s that are estimated to be highly valid during training, and 2) weighted ***** caption ***** smoothing (WCS), which applies smoothing only to the relevant words in the reference ***** caption ***** to reflect multiple reference ***** caption *****s simultaneously. | ||
| N18-4020 While, there has been advanced research in the English ***** caption ***** generation, research on generating Arabic descriptions of an image is extremely limited. | ||
| D19-1517 We also establish a baseline of step ***** caption ***** generation for future comparison. | ||
| 2021.emnlp-demo.23 We use the re-translation strategy to translate the streamed speech, resulting in ***** caption ***** flicker | ||
| microblog | 46 | |
| 2020.crac-1.6 This article introduces TwiConv, an English coreference-annotated corpus of ***** microblog ***** conversations from Twitter. | ||
| J18-4008 Conventional topic models are ineffective for topic extraction from ***** microblog ***** messages, because the data sparseness exhibited in short messages lacking structure and contexts results in poor message-level word co-occurrence patterns. | ||
| 2012.amta-government.3 Social media covers a broad category of communications formats, ranging from threaded conversations on Facebook, to ***** microblog ***** and short message content on platforms like Twitter and Weibo – but it also includes user-generated comments on YouTube, as well as the contents of the video itself, and even includes `traditional' blogs and forums. | ||
| D18-1031 Most of existing personalized ***** microblog ***** sentiment classification methods suffer from the insufficiency of discriminative tweets for personalization learning. | ||
| 2021.acl-long.530 Here we create a corpus of ***** microblog ***** clusters from three different domains and time windows and define the task of evaluating thematic coherence | ||
| graphical | 46 | |
| W17-4308 Advances in neural variational inference have facilitated the learning of powerful directed ***** graphical ***** models with continuous latent variables, such as variational autoencoders. | ||
| D19-1048 In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and explore several ***** graphical ***** model configurations. | ||
| L12-1523 We thus believe that ***** graphical ***** models are a promising avenue of research for automatic document zoning. | ||
| N19-1203 We develop a neural hybrid ***** graphical ***** model that explicitly reconstructs morphological features before predicting the inflected forms, and compare this to a system that directly predicts the inflected forms without relying on any morphological annotation. | ||
| P18-3019 The main contributions of this paper are: (1) positional tokenization, which incorporates the sequential notion; (2) ***** graphical ***** error modelling, which calculates the morphological shifts | ||
| compression | 46 | |
| 2021.acl-long.86 While the performance of these models on standard benchmarks has scaled with size, ***** compression ***** techniques such as knowledge distillation have been key in making them practical. | ||
| C18-1091 The central idea of Operation Network is to model the sentence ***** compression ***** process as an editing procedure. | ||
| 2021.eacl-main.238 Existing knowledge distillation methods used for model ***** compression ***** cannot be directly applied to train student models with reduced vocabulary sizes. | ||
| 2021.repl4nlp-1.32 However, due to deployment constraints in edge devices, there has been a rising interest in the ***** compression ***** of these models to improve their inference time and memory footprint. | ||
| 2021.acl-long.334 The rapid development of large pre-trained language models has greatly increased the demand for model ***** compression ***** techniques, among which quantization is a popular solution | ||
| neural architectures | 46 | |
| 2020.emnlp-main.396 We analyze span ID tasks via performance prediction, estimating how well ***** neural architectures ***** do on different tasks. | ||
| 2021.emnlp-main.273 This dissertation set out to investigate the usefulness of such untapped information for ***** neural architectures *****. | ||
| W18-3217 Owing to the sparse and non-linear relationships between words in Twitter data, we explored ***** neural architectures ***** that are capable of non-linearities fairly well. | ||
| 2021.emnlp-main.676 By contrast, to date, ***** neural architectures ***** without manual feature engineering have been less explored for this task. | ||
| W19-8608 Encoder-decoder based ***** neural architectures ***** serve as the basis of state-of-the-art approaches in end-to-end open domain dialog systems | ||
| manual | 46 | |
| L16-1287 Previous studies have only applied ***** manual ***** content analysis on a small scale to reveal such a bias in the narrative section of annual financial reports. | ||
| K19-1096 However, this analysis is ***** manual ***** and labor-intensive, thus making it impractical as a first-response tool for newly-discovered troll farms. | ||
| W16-3702 The G2P bootstrapping experimental results were measured with both automatic phoneme error rate (PER) calculation and also ***** manual ***** checking in terms of voiced/unvoiced, tones, consonant and vowel errors. | ||
| 2018.gwc-1.24 The paper presents a feature-based model of equivalence targeted at (***** manual *****) sense linking between Princeton WordNet and plWordNet. | ||
| L10-1500 We evaluate the parsers against ***** manual ***** gold standard annotations and find that the projected parsers substantially outperform our heuristic baselines by 9―25% UAS, which corresponds to a 21―43% reduction in error rate | ||
| classification tasks | 46 | |
| D18-1221 The ideas of this work are demonstrated on large-scale text-to-entity mapping and entity ***** classification tasks *****, with state of the art results. | ||
| 2021.eacl-main.8 This paper aims to address two weaknesses of previous work: (1) existing fine-tuning strategies for early exiting models fail to take full advantage of BERT; (2) methods to make exiting decisions are limited to ***** classification tasks *****. | ||
| 2020.findings-emnlp.130 Based on our results we encourage using data balancing prior to training for text ***** classification tasks *****. | ||
| W18-6127 To better understand how CNNs and RNNs differ in handling long sequences, we use them for text ***** classification tasks ***** in several character-level social media datasets. | ||
| W19-3020 The NB model had the best performance in two additional binary-***** classification tasks *****, i.e., no risk vs. flagged risk (any risk level other than no risk) with F1 score 0.836 and no or low risk vs. urgent risk (moderate or severe risk) with F1 score 0.736. | ||
| speech corpus | 46 | |
| L16-1741 Previously, a seniors' ***** speech corpus ***** named S-JNAS was developed, but the average age of the participants was 67.6 years, while the target age for nursing home care is around 75 years old, much higher than that of the S-JNAS samples. | ||
| L16-1309 This paper describes speech data recording, processing and annotation of a new ***** speech corpus ***** CoRuSS (Corpus of Russian Spontaneous Speech), which is based on connected communicative speech recorded from 60 native Russian male and female speakers of different age groups (from 16 to 77). | ||
| 2021.alta-1.8 Experimental results show that a phone recognition based approach provides better overall performances than Dynamic Time Warping when working with clean data, and highlight the benefits of each methods for two types of ***** speech corpus *****. | ||
| L08-1150 The target corpus for the word-level dependency annotation is a large spontaneous Japanese-***** speech corpus *****, the Corpus of Spontaneous Japanese (CSJ). | ||
| P19-1107 We evaluated the models on the Switchboard conversational ***** speech corpus ***** and show that our model outperforms standard end-to-end speech recognition models. | ||
| document retrieval | 46 | |
| W18-2313 Our method, evaluated using the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines using human expert–assigned concept tags for the queries, run on top of a standard Okapi BM25–based ***** document retrieval ***** system. | ||
| L10-1319 In this paper, we extend the method by (1) using neighboring context to index the target passage, and (2) applying a language modeling approach for ***** document retrieval *****. | ||
| W18-5516 The shared task organizers provide a large-scale dataset for the consecutive steps involved in claim verification, in particular, ***** document retrieval *****, fact extraction, and claim classification. | ||
| 2021.nlp4if-1.4 Fact Extraction and VERification (FEVER) is a recently introduced task that consists of the following subtasks (i) ***** document retrieval *****, (ii) sentence retrieval, and (iii) claim verification. | ||
| D19-1352 This paper applies BERT to ad hoc ***** document retrieval ***** on news articles, which requires addressing two challenges: relevance judgments in existing test collections are typically provided only at the document level, and documents often exceed the length that BERT was designed to handle. | ||
| document summarization | 46 | |
| Q13-1008 Supervised learning methods and LDA based topic model have been successfully applied in the field of multi-***** document summarization *****. | ||
| 2020.acl-main.124 We study unsupervised multi-***** document summarization ***** evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g. | ||
| L16-1452 To our knowledge, this is the first study that investigates using dependency tree based sentence similarity for multi-***** document summarization *****. | ||
| 2020.findings-emnlp.231 We also verify the helpfulness of single-***** document summarization ***** to abstractive multi-***** document summarization ***** task. | ||
| L14-1145 This article presents the Polish Summaries Corpus, a new resource created to support the development and evaluation of the tools for automated single-***** document summarization ***** of Polish. | ||
| dialog system | 46 | |
| L12-1156 In this paper, we present the acquisition and labeling processes of the EDECAN-SPORTS corpus, which is a corpus that is oriented to the development of multimodal ***** dialog system *****s acquired in Spanish and Catalan. | ||
| L10-1398 In this paper, we propose an estimation method of user satisfaction for a spoken ***** dialog system ***** using an N-gram-based dialog history model. | ||
| D18-1077 The main goal of this paper is to develop out-of-domain (OOD) detection for ***** dialog system *****s. | ||
| 2021.nlp4convai-1.26 This increase in usage of code-mixed language has prompted ***** dialog system *****s in a similar language. | ||
| P17-1062 HCNs attain state-of-the-art performance on the bAbI dialog dataset (Bordes and Weston, 2016), and outperform two commercially deployed customer-facing ***** dialog system *****s at our company. | ||
| types | 46 | |
| 2020.sigdial-1.29 A total of 20 papers from the last two years are surveyed to analyze three ***** types ***** of evaluation protocols: automated, static, and interactive. | ||
| 2020.lrec-1.647 The following resources are open-sourced with ÆTHEL: the lexical mappings between words and ***** types *****, a subset of the dataset consisting of 7 924 semantic parses, and the Python code that implements the extraction algorithm. | ||
| 1963.earlymt-1.30 Formats and functions dealing with set-relations, part-whole and numeric relations, and left-toright spatial relations have been included in the system, which is being expanded to handle other ***** types ***** of relations. | ||
| 2020.computerm-1.12 The results show a lot of variation between different systems and illustrate how some methodologies reach higher precision or recall, how different systems extract different ***** types ***** of terms, how some are exceptionally good at finding rare terms, or are less impacted by term length. | ||
| 2021.acl-long.523 While counterfactual examples are useful for analysis and training of NLP models, current generation methods either rely on manual labor to create very few counterfactuals, or only instantiate limited ***** types ***** of perturbations such as paraphrases or word substitutions. | ||
| communication | 46 | |
| C16-1037 Speech prosody is known to be central in advanced ***** communication ***** technologies. | ||
| 2001.mtsummit-eval.12 During the experiment, MT output of three different systems is compared in order to establish which MT system best serves the organisation's multilingual ***** communication ***** and information needs. | ||
| L06-1229 This paper presents research on Greeklish, that is, a transliteration of Greek using the Latin alphabet, which is used frequently in Greek e-mail ***** communication *****. | ||
| 2021.acl-long.570 Signed languages are the primary means of ***** communication ***** for many deaf and hard of hearing individuals. | ||
| 2021.calcs-1.2 Code-mixing is a frequent ***** communication ***** style among multilingual speakers where they mix words and phrases from two different languages in the same utterance of text or speech. | ||
| financial | 46 | |
| 2020.fnp-1.1 FNS summarisation shared task is the first to target ***** financial ***** annual reports. | ||
| P19-1038 Nowadays, firm CEOs communicate information not only verbally through press releases and ***** financial ***** reports, but also nonverbally through investor meetings and earnings conference calls. | ||
| P17-1157 Volatility prediction—an essential concept in ***** financial ***** markets—has recently been addressed using sentiment analysis methods. | ||
| C16-2010 In this study we develop a system that tags and extracts ***** financial ***** concepts called ***** financial ***** named entities (FNE) along with corresponding numeric values – monetary and temporal. | ||
| 2021.eacl-main.122 For example, a ***** financial ***** news article mentioned “Apple Inc.” may be also related to Samsung, even though Samsung is not explicitly mentioned in this article. | ||
| dictionary induction | 46 | |
| K18-1021 Most recent approaches to bilingual ***** dictionary induction ***** find a linear alignment between the word vector spaces of two languages. | ||
| Q18-1014 Most existing methods for automatic bilingual ***** dictionary induction ***** rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries. | ||
| 2020.coling-main.531 Bilingual ***** dictionary induction ***** (BDI) is the task of accurately translating words to the target language. | ||
| 2020.lrec-1.499 Then, we evaluate the results using the bilingual ***** dictionary induction ***** task. | ||
| P18-1072 We show that a simple trick, exploiting a weak supervision signal from identical words, enables more robust induction and establish a near-perfect correlation between unsupervised bilingual ***** dictionary induction ***** performance and a previously unexplored graph similarity metric. | ||
| bioasq challenge | 46 | |
| W18-5301 This paper presents the results of the sixth edition of the *****BioASQ challenge*****. | ||
| W18-5310 Generating a non-redundant, human-readable summary that satisfies the information need of a given biomedical question is the focus of the Ideal Answer Generation task, part of the *****BioASQ challenge*****. | ||
| W17-2308 Macquarie University's contribution to the *****BioASQ challenge***** (Task 5b Phase B) focused on the use of query-based extractive summarisation techniques for the generation of the ideal answers. | ||
| W17-2306 The goal of the *****BioASQ challenge***** is to engage researchers into creating cuttingedge biomedical information systems. | ||
| W17-2307 In this paper, we describe our participation in phase B of task 5b of the fifth edition of the annual *****BioASQ challenge*****, which includes answering factoid, list, yes-no and summary questions from biomedical data. | ||
| antecedents | 45 | |
| 1991.iwpt-1.8 Binding of anaphors and coreference of pronouns is extensively shown to depend on structural properties of f-structures, on thematic roles and grammatical functions associated with the ***** antecedents ***** or controller, on definiteness of NPs and mood of clausal f-structures. | ||
| D17-1135 The fundamental reason is that zero pronouns have no descriptive information, which brings difficulty in explicitly capturing their semantic similarities with ***** antecedents *****. | ||
| 2020.semeval-1.54 We consider detection of the span of ***** antecedents ***** and consequents in argumentative prose a structural, grammatical task. | ||
| L06-1147 Corpus investigations give rise to the supposition that logical text structure influences the search scope of candidates for ***** antecedents *****. | ||
| 2020.crac-1.8 This paper critically examines the assumption prevalent in previous research that SNs are typically accompanied by a specific antecedent, arguing that SNs like “issue” and “decision” are frequently used to refer, not to specific ***** antecedents *****, but to global discourse topics, in which case they are out of reach of previously proposed resolution strategies that are tailored to SNs with explicit ***** antecedents *****. | ||
| generalizing | 45 | |
| 2017.iwslt-1.11 We explore three rare word ***** generalizing ***** schemes using part-of-speech (POS) tokens. | ||
| W19-4810 We train six high performing neural network models on different datasets and show that each one of these has problems of ***** generalizing ***** when we replace the original test set with a test set taken from another corpus designed for the same task. | ||
| D19-1204 CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot value pairs, and (2) because testing is done on unseen databases, success requires ***** generalizing ***** to new domains. | ||
| K19-1009 This model is substantially better at ***** generalizing ***** to unseen combinations of concepts compared to state-of-the-art captioning models. | ||
| 2020.acl-main.105 Using this framework, we point out and discuss the difficulties encountered with supplementing documents with (not present in text) keyphrases, and ***** generalizing ***** models across domains. | ||
| optimizing | 45 | |
| 2020.findings-emnlp.14 Several approaches to neural speed reading have been presented at major NLP and machine learning conferences in 2017–20; i.e., “human-inspired” recurrent network architectures that learn to “read” text faster by skipping irrelevant words, typically ***** optimizing ***** the joint objective of minimizing classification error rate and FLOPs used at inference time. | ||
| N19-1117 Unlike previous methods, however, we train with an end-to-end predictive objective ***** optimizing ***** the perplexity of text. | ||
| N18-2010 Many simple NLG models are based on recurrent neural networks (RNN) and sequence-to-sequence (seq2seq) model, which basically contains an encoder-decoder structure; these NLG models generate sentences from scratch by jointly ***** optimizing ***** sentence planning and surface realization using a simple cross entropy loss training criterion. | ||
| C18-1297 Starting with a set of extracted textual fragments related to the snippet based on the query words in it, the proposed approach builds the desired text from these fragment by simultaneously ***** optimizing ***** the information coverage, relevance, diversity and coherence in the generated content. | ||
| 2021.emnlp-main.807 Compared to existing baselines, greedy rationalization is best at ***** optimizing ***** the sequential objective and provides the most faithful rationales. | ||
| combinatorial | 45 | |
| 2005.mtsummit-papers.32 LOD uses an agglomerative method to attack the ***** combinatorial ***** explosion that results when generating candidate phrase translations. | ||
| L04-1243 Thirdly, we describe their lexical transducers (i.e., morphological rules) to recognize all inflected forms of lemmas for nouns and adverbs according to the ***** combinatorial ***** restrictions between lemmas and their inflectional suffixes. | ||
| S19-2079 For task A, we trained a Support Vector Machine using a ***** combinatorial ***** framework, whereas for task B we followed a multi-labeled approach using the Random Forest classifier. | ||
| D19-6302 Recent advances in deep learning have shown promises in solving complex ***** combinatorial ***** optimization problems, such as sorting variable-sized sequences. | ||
| 2020.emnlp-main.74 In our model, besides matching T and S predictions we have a ***** combinatorial ***** mechanism to inject layer-level supervision from T to S. In this paper, we target low-resource settings and evaluate our translation engines for Portuguese→English, Turkish→English, and English→German directions | ||
| lattice | 45 | |
| N19-2001 We propose an approach that incrementally builds a subset vocabulary from the word ***** lattice *****. | ||
| 2007.iwslt-1.20 The main contribution of this paper concerns the proposal of a ***** lattice ***** decomposition algorithm that allows transforming a word ***** lattice ***** into a sub word ***** lattice ***** compatible with our MT model that uses word segmentation on the Arabic part. | ||
| 2020.emnlp-main.471 We solve difficult word-based substitution codes by constructing a decoding ***** lattice ***** and searching that ***** lattice ***** with a neural language model. | ||
| Q15-1026 The ***** lattice ***** parser predicts a dependency tree over a path in the ***** lattice ***** and thus solves the joint task of segmentation, morphological analysis, and syntactic parsing. | ||
| 2007.iwslt-1.3 By introducing a new ***** lattice ***** weighting factor and by reordering the training source data, an improvement is reported on TER and BLEU. | ||
| UCCA | 45 | |
| N19-1047 We target this gap, and take Universal Dependencies (UD) and ***** UCCA ***** as a test case. | ||
| S19-2015 This paper describes our recursive system for SemEval-2019 Task 1: Cross-lingual Semantic Parsing with ***** UCCA *****. | ||
| D19-1392 Experiments separately conducted on three broad-coverage semantic parsing tasks – AMR, SDP and ***** UCCA ***** – demonstrate that our attention-based neural transducer improves the state of the art on both AMR and ***** UCCA *****, and is competitive with the state of the art on SDP. | ||
| 2020.wmt-1.104 It leverages the power of ***** UCCA ***** to identify semantic core words, and then calculates sentence similarity scores on the overlap of semantic core words. | ||
| S19-2001 ***** UCCA ***** is a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. | ||
| triplet | 45 | |
| 2020.coling-main.548 However,and in contrast with ***** triplet ***** networks, the proposed method uses a novel deep architecture that better exploits the particularities of text and takes into consideration complementary relatedness measures. | ||
| 2021.emnlp-main.467 Many recent successes in sentence representation learning have been achieved by simply fine-tuning on the Natural Language Inference (NLI) datasets with ***** triplet ***** loss or siamese loss. | ||
| 2021.bionlp-1.2 We propose a vector-space model for concept normalization, where mentions and concepts are encoded via transformer networks that are trained via a ***** triplet ***** objective with online hard ***** triplet ***** mining. | ||
| P18-2009 We show that the ***** triplet ***** network learns useful thematic metrics, that significantly outperform state-of-the-art semantic similarity methods and multipurpose embeddings on the task of thematic clustering of sentences. | ||
| 2021.semeval-1.57 The task is divided into three sub-tasks: extracting contribution sentences that show important contributions in the research article, extracting phrases from the contribution sentences, and predicting the information units in the research article together with ***** triplet ***** formation from the phrases. | ||
| optimized | 45 | |
| 2021.emnlp-main.541 Automatic Speech Recognition (ASR) systems are often ***** optimized ***** to work best for speakers with canonical speech patterns. | ||
| S18-2007 We investigate entity linking in the context of question answering task and present a jointly ***** optimized ***** neural architecture for entity mention detection and entity disambiguation that models the surrounding context on different levels of granularity. | ||
| 2020.findings-emnlp.145 We use learnable neural modules and soft logic to handle linguistic variation and overcome sparse coverage; the modules are jointly ***** optimized ***** with the MRC model to improve final performance. | ||
| L14-1147 We propose a dependency-based pre-ordering model with parameters ***** optimized ***** using a reordering score to pre-order the source sentence. | ||
| 2020.coling-main.54 For the evaluation of our methods we built our own Chinese biomedical patents NER dataset, and our ***** optimized ***** model achieved an F1 score of 0.54. | ||
| trainable | 45 | |
| W19-4304 In contrast to related previous work, we demonstrate that the performance in translation does correlate with ***** trainable ***** downstream tasks. | ||
| 2021.acl-long.380 By making the risk function ***** trainable *****, we draw a connection between minimum risk training and latent variable model learning. | ||
| W19-8645 We suggest four extensions to that framework: (1) we introduce a ***** trainable ***** neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model's ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts; (4) we incorporate a simple but effective referring expression generation module. | ||
| 2021.wmt-1.114 After investigating the recent advances of ***** trainable ***** metrics, we conclude several aspects of vital importance to obtain a well-performed metric model by: 1) jointly leveraging the advantages of source-included model and reference-only model, 2) continuously pre-training the model with massive synthetic data pairs, and 3) fine-tuning the model with data denoising strategy. | ||
| P18-1256 Our attention mechanism augments a baseline recurrent neural network without the need for additional ***** trainable ***** parameters, minimizing the added computational cost of our mechanism. | ||
| markup | 45 | |
| 2021.emnlp-demo.4 A first evaluation shows that this strategy yields highly accurate ***** markup ***** in the translated documents that outperforms the ***** markup ***** quality found in documents translated with popular translation services. | ||
| 2021.emnlp-main.665 Alignments are useful for typological research, transferring formatting like ***** markup ***** to translated texts, and can be used in the decoding of machine translation systems. | ||
| L06-1227 In particular, the paper deals with ongoing work concerning encoding, visualization and processing of characters; current work on language and locale identification; and current work on internationalization of ***** markup *****. | ||
| L16-1654 For instance, its set of nonstandardized annotations, referred to as the wiki ***** markup *****, is language-dependent and needs specific parsers from language to language, for English, French, Italian, etc. | ||
| 2020.wmt-1.138 This paper compares two commonly used methods of representing ***** markup ***** tags and tests the ability of MT models to learn tag placement via training data augmentation. | ||
| modal | 45 | |
| W19-4008 Based on an analysis of the inter-annotator consistency, we argue that our tag set for the ***** modal ***** domain is efficient for our subject languages and might be useful for other languages and purposes. | ||
| 2021.dravidianlangtech-1.26 Moreover, in the view of multilingual models, this ***** modal ***** ranked 3rd and achieved favorable results and confirmed the model as the best among all systems submitted to these shared tasks in these three languages. | ||
| L10-1465 We also discuss some of linguistic resources (key factors for distinguishing facts from opinions, opinion lexicon, intensifier lexicon, pre-modifier lexicon, ***** modal ***** verb lexicon, reporting verb lexicon, general opinion patterns from the corpus etc.) | ||
| L12-1288 We validated the annotation scheme on a corpus sample of approximately 2000 sentences that we fully annotated with ***** modal ***** information using the MMAX2 annotation tool to produce XML annotation | ||
| 2016.lilt-14.2 Classical theories of discourse semantics, such as Discourse Representation Theory (DRT) and Dynamic Predicate Logic (DPL), predict that an indefinite noun phrase cannot serve as antecedent for an anaphor if the noun phrase is, but the anaphor is not, in the scope of a *****modal***** expression. | ||
| declarative | 45 | |
| 1997.iwpt-1.24 The framework of filters provides a ***** declarative ***** description of disambiguation methods independent of parsing. | ||
| W19-1103 The basic idea is to assign the same type to both ***** declarative ***** sentences and interrogative sentences, partly building on the recent proposal in Inquisitive Semantics. | ||
| 2020.emnlp-main.320 Specifically, we leverage the ***** declarative ***** knowledge expressed in both first-order logic and natural language. | ||
| L06-1205 We have automatically classified adverbs as either ***** declarative ***** or not ***** declarative ***** using a machine-learning method such as the maximum entropy method. | ||
| 2020.acl-main.69 We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules which transform ***** declarative ***** sentences into question-answer pairs. | ||
| Slavic | 45 | |
| R19-2010 The corpora under construction can be considered a crucial contribution to the linguistic research on the languages in the Balkans as they provide the lacking data needed for the studies of linguistic variation in the Balkan ***** Slavic *****, and enable the comparison of the said varieties with other neighbouring languages. | ||
| W17-1210 This paper deals with the development of morphosyntactic taggers for spoken varieties of the ***** Slavic ***** minority language Rusyn. | ||
| W19-3717 The paper presents a generic approach to the supervised sentiment analysis of social media content in *****Slavic***** languages. | ||
| W19-3709 We describe the Second Multilingual Named Entity Challenge in *****Slavic***** languages. | ||
| W17-1412 This paper describes the outcomes of the first challenge on multilingual named entity recognition that aimed at recognizing mentions of named entities in web documents in *****Slavic***** languages, their normalization/lemmatization, and cross-language matching. | ||
| ambiguous | 45 | |
| P18-1117 Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ***** ambiguous ***** cases and improve translation coherence. | ||
| 1997.iwpt-1.2 The main obstacle to the use of context-free grammars and parsing technology for RNA folding and other closely related problems is the following: suitable grammars are exponentially ***** ambiguous *****, and sentences to parse (i.e. RNA or DNA sequences) typically have more than 200 words, and sometimes more than 4000 words. | ||
| C18-1067 We also present an experiment with augmented test dataset and demon- strate it helps to understand the model's behavior on locally ***** ambiguous ***** points. | ||
| 2021.acl-long.65 In this paper, we ask several questions: What contexts do human translators use to resolve ***** ambiguous ***** words? | ||
| L16-1064 The hypothesis is that previous queries provide context that helps to solve ***** ambiguous ***** translations in the current query. | ||
| coreference annotation | 45 | |
| L16-1025 To ease the ***** coreference annotation ***** process, we built a semi-automatic Coreference Annotation Tool (CAT). | ||
| R19-2006 The paper presents several common approaches towards cross- and multi-lingual coreference resolution in a search of the most effective practices to be applied within the work on Bulgarian-English manual ***** coreference annotation ***** of a short story. | ||
| L06-1325 This paper describes a pilot project which developed a methodology for NP and event ***** coreference annotation ***** consisting of detailed annotation schemes and guidelines. | ||
| W19-3319 We propose a ***** coreference annotation ***** scheme as a layer on top of the Universal Conceptual Cognitive Annotation foundational layer, treating units in predicate-argument structure as a basis for entity and event mentions. | ||
| 2020.emnlp-demos.27 To enable cheaper and more efficient annotation, we present CoRefi, a web-based ***** coreference annotation ***** suite, oriented for crowdsourcing. | ||
| generating | 45 | |
| 2020.coling-main.220 We propose a hierarchical approach, by first ***** generating ***** video descriptions as sequences of simple sentences, followed at the next level by a more complex and fluent description in natural language. | ||
| 2021.emnlp-main.93 A recent work used this technique for modifying scene graphs (He et al. 2020), by first encoding the original graph and then ***** generating ***** the modified one based on this encoding. | ||
| 2021.emnlp-main.53 Training data for the classifiers is obtained using a 2-stage approach of first ***** generating ***** synthetic data using a combination of existing and new model-based approaches followed by a novel validation framework to filter and sort the synthetic data into acceptable and unacceptable classes. | ||
| 2020.acl-main.706 We use a two-step approach of first identifying the pivotal physical events in an environment and then ***** generating ***** natural language descriptions of those events using a data-to-text approach. | ||
| 2020.acl-main.351 We propose a novel framework for predicting utterance level labels directly from speech features, thus removing the dependency on first ***** generating ***** transcripts, and transcription free behavioral coding. | ||
| recommendation | 45 | |
| 2020.coling-main.369 Recently, the use of external knowledge in the form of knowledge graphs has shown to improve the performance in ***** recommendation ***** and dialogue systems. | ||
| 2021.hackashop-1.7 We reflect on the role of NLP in ***** recommendation ***** systems with this specific goal in mind and show that this theory of democracy helps to identify which NLP tasks and techniques can support this goal, and what work still needs to be done. | ||
| 2020.aacl-main.80 The ability to predict semantic place information from a tweet has applications in ***** recommendation ***** systems, personalization services and cultural geography. | ||
| 2020.emnlp-main.654 To better understand how humans make ***** recommendation *****s in communication, we design an annotation scheme related to ***** recommendation ***** strategies based on social science theories and annotate these dialogs. | ||
| C16-1202 Traditional ***** recommendation ***** algorithms, e.g. collaborative filtering and matrix completion, are not designed to exploit the key information hidden in the text comments, while existing opinion mining methods do not provide direct support to ***** recommendation ***** systems with useful features on users and items. | ||
| spelling correction | 45 | |
| 2020.lrec-1.835 The lack of large-scale datasets has been a major hindrance to the development of NLP tasks such as ***** spelling correction ***** and grammatical error correction (GEC). | ||
| W19-4407 We also develop a minimallysupervised context-aware approach to ***** spelling correction *****. | ||
| 2020.lrec-1.508 This publicly available resource is intended to support research on ***** spelling correction ***** and text normalization for Arabic dialects. | ||
| 2020.emnlp-main.383 Additional text normalization experiments and case studies show that TNT is a new potential approach to mis***** spelling correction *****. | ||
| W19-4411 We show that ***** spelling correction ***** can provide larger gains than character representations, and that ***** spelling correction ***** improves the performance of models with character representations. | ||
| similar languages | 45 | |
| D19-1076 An important motivation is to support lower resourced languages, however, most efforts focus on demonstrating the effectiveness of the techniques using embeddings derived from ***** similar languages ***** to English with large parallel content. | ||
| 2020.loresmt-1.4 We further noted that, although translation between ***** similar languages ***** is no cakewalk, linguistically distinct languages require more data to give better results. | ||
| 2021.naacl-main.16 While cross-lingual pretraining works for ***** similar languages ***** with abundant corpora, it performs poorly in low-resource and distant languages. | ||
| 2021.wmt-1.27 We investigate transfer learning based on pre-trained neural machine translation models to translate between (low-resource) ***** similar languages *****. | ||
| W18-3920 In this paper we present a system based on SVM ensembles trained on characters and words to discriminate between five ***** similar languages ***** of the Indo-Aryan family: Hindi, Braj Bhasha, Awadhi, Bhojpuri, and Magahi. | ||
| project | 45 | |
| L12-1283 This work is part of a ***** project ***** for MWE extraction and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| Q14-1005 We propose a new method that ***** project *****s model expectations rather than labels, which facilitates transfer of model uncertainty across language boundaries. | ||
| C16-1095 In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, ***** project *****ing information across closely related languages, and utilizing human linguist judgments. | ||
| L08-1399 ANNALIST has been used extensively for evaluation tasks within the VIKEF (EU FP6) and CLEF (UK MRC) ***** project *****s. | ||
| P19-3028 consists in ***** project *****ing them in two-dimensional planes without any interpretable semantics associated to the axes of the ***** project *****ion, which makes detailed analyses and comparison among multiple sets of embeddings challenging. | ||
| contextual information | 45 | |
| N19-1037 Moreover, we promote the framework to two variants, Hi-GRU with individual features fusion (HiGRU-f) and HiGRU with self-attention and features fusion (HiGRU-sf), so that the word/utterance-level individual inputs and the long-range ***** contextual information ***** can be sufficiently utilized. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by deep neural networks (DNN) that can utilize the ***** contextual information ***** to encode the input into latent variables, and a decoder which is a generative model able to reconstruct the input. | ||
| 2020.lrec-1.862 Using semantic and ***** contextual information *****, non-speakers of a language familiar with the Latin script can produce high quality named entity annotations to support construction of a name tagger. | ||
| L08-1159 Yet, building such models requires appropriate definition of various levels for representing the emotions themselves but also some ***** contextual information ***** such as the events that elicit these emotions. | ||
| 2020.wat-1.19 Unlike sentence-level MT, which translates the sentences independently, document-level MT aims to utilize ***** contextual information ***** while translating a given source sentence. | ||
| publications | 45 | |
| D19-1236 The review and selection process for scientific paper publication is essential for the quality of scholarly ***** publications ***** in a scientific field. | ||
| 2021.semeval-1.44 There is currently a gap between the natural language expression of scholarly ***** publications ***** and their structured semantic content modeling to enable intelligent content search. | ||
| W18-5618 Rapidly expanding volume of ***** publications ***** in the biomedical domain makes it increasingly difficult for a timely evaluation of the latest literature. | ||
| L12-1476 In this paper, we describe the online repository that we have created as a one-stop resource for obtaining NLG task materials, both from Generation Challenges tasks and from other sources, where the set of materials provided for each task consists of (i) task definition, (ii) input and output data, (iii) evaluation software, (iv) documentation, and (v) ***** publications ***** reporting previous results. | ||
| W16-5115 In cases where such information is not available, identifying the authorship of ***** publications ***** becomes very challenging. | ||
| automatic extraction | 45 | |
| 2021.latechclfl-1.6 We demonstrate the usefulness of our data set by providing baseline experiments for the ***** automatic extraction ***** of character networks, applying a rule-based pipeline as well as a neural approach, and find the neural approach outperforming the rule-approach in most evaluation settings. | ||
| R17-1005 The task has powerful applications, such as the detection of fake news or the ***** automatic extraction ***** of attitudes toward entities or events in the media. | ||
| W18-5035 In this paper we have proposed a linguistically informed recursive neural network architecture for ***** automatic extraction ***** of cause-effect relations from text. | ||
| L10-1006 This paper presents and analyzes an annotated corpus of definitions, created to train an algorithm for the ***** automatic extraction ***** of definitions and hypernyms from web documents. | ||
| L14-1690 The growing investment on *****automatic extraction***** procedures, together with the need for extensive resources, makes semi-automatic construction a new viable and efficient strategy for developing of language resources, combining accuracy, size, coverage and applicability. | ||
| textual data | 45 | |
| 2021.sigdial-1.54 Automatic summarization aims to extract important information from large amounts of ***** textual data ***** in order to create a shorter version of the original texts while preserving its information. | ||
| P17-1032 Kernel methods enable the direct usage of structured representations of ***** textual data ***** during language learning and inference tasks. | ||
| 2020.acl-main.570 Most of the recent PPI tasks in BioNLP domain have been carried out solely using ***** textual data *****. | ||
| L12-1368 Our main idea is to provide a unified and dynamic way of annotating ***** textual data *****. | ||
| 2021.naacl-main.314 Most of privacy protection studies for ***** textual data ***** focus on removing explicit sensitive identifiers. | ||
| biomedical domain | 45 | |
| D19-6203 The experimental results suggest that dependency-based pooling is the best pooling strategy for RE in the ***** biomedical domain *****, yielding the state-of-the-art performance on two benchmark datasets for this problem. | ||
| P19-2008 However, there are particular challenges in extending these open-domain techniques to extend into the ***** biomedical domain *****. | ||
| W18-5618 Rapidly expanding volume of publications in the ***** biomedical domain ***** makes it increasingly difficult for a timely evaluation of the latest literature. | ||
| W17-2507 Even though large collections are available for certain domains and language pairs, these are still scarce in the ***** biomedical domain *****. | ||
| 2020.emnlp-main.379 There has been an influx of biomedical domain-specific language models, showing language models pre-trained on biomedical text perform better on *****biomedical domain***** benchmarks than those trained on general domain text corpora such as Wikipedia and Books. | ||
| semantic change | 45 | |
| D19-1272 We propose to use masking (replacement) rate threshold as an adjustable parameter to control the amount of ***** semantic change ***** in the text. | ||
| L10-1448 Two types of ***** semantic change ***** are amelioration and pejoration; in these processes a word sense changes to become more positive or negative, respectively. | ||
| L16-1379 Serving as a rich reference for new and existing databases in diachronic and synchronic linguistics, it allows researchers a quick access to studies on ***** semantic change *****, cross-linguistic polysemies, and semantic associations. | ||
| 2020.emnlp-main.682 Through extensive experimentation under various settings with synthetic and real data we showcase the importance of sequential modelling of word vectors through time for ***** semantic change ***** detection. | ||
| W19-4704 The paper focuses on diachronic evaluation of ***** semantic change *****s of harm-related concepts in psychology. | ||
| hate speech classification | 45 | |
| 2021.naacl-main.183 In this work, we propose lifelong learning of ***** hate speech classification ***** on social media. | ||
| 2020.alw-1.22 The proposed method and collected insights can contribute to developing fairer and more reliable ***** hate speech classification ***** models. | ||
| I17-1078 To address various limitations of supervised ***** hate speech classification ***** methods including corpus bias and huge cost of annotation, we propose a weakly supervised two-path bootstrapping approach for an online hate speech detection model leveraging large-scale unlabeled data. | ||
| D18-1391 In this paper, we propose a novel method on a fine-grained ***** hate speech classification ***** task, which focuses on differentiating among 40 hate groups of 13 different hate group categories. | ||
| W18-5110 The paper investigates the potential effects user features have on ***** hate speech classification *****. | ||
| story cloze test | 45 | |
| K17-1019 We further demonstrate that our joint model can be applied to ***** story cloze test ***** and shallow discourse parsing tasks with improved performance and that each semantic aspect contributes to the model. | ||
| N18-2015 In the *****Story Cloze Test*****, a system is presented with a 4-sentence prompt to a story, and must determine which one of two potential endings is the `right' ending to the story. | ||
| W17-0908 The *****Story Cloze test***** is a recent effort in providing a common test scenario for text understanding systems. | ||
| P18-2119 The *****Story Cloze Test***** (SCT) is a recent framework for evaluating story comprehension and script learning. | ||
| 2020.emnlp-main.247 We conduct experiments on the ROCStories, a dataset of *****Story Cloze Test***** (SCT), and CosmosQA, a dataset of multiple choice. | ||
| hypernym discovery | 45 | |
| P19-1327 Even with a simple methodology for each individual system, utilizing a hybrid approach establishes new state-of-the-art results on two domain-specific English ***** hypernym discovery ***** tasks and outperforms all non-hybrid approaches in a general English ***** hypernym discovery ***** task. | ||
| S18-1150 In this paper, we present our proposed system (EXPR) to participate in the ***** hypernym discovery ***** task of SemEval 2018. | ||
| S18-1116 This system exploits a combination of supervised projection learning and unsupervised pattern-based ***** hypernym discovery *****. | ||
| S18-1147 This paper describes a *****hypernym discovery***** system for our participation in the SemEval-2018 Task 9, which aims to discover the best (set of) candidate hypernyms for input concepts or entities, given the search space of a pre-defined vocabulary. | ||
| S18-1152 This paper describes 300-sparsians's participation in SemEval-2018 Task 9: *****Hypernym Discovery*****, with a system based on sparse coding and a formal concept hierarchy obtained from word embeddings. | ||
| proposition bank | 45 | |
| L08-1461 In this paper, we present the details of creating a pilot Arabic ***** proposition bank ***** (Propbank). | ||
| C16-1096 In addition, it allows the generation of ***** proposition bank *****s upon which semantic parsers can be trained. | ||
| 2020.lrec-1.734 This paper presents a ***** proposition bank ***** for Russian (RuPB), a resource for semantic role labeling (SRL). | ||
| L16-1606 This paper describes the procedure of semantic role labeling and the development of the first manually annotated Persian *****Proposition Bank***** (PerPB) which added a layer of predicate-argument information to the syntactic structures of Persian Dependency Treebank (known as PerDT). | ||
| 2020.coling-main.266 We conduct experiments on two widely-used benchmark datasets, i.e., Chinese *****Proposition Bank***** 1.0 and English CoNLL-2005 dataset. | ||
| Humanities | 44 | |
| L14-1256 Computational Narratology is an emerging field within the Digital ***** Humanities *****. | ||
| 2020.ai4hi-1.5 Semantic enrichment of historical images to build interactive AI systems for the Digital ***** Humanities ***** domain has recently gained significant attention. | ||
| C16-1262 The overall low reliability we observe, nevertheless, casts doubt on the suitability of word neighborhoods in embedding spaces as a basis for qualitative conclusions on synchronic and diachronic lexico-semantic matters, an issue currently high up in the agenda of Digital ***** Humanities *****. | ||
| 2020.latechclfl-1.3 Entity recognition provides semantic access to ancient materials in the Digital ***** Humanities *****: it exposes people and places of interest in texts that cannot be read exhaustively, facilitates linking resources and can provide a window into text contents, even for texts with no translations. | ||
| W17-8104 Current approaches in Digital ***** Humanities ***** tend to ignore a central aspect of any hermeneutic introspection: the intrinsic vagueness of analyzed texts. | ||
| METEOR | 44 | |
| 2009.iwslt-evaluation.2 The results are evaluated based on BLEU and ***** METEOR ***** scores. | ||
| W16-4207 Extensive experiments on a large curated clinical paraphrase corpus show that the attention-based NCPG models achieve improvements of up to 5.2 BLEU points and 0.5 ***** METEOR ***** points over a non-attention based strong baseline for word-level modeling, whereas further gains of up to 6.1 BLEU points and 1.3 ***** METEOR ***** points are obtained by the character-level NCPG models over their word-level counterparts. | ||
| C16-1110 The augmented versions of ***** METEOR *****, using vector representations, are made available on our Github page. | ||
| P19-1255 Automatic evaluation on a large-scale dataset collected from Reddit shows that our model yields significantly higher BLEU, ROUGE, and ***** METEOR ***** scores than the state-of-the-art and non-trivial comparisons. | ||
| 2008.iwslt-evaluation.6 The system has shown competitive performance with respect to the BLEU and ***** METEOR ***** measures in Chinese-English Challenge and BTEC tasks | ||
| extractors | 44 | |
| Q16-1023 The BiLSTM is trained jointly with the parser objective, resulting in very effective feature ***** extractors ***** for parsing. | ||
| 2020.aacl-main.81 The local sentence-level event ***** extractors ***** often yield many noisy event role filler extractions in the absence of a broader view of the document-level context. | ||
| W17-5228 The system combines various independent feature ***** extractors *****, trains them on general regressors and finally combines the best performing models to create an ensemble. | ||
| W17-4606 This opens up new possibilities, as for many tasks currently addressed by human ***** extractors *****, raw input and output data are available, but not token-level labels. | ||
| S18-1039 We transfer the emotional knowledge by exploiting neural network models as feature ***** extractors ***** and use these representations for traditional machine learning models such as support vector regression (SVR) and logistic regression to solve the competition tasks | ||
| transformations | 44 | |
| W19-3306 We describe how ULF can be used to generate natural language inferences that are grounded in the semantic and syntactic structure through a small set of rules defined over interpretable predicates and ***** transformations ***** on ULFs. | ||
| L16-1247 As the rules enable to consider unbounded context, include lexical information and both flat and tree structure features at the same time, the method has proved to be reliable and flexible enough to handle most of ***** transformations *****. | ||
| P18-1080 We evaluate this technique on three different style ***** transformations *****: sentiment, gender and political slant. | ||
| 2021.wnut-1.46 We show that word replacement edits may be suboptimal and lead to explosion of rules for spelling, diacritization and errors in morphologically rich languages, and propose a method for generating character ***** transformations ***** from GEC corpus. | ||
| D17-1311 We then search for structured relationships among these aligned pairs to discover simple vector space ***** transformations ***** corresponding to negation, conjunction, and disjunction | ||
| Bidirectional Encoder Representations | 44 | |
| 2020.clinicalnlp-1.18 ***** Bidirectional Encoder Representations ***** from Transformers (BERT) models achieve state-of-the-art performance on a number of Natural Language Processing tasks. | ||
| D19-5020 The first two models use the uncased and cased versions of ***** Bidirectional Encoder Representations ***** from Transformers (BERT) (Devlin et al., 2018) while the third model uses Universal Sentence Encoder (USE) (Cer et al. 2018). | ||
| 2020.semeval-1.136 The ***** Bidirectional Encoder Representations ***** from Transformers (BERT) regressor is considered the primary pre-trained model in our approach, whereas Flair is the main NLP library. | ||
| 2021.nllp-1.22 ***** Bidirectional Encoder Representations ***** from Transformers (BERT) has achieved state-of-the-art performances on several text classification tasks, such as GLUE and sentiment analysis. | ||
| 2020.clinicalnlp-1.5 Relying on large pretrained language models such as ***** Bidirectional Encoder Representations ***** from Transformers (BERT) for encoding and adding a simple prediction layer has led to impressive performance in many clinical natural language processing (NLP) tasks | ||
| generalizable | 44 | |
| W18-5619 Our framework makes use of labeled, unlabeled, and social media data, operates on basic features, and is scalable and ***** generalizable *****. | ||
| P19-1253 We evaluate our method on a simulated dialog dataset and achieve state-of-the-art performance, which is ***** generalizable ***** to new tasks. | ||
| W19-3608 At the moment, the domain suffers from lack of reproducibility as well as a lack of consensus on ***** generalizable ***** techniques. | ||
| 2021.naacl-main.57 In this work, we introduce a novel and ***** generalizable ***** method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner. | ||
| 2021.acl-long.202 With the help of rich non-paired single-modal data, our model is able to learn more ***** generalizable ***** representations, by allowing textual knowledge and visual knowledge to enhance each other in the unified semantic space | ||
| transcripts | 44 | |
| L10-1247 The zone of prose includes oral texts and films ***** transcripts *****, in which stressed syllables are marked according to the real pronunciation. | ||
| W17-1416 This study draws from a larger corpus of speeches ***** transcripts ***** of the Lithuanian Parliament (1990-2013) to explore language differences of political debates by gender via stylometric analysis. | ||
| L12-1358 The LR includes audio files, ***** transcripts ***** in text format and text-to-speech alignment (accessible with WinPitch Pro software). | ||
| 2011.iwslt-evaluation.6 Second, only a very small amount of relevant parallel data (***** transcripts ***** of TED talks) is available. | ||
| W16-3913 Topic modelling techniques such as LDA have recently been applied to speech ***** transcripts ***** and OCR output | ||
| dropout | 44 | |
| 2021.naacl-main.302 Specifically, we propose an approach named UniDrop to unite three different ***** dropout ***** techniques from fine-grain to coarse-grain, i.e., feature ***** dropout *****, structure ***** dropout *****, and data ***** dropout *****. | ||
| D17-1260 AAD-DQN, as a data-driven student policy, provides (1) two separate experience memories for student and teacher, (2) an uncertainty estimated by ***** dropout ***** to control the timing of consultation and learning. | ||
| 2020.acl-srw.1 Specifically, we study attention spans, sparse, and structured ***** dropout ***** methods to help understand how their attention mechanism extends for vision and language tasks. | ||
| C16-1165 Differently from the widely adopted ***** dropout ***** method, which is applied to forward connections of feedforward architectures or RNNs, we propose to drop neurons directly in recurrent connections in a way that does not cause loss of long-term memory. | ||
| 2021.semeval-1.138 We tune hyperparameters of ***** dropout ***** rate, number of LSTM units, embedding size with 10 epochs and choose the best epoch with validation recall | ||
| PDTB | 44 | |
| 2020.lrec-1.133 In this paper, we provide inter-annotator agreement figures for the new annotations and compare corpus statistics based on the new annotations to the equivalent statistics extracted from the ***** PDTB *****. | ||
| L12-1533 While the proposed modifications were driven by the desire to introduce greater conceptual clarity in the ***** PDTB ***** scheme and to facilitate better annotation quality, our findings indicate that overall, some of the changes render the annotation task much more difficult for the annotators, as also reflected in lower inter-annotator agreement for the relevant sub-tasks. | ||
| L10-1230 The ***** PDTB ***** XML is developed as a unified format for the convenience of XQuery users; it integrates discourse relations and XML structures into one unified hierarchy and builds the cross references between the syntactic trees and the discourse relations. | ||
| L12-1132 We evaluated the resulting systems on the standard test set of the ***** PDTB ***** and achieved a rebalancing of precision and recall with improved F-measures across the board. | ||
| L10-1070 Following the D-LTAG approach to discourse, we have developed a lexically anchored description of attribution, considering this relation, contrary to the approach in the ***** PDTB *****, independently from other discourse relations | ||
| NIST | 44 | |
| L10-1114 This development has been in part due to the competitive technological Language Recognition Evaluations (LRE) organized by the National Institute of Standards and Technology (***** NIST *****). | ||
| 2020.lrec-1.816 The CMN2 corpus has been used in two ***** NIST ***** Speaker Recognition Evaluations (SRE18 and SRE19), and the SRE test sets as well as the full CMN2 corpus will be published in the Linguistic Data Consortium Catalog. | ||
| L06-1088 Under the Global Autonomous Language Exploitation (GALE) program, ***** NIST ***** is tasked with implementing an edit-distance-based evaluation of MT. | ||
| 2011.iwslt-evaluation.18 While performing similarly in terms of BLEU and ***** NIST ***** scores to the popular log-linear and linear interpolation techniques, filled-up translation models are more compact and easy to tune by minimum error training. | ||
| K19-1038 It is tested on a new dataset of student summaries, and historical ***** NIST ***** data from extractive summarizers | ||
| multilingual BERT | 44 | |
| D19-1374 Starting from a public ***** multilingual BERT ***** checkpoint, our final model is 6x smaller and 27x faster, and has higher accuracy than a state-of-the-art multilingual baseline. | ||
| 2020.lrec-1.335 We extend the analysis to contextual word embeddings and evaluate ***** multilingual BERT ***** on a named entity recognition task. | ||
| 2021.nodalida-main.2 The evaluation results show that the models based on EstBERT outperform ***** multilingual BERT ***** models on five tasks out of seven, providing further evidence towards a view that training language-specific BERT models are still useful, even when multilingual models are available. | ||
| 2020.findings-emnlp.389 To investigate to what extent these results also hold for a language other than English, we probe a Dutch BERT-based model and the ***** multilingual BERT ***** model for Dutch NLP tasks. | ||
| 2020.findings-emnlp.150 Multilingual contextual embeddings, such as ***** multilingual BERT ***** and XLM-RoBERTa, have proved useful for many multi-lingual tasks | ||
| quantify | 44 | |
| 2021.eval4nlp-1.3 We ***** quantify ***** stability by comparing the models' mistakes with Fleiss' Kappa (Fleiss, 1971) and overlap ratio scores. | ||
| W18-6307 However, metrics that ***** quantify ***** the overall translation quality are ill-equipped to measure gains from additional context. | ||
| L06-1251 In our approach, we ***** quantify ***** the degree of similarity of words between different domains by measuring the degree of overlap in their domain-specific semantic spaces. | ||
| 2020.findings-emnlp.7 We ***** quantify ***** sentiment bias by adopting individual and group fairness metrics from the fair machine learning literature, and demonstrate that large-scale models trained on two different corpora (news articles, and Wikipedia) exhibit considerable levels of bias. | ||
| 2020.sustainlp-1.19 We ***** quantify ***** the error of existing software-based energy estimations by using a hardware power meter that provides highly accurate energy measurements | ||
| orthogonal | 44 | |
| 2020.coling-main.528 A common mapping approach is using an ***** orthogonal ***** matrix. | ||
| 2020.coling-main.106 Word prisms learn ***** orthogonal ***** transformations to linearly combine the input source embeddings, which allows them to be very efficient at inference time. | ||
| 2021.sustainlp-1.8 Furthermore, our techniques are ***** orthogonal ***** to other methods focused on accelerating transformer inference, and thus can be combined for even greater efficiency gains. | ||
| 2020.acl-main.744 These improvements are due to expressive input representations, which, at least at the surface, are ***** orthogonal ***** to knowledge-rich constrained decoding mechanisms that helped linear SRL models. | ||
| 2021.bea-1.5 Conducting experiments with different feature subsets, we show that the different linguistic dimensions contribute ***** orthogonal ***** information, each contributing towards the highest result achieved using all linguistic feature subsets | ||
| GPU | 44 | |
| D17-1208 We show that unfolding can already improve the runtime in practice since more work can be done on the ***** GPU *****. | ||
| 2021.acl-long.389 VisualSparta is capable of outperforming previous state-of-the-art scalable methods in MSCOCO and Flickr30K. We also show that it achieves substantial retrieving speed advantages, i.e., for a 1 million image index, VisualSparta using CPU gets ~391X speedup compared to CPU vector search and ~5.4X speedup compared to vector search with ***** GPU ***** acceleration. | ||
| P19-1626 We present two new ***** GPU ***** algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector) and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). | ||
| W18-2716 We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on ***** GPU ***** and CPU | ||
| 2020.emnlp-main.366 We propose an efficient batching strategy for variable-length decoding on *****GPU***** architectures. | ||
| finite | 44 | |
| 2013.iwslt-evaluation.18 For the ASR task, using Kaldi toolkit, we developed the system based on weighted ***** finite ***** state transducer. | ||
| C18-1053 We empirically evaluate the transliteration task using the traditional weighted ***** finite ***** state transducer (WFST) approach against two neural approaches: the encoder-decoder recurrent neural network method and the recent, non-sequential Transformer method. | ||
| D18-1533 One approach is implemented, end-to-end, using ***** finite ***** state transducers (FSTs). | ||
| 2020.lrec-1.783 Although it is based on a simple hand-written ***** finite ***** state grammar, it is also able to annotate sentences that deviate from this grammar. | ||
| I17-4013 However, ***** finite ***** lexicon resources make it difficult to effectively and automatically distinguish between various types of sentiment information in Chinese texts | ||
| Spelling | 44 | |
| 2020.acl-main.81 Chinese ***** Spelling ***** Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. | ||
| W17-5908 *****Spelling***** errors occur frequently in educational settings, but their influence on automatic scoring is largely unknown. | ||
| 1995.iwpt-1.24 *****Spelling***** correction using a recognizer constructed from a large German word list that simulates compounding, also indicates that the approach is applicable in such cases. | ||
| P18-3021 *****Spelling***** correction is a well-known task in Natural Language Processing (NLP). | ||
| E17-4002 *****Spelling***** variation in non-standard language, e.g. | ||
| monolingual corpus | 44 | |
| 2021.emnlp-main.268 We propose a cross-lingual data selection method to extract in-domain sentences in the missing language side from a large generic ***** monolingual corpus *****. | ||
| N18-1057 In this paper, we consider synthesizing parallel data by noising a clean ***** monolingual corpus *****. | ||
| 2020.emnlp-main.208 In this way, CSP is able to pre-train the NMT model by explicitly making the most of the alignment information extracted from the source and target ***** monolingual corpus *****. | ||
| 2006.amta-papers.3 Decoding requires a very large target-language-only corpus, and while substitution in target can be performed using that same corpus, substitution in source requires a separate (and smaller) source ***** monolingual corpus *****. | ||
| C18-1305 An SLU corpus is a ***** monolingual corpus ***** with domain/intent/slot labels | ||
| referring | 44 | |
| D18-1466 As entities move from hearer-new (first introduction to the NYT audience) to hearer-old (common knowledge) status, we show empirically that the ***** referring ***** expressions along this trajectory depend on the type of the entity, and exhibit linguistic properties related to becoming common knowledge (e.g., shorter length, less use of appositives, more definiteness). | ||
| W19-8603 This paper describes and discusses the results of an empirical study on the production of ***** referring ***** expressions in visual fields with different object configurations of varying complexity and different contextual premises for using a ***** referring ***** expression. | ||
| 2020.bionlp-1.2 Novel contexts, comprising a set of terms ***** referring ***** to one or more concepts, may often arise in complex querying scenarios such as in evidence-based medicine (EBM) involving biomedical literature. | ||
| W18-6540 A ***** referring ***** expression generation algorithm adapted to this task needs to continuously monitor the changes in the field of view of the observer, his relative position to the people being described, and the relative position of these people to any landmarks around them, and to take these changes into account in the ***** referring ***** expressions generated | ||
| 2020.signlang-1.3 The utterance unit is an original concept for segmenting and annotating sign language dialogue ***** referring ***** to signer's native sense from the perspectives of Conversation Analysis (CA) and Interaction Studies. | ||
| synthetic | 44 | |
| 2020.wosp-1.4 We find that both ***** synthetic ***** and organic reference strings are equally suited for training Grobid (F1 = 0.74). | ||
| N18-1076 We show that our model has superior performance on both ***** synthetic ***** and natural data. | ||
| 2020.wmt-1.12 We attempted to develop new methods for both ***** synthetic ***** data filtering and reranking. | ||
| 2020.textgraphs-1.4 Plugging the resulting graph and representation into existing graph-based semi-supervised learn- ing algorithms like label spreading and graph convolutional networks, we show that our approach outperforms standard graph construction methods on both ***** synthetic ***** data and real datasets. | ||
| N19-1295 For evaluation, the proposed method is applied to both ***** synthetic ***** and real data, including two labelling tasks: text classification and textual entailment | ||
| spelling | 44 | |
| W17-4116 In this way, this ***** spelling ***** corrector is being developed based on two steps: an automatic rule-based syllabification method and a character-level graph to detect the degree of error in a misspelled word. | ||
| R19-1090 Building representative linguistic resources and NLP tools for non-standardized languages is challenging: when ***** spelling ***** is not determined by a norm, multiple written forms can be encountered for a given word, inducing a large proportion of out-of-vocabulary words. | ||
| W17-5908 Our main finding is that scoring methods using both token and character n-gram features are robust against ***** spelling ***** errors up to the error frequency in ASAP. | ||
| L10-1303 We believe this to be the first ***** spelling ***** correction system designed for a spoken, colloquial dialect of Arabic. | ||
| 2021.conll-1.22 Our methods also improve existing spell checkers by fixing not only more tokenization errors but also more ***** spelling ***** errors: once it is clear which characters form a word, it is much easier for them to figure out the correct word | ||
| emotion detection | 44 | |
| 2021.naacl-main.230 Here, we present work exploring the use of a semantically related task, ***** emotion detection *****, for equally competent but more explainable and human-like psychological stress detection as compared to a black-box model. | ||
| S19-2048 In this paper we present our model on the task of ***** emotion detection ***** in textual conversations in SemEval-2019. | ||
| 2020.emnlp-main.291 As an important research issue in the natural language processing community, multi-label ***** emotion detection ***** has been drawing more and more attention in the last few years. | ||
| 2021.acl-long.184 The existing studies working on ***** emotion detection ***** usually focus on how to improve the performance of model prediction, in which emotions are represented with one-hot vectors. | ||
| S19-2061 This system cooperates with an ***** emotion detection ***** neural network method (Poria et al., 2017), emoji2vec (Eisner et al., 2016) embedding, word2vec embedding (Mikolov et al., 2013), and our proposed emoticon and emoji preprocessing method. | ||
| product reviews | 44 | |
| D17-1142 We observe that the evidence-conclusion discourse relations, also known as arguments, often appear in ***** product reviews *****, and we hypothesise that some argument-based features, e.g. | ||
| D18-1403 We present a neural framework for opinion summarization from online ***** product reviews ***** which is knowledge-lean and only requires light supervision (e.g., in the form of product domain labels and user-provided ratings). | ||
| E17-1059 The proposed model is trained end-to-end to maximize the likelihood of target ***** product reviews ***** given the attributes. | ||
| 2020.aespen-1.4 This study evaluates the robustness of two state-of-the-art deep contextual language representations, ELMo and DistilBERT, on supervised learning of binary protest news classification (PC) and sentiment analysis (SA) of ***** product reviews *****. | ||
| 2020.ecnlp-1.11 Some of these aspects can be extracted from the ***** product reviews *****. | ||
| errors | 44 | |
| 2020.coling-main.554 Our best model, based on LSTMs, outperforms state-of-the-art results and achieves mean absolute ***** errors ***** of 1.86 and 2.28, at sentence and text levels, respectively. | ||
| 2021.wassa-1.26 We explicitly examine the impact of transcription ***** errors ***** on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and personality detection. | ||
| 2020.emnlp-main.446 One setting of particular interest is machine translation (MT), where models have high commercial value and ***** errors ***** can be costly. | ||
| 2020.eamt-1.14 The resulting annotation shows that, compared to the best recurrent system, the best Transformer system results in a 31% reduction of the total number of ***** errors ***** and it produced significantly less ***** errors ***** in 10 out of 22 error categories. | ||
| 2016.iwslt-1.25 Our main finding is that translating conversation transcripts turned out to not be as challenging as we expected: while translation quality is of course not perfect, a straightforward phrase-based system trained on movie subtitles yields high BLEU scores (high 40s on the development set) and manual analysis of 100 examples showed that 61 of them were correctly translated, and *****errors***** were mostly local disfluencies in the remaining examples. | ||
| data set | 44 | |
| 2021.nlp4posimpact-1.14 We also release the first publicly available ***** data set ***** at the intersection of geopolitical relations and a raging pandemic in the context of India and Pakistan. | ||
| N18-1126 We evaluate its lemmatization accuracy across 20 languages in both a full ***** data set *****ting and a lower-resource setting with 10k training examples in each language. | ||
| W16-4502 Experiments were conducted by comparing perplexity and BLEU scores on common test cases using the same training ***** data set *****. | ||
| L14-1646 The ECB corpus is one of the ***** data set *****s used for evaluation of the task of event coreference resolution. | ||
| P19-1144 Experiments have shown that our model outperformed several strong baseline models on different ***** data set *****s. | ||
| human language technology | 44 | |
| L06-1464 GALE's goals require a quantum leap in the performance of ***** human language technology *****, while also demanding solutions that are more intelligent, more robust, more adaptable, more efficient and more integrated. | ||
| L10-1177 For some years now, the Nederlandse Taalunie (Dutch Language Union) has been active in promoting the development of ***** human language technology ***** (HLT) applications for users of Dutch with communication disabilities. | ||
| 2020.lt4gov-1.5 Therefore, the Austrian Language Resource Portal stresses the importance of language resources specific to a language variety, thus paving the way for the re-use of variety-specific language data for ***** human language technology *****, such as machine translation training, for the Austrian standard variety. | ||
| 2020.stoc-1.1 We deploy a pipeline of ***** human language technology *****, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. | ||
| L16-1255 A survey of the twelve cycles to date ― two awards each in the Fall and Spring semesters from Fall 2010 through Spring 2016 ― yields an interesting view into graduate program research trends in ***** human language technology ***** and related fields and the particular data sets deemed important to support that research. | ||
| clinical text | 44 | |
| 2020.multilingualbio-1.3 Filtering against a large collection of domain terminologies and corpora drastically reduces the size of the vocabulary in favour of more realistic terms or terms that can reasonably be expected to match ***** clinical text ***** passages within a text-mining pipeline. | ||
| 2021.louhi-1.2 Negation scope resolution is key to high-quality information extraction from ***** clinical text *****s, but so far, efforts to make encoders used for information extraction negation-aware have been limited to English. | ||
| 2020.lrec-1.561 A popular application for that purpose is named entity recognition (NER), but the annotation policies of existing clinical corpora have not been standardized across ***** clinical text *****s of different types. | ||
| 2020.clinicalnlp-1.8 However, recent advanced neural architectures with flat convolutions or multi-channel feature concatenation ignore the sequential causal constraint within a text sequence and may not learn meaningful ***** clinical text ***** representations, especially for lengthy clinical notes with long-term sequential dependency. | ||
| W18-5607 We explore to adapt the tree-based LSTM-RNN model proposed by Miwa and Bansal (2016) to temporal relation extraction from ***** clinical text *****, obtaining a five point improvement over the best 2016 Clinical TempEval system and two points over the state-of-the-art. | ||
| knowledge representation | 44 | |
| L06-1324 Sentences written in this language unambiguously map into a number of ***** knowledge representation ***** formats including OWL and RDF-S to allow round-trip ontology management. | ||
| W19-0604 We argue that using fuzzy sets for modeling meaning of words and other natural language constructs, along with situations described with natural language is interesting both from purely linguistic perspective, and also as a ***** knowledge representation ***** for problems of computational linguistics and natural language processing. | ||
| L10-1575 We propose applying standardized linguistic annotation to terms included in labels of ***** knowledge representation ***** schemes (taxonomies or ontologies), hypothesizing that this would help improving ontology-based semantic annotation of texts. | ||
| L16-1156 We show via corpus analysis that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain ***** knowledge representation *****, is a relevant approach. | ||
| W19-2404 In this paper, we advocate the use of Message Sequence Chart (MSC) as a ***** knowledge representation ***** to capture and visualize multi-actor interactions and their temporal ordering. | ||
| automatic text | 44 | |
| L06-1110 We introduce the problem of ***** automatic text *****ual anonymisation and present a new publicly-available, pseudonymised benchmark corpus of personal email text for the task, dubbed ITAC (Informal Text Anonymisation Corpus). | ||
| L04-1155 In the field of biomedicine, there is a critical need for ***** automatic text ***** processing. | ||
| I17-2033 An ***** automatic text ***** summarization system can automatically generate a short and brief summary that contains a main concept of an original document. | ||
| R19-1131 We use the state-of-the-art ***** automatic text ***** simplification (ATS) system for lexically and syntactically simplifying source sentences, which are then translated with two state-of-the-art English-to-Serbian MT systems, the phrase-based MT (PBMT) and the neural MT (NMT). | ||
| 2018.jeptalnrecital-court.34 Lexical complexity detection is an important step for ***** automatic text ***** simplification which serves to make informed lexical substitutions. | ||
| recursive neural network | 44 | |
| W18-5035 In this paper we have proposed a linguistically informed ***** recursive neural network ***** architecture for automatic extraction of cause-effect relations from text. | ||
| C16-1289 In order to accurately learn the semantic hierarchy of a bilingual phrase, we develop a ***** recursive neural network ***** to constrain the learned bilingual phrase structures to be consistent with word alignments. | ||
| E17-1002 We implemented and evaluated a binary tree model of NTI, showing the model achieved the state-of-the-art performance on three different NLP tasks: natural language inference, answer sentence selection, and sentence classification, outperforming state-of-the-art recurrent and ***** recursive neural network *****s. | ||
| P17-2075 In this work, we employ ***** recursive neural network *****s to break down these independence assumptions to obtain inference about demographic characteristics on Twitter. | ||
| P18-3004 In this paper, we propose a preordering method with ***** recursive neural network *****s that learn features from raw inputs. | ||
| Neural Machine Translation (NMT) | 44 | |
| W19-5201 Despite their original goal to jointly learn to align and translate, *****Neural Machine Translation (NMT)***** models, especially Transformer, are often perceived as not learning interpretable word alignments. | ||
| D19-5543 *****Neural Machine Translation (NMT)***** models have been proved strong when translating clean texts, but they are very sensitive to noise in the input. | ||
| 2020.acl-srw.38 In this paper, we propose a method of re-ranking the outputs of *****Neural Machine Translation (NMT)***** systems. | ||
| 2020.ngt-1.5 We present META-MT, a meta-learning approach to adapt *****Neural Machine Translation (NMT)***** systems in a few-shot setting. | ||
| D19-5619 *****Neural Machine Translation (NMT)***** models generally perform translation using a fixed-size lexical vocabulary, which is an important bottleneck on their generalization capability and overall translation quality. | ||
| aligner | 43 | |
| L14-1407 This allowed us to test, recordings duration, recordings material, the performance of our automatic ***** aligner ***** software. | ||
| L10-1092 In our experiments we can show that our tree ***** aligner ***** produces results with high quality and outperforms unsupervised techniques proposed otherwise. | ||
| 2021.nodalida-main.36 However, most forced ***** aligner *****s are language-dependent, and under-resourced languages rarely have enough resources to train an acoustic model for an ***** aligner *****. | ||
| L06-1272 The ***** aligner ***** uses a Support Vector Machine classifier to discriminate between positive and negative examples of sentence pairs. | ||
| L16-1354 When plugged into Bleualign, a state-of-the-art sentence ***** aligner *****, our function improves both precision and recall of alignments over the originally proposed BLEU score | ||
| realization | 43 | |
| L06-1388 If a sort of new paradigm for language resource sharing is required, we think that the emerging and still evolving technology connected to Grid computing is a very interesting and suitable one for a concrete ***** realization ***** of this vision. | ||
| N19-1236 We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on ***** realization *****. | ||
| C16-1144 Our experiments show that this can filter out large fractions of infeasible edges in, and thus benefit the performance of, complex ***** realization ***** processes. | ||
| 2020.acl-main.665 The Surface Realization Shared Tasks of 2018 and 2019 were Natural Language Generation shared tasks with the goal of exploring approaches to surface ***** realization ***** from Universal-Dependency-like trees to surface strings for several languages | ||
| 2005.mtsummit-papers.15 In this paper we describe and evaluate different statistical models for the task of *****realization***** ranking, i.e. | ||
| SemEval 2021 | 43 | |
| 2021.semeval-1.161 In this paper, we describe our system used for ***** SemEval 2021 ***** Task 7: HaHackathon: Detecting and Rating Humor and Offense. | ||
| 2021.semeval-1.182 This paper describes our approach for Task 9 of ***** SemEval 2021 *****: | ||
| 2021.semeval-1.66 This paper describes our contribution to ***** SemEval 2021 ***** Task 1 (Shardlow et al., 2021): Lexical Complexity Prediction. | ||
| 2021.semeval-1.47 Our experiments on the recent ***** SemEval 2021 ***** Task 8 datasets reveal the effectiveness of the proposed model. | ||
| 2021.semeval-1.78 We describe the UTFPR systems submitted to the Lexical Complexity Prediction shared task of ***** SemEval 2021 ***** | ||
| hierarchically | 43 | |
| P19-1295 By default, the hidden states of each word are ***** hierarchically ***** calculated by attending to all words in the sentence, which assembles global information. | ||
| D18-1408 To this end, we propose Phrase-level Self-Attention Networks (PSAN) that perform self-attention across words inside a phrase to capture context dependencies at the phrase level, and use the gated memory updating mechanism to refine each word's representation ***** hierarchically ***** with longer-term context dependencies captured in a larger phrase. | ||
| W19-5207 The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained ***** hierarchically ***** at the document-level are more effective on noisy data. | ||
| D17-3001 The tutorial teaches the audience about definitions, assumptions and practical choices related to modeling and representing IsA relations in existing, human-compiled resources of instances, concepts and resulting conceptual hierarchies; methods for automatically extracting sets of instances within unlabeled or labeled concepts, where the concepts may be considered as a flat set or organized ***** hierarchically *****; and applications of IsA relations in information retrieval. | ||
| 2021.emnlp-main.285 In this paper, to alleviate this problem, we propose leveraging capsule routing to associate knowledge with medical literature ***** hierarchically ***** (called HiCapsRKL) | ||
| interlocutors | 43 | |
| C16-1070 In order to find such a coordination, we investigated 1) lexical similarities between the speakers in each emotional segments, 2) correlation between the ***** interlocutors ***** using psycholinguistic features, such as linguistic styles, psychological process, personal concerns among others, and 3) relation of ***** interlocutors ***** turn-taking behaviors such as competitiveness. | ||
| 2020.sigdial-1.23 We employ LSTM-based encoders that capture self and inter-speaker dependency of ***** interlocutors ***** to generate contextualized utterance representations which are fed into the CRF layer. | ||
| K19-1067 Unfortunately, complex interactions among the ***** interlocutors *****' roles make it challenging to precisely capture conversational contexts and ***** interlocutors *****' information. | ||
| 2020.law-1.17 This study develops the strand of research on topic transitions in social talk which aims to gain a better understanding of ***** interlocutors *****' conversational goals. | ||
| L12-1024 The paper discusses mechanisms for topic management in conversations, concentrating on interactions where the ***** interlocutors ***** react to each other's presentation of new information and construct a shared context in which to exchange information about interesting topics | ||
| smoothing | 43 | |
| P18-1195 Our experiments on two different tasks, image captioning and machine translation, show that token-level and sequence-level loss ***** smoothing ***** are complementary, and significantly improve results. | ||
| 2020.aacl-main.25 Theoretically, we derive and explain exactly what label ***** smoothing ***** is optimizing for. | ||
| N18-1085 We introduce neural particle ***** smoothing *****, a sequential Monte Carlo method for sampling annotations of an input string from a given probability model. | ||
| 2021.acl-long.272 Although existing approaches such as label ***** smoothing ***** can alleviate this issue, they fail to adapt to diverse dialog contexts. | ||
| W18-5712 We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label ***** smoothing ***** | ||
| prototypical | 43 | |
| 2020.emnlp-main.494 On ***** prototypical ***** language generation tasks such as translation and summarization, our method consistently outperforms other distillation algorithms, such as sequence-level knowledge distillation. | ||
| L08-1275 The present study applies the measurement of conceptual similarity to conceptual metaphor research by comparing concreteness of ontological resource nodes to several ***** prototypical ***** concrete nodes selected by human subjects. | ||
| 2021.metanlp-1.8 By incorporating SMLMT with ***** prototypical ***** networks, the meta learner generalizes better to unseen domains and gains higher accuracy on out-of-scope examples without the heavy lifting of pre-training. | ||
| P19-1277 Previous studies on this topic adopt ***** prototypical ***** networks, which calculate the embedding vector of a query instance and the prototype vector of the support set for each relation candidate independently. | ||
| 2020.lrec-1.366 In this paper we embrace a radically different paradigm that provides a slot-filler structure, called “semagram”, to define the meaning of words in terms of their ***** prototypical ***** semantic information | ||
| antecedent | 43 | |
| J18-2002 We show that taking sibling anaphors into account in a joint inference model improves ***** antecedent ***** selection performance. | ||
| N18-2108 Our approach uses the ***** antecedent ***** distribution from a span-ranking architecture as an attention mechanism to iteratively refine span representations. | ||
| D17-1018 It is trained to maximize the marginal likelihood of gold ***** antecedent ***** spans from coreference clusters and is factored to enable aggressive pruning of potential mentions. | ||
| 2020.emnlp-main.686 To make a comprehensive analysis, we implement an end-to-end coreference system as well as four HOI approaches, attended ***** antecedent *****, entity equalization, span clustering, and cluster merging, where the latter two are our original methods. | ||
| L10-1588 Additionally, possible features are revealed for their identification, and a search scope for the ***** antecedent ***** has been determined, increasing the chances of correct resolution | ||
| Definition | 43 | |
| L12-1448 We outline the linguistic representation (morphology and dependency syntax) for Finnish, and show how the resulting `Grammar ***** Definition ***** Corpus' and the documentation is used as a task specification for an external subcontractor for building a parser engine for use in morphological and dependency syntactic analysis of large volumes of Finnish for parsebanking purposes. | ||
| D19-1357 *****Definition***** modeling includes acquiring word embeddings from dictionary definitions and generating definitions of words. | ||
| 2021.ranlp-1.17 *****Definition***** modelling is the task of automatically generating a dictionary-style definition given a target word. | ||
| 2020.acl-main.65 *****Definition***** generation, which aims to automatically generate dictionary definitions for words, has recently been proposed to assist the construction of dictionaries and help people understand unfamiliar texts. | ||
| W19-4015 *****Definition***** extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. | ||
| empathetic | 43 | |
| 2020.coling-main.394 A humanized dialogue system is expected to generate ***** empathetic ***** replies, which should be sensitive to the users' expressed emotion. | ||
| P18-1104 Generating emotional language is a key step towards building ***** empathetic ***** natural language processing agents. | ||
| 2020.emnlp-main.531 Notably, our results show that persona improves ***** empathetic ***** responding more when CoBERT is trained on ***** empathetic ***** conversations than non-***** empathetic ***** ones, establishing an empirical link between persona and empathy in human conversations. | ||
| 2020.wanlp-1.6 The experiments showed success of our proposed empathy-driven Arabic chatbot in generating ***** empathetic ***** responses with a perplexity of 38.6, an empathy score of 3.7, and a fluency score of 3.92. | ||
| 2021.acl-long.440 Emotion recognition in conversation (ERC) is a crucial component in affective dialogue systems, which helps the system understand users' emotions and generate ***** empathetic ***** responses | ||
| evaluating | 43 | |
| W18-0501 We conclude that ***** evaluating ***** the model on a corpus of exemplar responses if one is available provides additional evidence about system validity; at the same time, investing effort into creating a corpus of exemplar responses for model training is unlikely to lead to a substantial gain in model performance. | ||
| W18-2325 We present a novel annotation task ***** evaluating ***** a patient's engagement with their health care regimen. | ||
| W16-4019 In our experiments we used a version of the Bible translated in four different languages, ***** evaluating ***** the precision of our semantic indexing pipeline and showing its reliability on the cross-lingual text retrieval task. | ||
| 2021.semeval-1.89 Text comprehension depends on the reader's ability to understand the words present in it; ***** evaluating ***** the lexical complexity of such texts can enable readers to find an appropriate text and systems to tailor a text to an audience's needs. | ||
| 2020.acl-main.55 Through experiments, we demonstrate that ***** evaluating ***** systems via response selection with the test set developed by our method correlates more strongly with human evaluation, compared with widely used automatic evaluation metrics such as BLEU | ||
| interaction | 43 | |
| 2021.hcinlp-1.13 In this research proposal, we aim to quantify the influence of each modality in ***** interaction ***** with various referential complexities. | ||
| W18-0621 In this work, we focus on how ***** interaction ***** with others in such a community affects the mental state of users who are seeking support. | ||
| L16-1565 The treebank is dynamic: by global reparsing at certain intervals it is kept compatible with the latest versions of the grammar and the lexicon, which are continually further developed in ***** interaction ***** with the annotators. | ||
| L14-1718 This article presents a corpus featuring adults playing games in ***** interaction ***** with machine trying to induce laugh. | ||
| 2020.sltu-1.22 We also considered the ***** interaction ***** of adjectives with other grammatical means, especially other part of speeches, e.g. | ||
| simultaneous | 43 | |
| 2021.mtsummit-asltrw.4 We argue that ***** simultaneous ***** translation for readable live subtitles still faces challenges, the main one being poor translation quality, and propose directions for steering future research. | ||
| 2021.autosimtrans-1.4 We propose a competitive ***** simultaneous ***** translation system and achieves a BLEU score of 24.39 in the audio input track. | ||
| N18-1164 Interleaved conversations lead to difficulties in not only following discussions but also retrieving relevant information from ***** simultaneous ***** messages. | ||
| 2020.ngt-1.16 In this paper, we propose a transfer learning based ***** simultaneous ***** translation model by extending BART. | ||
| 2020.aacl-main.58 We investigate how to adapt ***** simultaneous ***** text translation methods such as wait-k and monotonic multihead attention to end-to-end ***** simultaneous ***** speech translation by introducing a pre-decision module | ||
| explanation | 43 | |
| 2021.hackashop-1.11 We present AttViz, a method for exploration of self-attention in transformer networks, which can help in ***** explanation ***** and debugging of the trained models by showing associations between text tokens in an input sequence. | ||
| 2020.findings-emnlp.300 QUARTET constructs ***** explanation *****s from the sentences in the procedural text, achieving ~18 points better on ***** explanation ***** accuracy compared to several strong baselines on a recent process comprehension benchmark. | ||
| 2020.nl4xai-1.7 In this paper we aim to highlight the importance of a natural language approach to ***** explanation ***** and to discuss some of the previous and state of the art attempts of the textual ***** explanation ***** of Bayesian Networks. | ||
| W19-5938 Furthermore, we expect that this setting puts the focus on ***** explanation ***** as a linguistic act, vs. explainability as a property of models. | ||
| 2020.lrec-1.220 We then introduce NLP researchers to contemporary philosophy of science theories that allow robust yet non-causal reasoning in ***** explanation *****, giving computer scientists a vocabulary for future research | ||
| abstract | 43 | |
| 2020.findings-emnlp.238 To this end, we introduce a novel dataset allowing to explore different stance-based persona representations and their impact on claim generation, showing that they are able to grasp ***** abstract ***** and profound aspects of the author persona. | ||
| 2021.wmt-1.13 We conducted experiments on English-Hausa, Xhosa-Zulu and English-Basque, and submitted the results for Xhosa→Zulu in the News Translation Task, and English→Basque in the Biomedical Translation Task, ***** abstract ***** and terminology translation subtasks. | ||
| 2021.naacl-main.441 To improve the model generalization capability for rare and unseen schemas, we propose a new architecture, ShadowGNN, which processes schemas at ***** abstract ***** and semantic levels. | ||
| P18-1207 Multimodal affective computing, learning to recognize and interpret human affect and subjective information from multiple data sources, is still a challenge because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at ***** abstract ***** levels, ignoring time-dependent interactions between modalities. | ||
| 2012.amta-monomt.7 Here, as a feasibility test, we focus on depassivization in German and we ***** abstract ***** from surface forms to parts of speech | ||
| latent semantic analysis | 43 | |
| N18-1041 We show that naïve use of AMR in paraphrase detection is not necessarily useful, and turn to describe a technique based on ***** latent semantic analysis ***** in combination with AMR parsing that significantly advances state-of-the-art results in paraphrase detection for the Microsoft Research Paraphrase Corpus. | ||
| W16-4904 Experiments demonstrate that the proposed technique often outperforms other compositional distributional semantics approaches as well as vector space methods such as ***** latent semantic analysis *****. | ||
| D18-1212 The existing studies in cross-language information retrieval (CLIR) mostly rely on general text representation models (e.g., vector space model or ***** latent semantic analysis *****). | ||
| L14-1166 Comparison between kanji-based DSMs and word-based DSMs reveals that our kanji-based DSMs generally outperform ***** latent semantic analysis *****, and also surpasses the best score word-based DSM for infrequent words comprising only frequent kanji characters. | ||
| L12-1470 In addition, semantic word relatedness modeled by ***** latent semantic analysis ***** is also included. | ||
| large corpus | 43 | |
| 2020.emnlp-main.283 In this paper, we propose a simple method to provide annotations for most unambiguous words in a ***** large corpus *****. | ||
| W19-4727 Here, we provide a ***** large corpus ***** of German poetry which consists of about 75k poems with more than 11 million tokens, with poems ranging from the 16th to early 20th century. | ||
| 2021.emnlp-main.293 Information seeking is an essential step for open-domain question answering to efficiently gather evidence from a ***** large corpus *****. | ||
| W18-3013 The task we use is fine-grained name typing: given a ***** large corpus *****, find all types that a name can refer to based on the name embedding. | ||
| L06-1210 Data sparsity is a large problem in natural language processing that refers to the fact that language is a system of rare events, so varied and complex, that even using an extremely ***** large corpus *****, we can never accurately model all possible strings of words. | ||
| tools | 43 | |
| D19-1212 Multi-view learning algorithms are powerful representation learning ***** tools *****, often exploited in the context of multimodal problems. | ||
| L06-1343 In this paper, we describe the second release of a suite of language analysers, developed over the last five years, called wraetlic, which includes ***** tools ***** for several partial parsing tasks, both for English and Spanish. | ||
| 2020.lrec-1.818 Recent advances in neural speech synthesis have enabled the development of such systems with a data-driven approach that does not require significant development of language-specific ***** tools *****. | ||
| L16-1175 This is particularly a challenge for the creation of ***** tools ***** to support learning Arabic as a living language on the web, where authentic material can be found in both MSA and DA. | ||
| L06-1040 As chat language holds anomalous characteristics in forming words, phrases, and non-alphabetical characters, conventional natural language processing *****tools***** are ineffective to handle chat language text. | ||
| evidence | 43 | |
| 2021.naacl-main.258 We provide supportive ***** evidence ***** by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. | ||
| D19-1664 We first produce a new dataset, BASIL, of 300 news articles annotated with 1,727 bias spans and find ***** evidence ***** that informational bias appears in news articles more frequently than lexical bias. | ||
| D17-1142 We observe that the ***** evidence *****-conclusion discourse relations, also known as arguments, often appear in product reviews, and we hypothesise that some argument-based features, e.g. | ||
| W19-0506 At the same time, the studies reveal empirical ***** evidence ***** why contextual abstractness represents a valuable indicator for automatic non-literal language identification. | ||
| P18-2061 This study provides ***** evidence ***** that such features allow for better transfer across languages. | ||
| linguistic data | 43 | |
| L12-1467 The platform will facilitate new linguistic findings by making it possible to manage and analyse primary data and annotations in the petabyte range, while at the same time allowing an undistorted view of the primary ***** linguistic data *****, and thus fully satisfying the demands of a scientific tool. | ||
| W19-4819 Here we present a suite of experiments probing whether neural language models trained on ***** linguistic data ***** induce these stack-like data structures and deploy them while incrementally predicting words. | ||
| L14-1282 Collaboration is seen as an essential method for generating large amounts of ***** linguistic data *****, as well as for validating the data so it can be considered trustworthy. | ||
| W17-8105 The E-platform integrates: 1/ an environment for creating, organizing and maintaining electronic text archives, for extracting text corpora and aligning corpora; 2/ a ***** linguistic data *****base; 3/ a concordancer; 4/ a set of modules for the generation and editing of practice exercises for each text or corpus; 5/ functionalities for export from the platform and import to other educational platforms. | ||
| 2020.lrec-1.694 In this paper we propose an approach to validate the retrieved data based on four axioms that rely on two linguistic theories: the x-bar theory and the multidimensional theory of terminology. The validation process is supported by a second knowledge base specialised in ***** linguistic data *****; in this case, CONCEPTNET. | ||
| document classification | 43 | |
| D19-1077 This paper explores the broader cross-lingual potential of mBERT (multilingual) as a zero shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, ***** document classification *****, NER, POS tagging, and dependency parsing. | ||
| 2020.findings-emnlp.152 Our work thus focuses on optimizing the computational cost of fine-tuning for ***** document classification *****. | ||
| 2021.emnlp-main.253 Therefore, in this paper, we propose a novel neural network based approach for multi-label ***** document classification *****, in which two heterogeneous graphs are constructed and learned using heterogeneous graph transformers. | ||
| Q15-1022 Experimental results show that by using information from the external corpora, our new models produce significant improvements on topic coherence, document clustering and ***** document classification ***** tasks, especially on datasets with few or short documents. | ||
| L16-1662 It has been evaluated on several NLP tasks: the analogical reasoning task, sentiment analysis, and crosslingual ***** document classification *****. | ||
| health | 43 | |
| D17-2003 Case studies tend to be used in legal, business, and ***** health ***** education contexts, but less in the teaching and learning of linguistics. | ||
| 2021.acl-short.33 A recent study showed that manual summarization of consumer ***** health ***** questions brings significant improvement in retrieving relevant answers. | ||
| W17-3105 Our results indicate that, overall, research participants were enthusiastic about the possibility of using social media (in conjunction with automated Natural Language Processing algorithms) for mood tracking under the supervision of a mental ***** health ***** practitioner. | ||
| 2021.clpsych-1.10 Based on research in mental ***** health ***** studies linking self-harm tendencies with suicide, in our system, we attempt to characterize self-harm aspects expressed in user tweets over a period of time. | ||
| 2021.dash-1.15 We present the Everyday Living Artificial Intelligence (AI) Hub, a novel proof-of-concept framework for enhancing human ***** health ***** and wellbeing via a combination of tailored wearable and Conversational Agent (CA) solutions for non-invasive monitoring of physiological signals, assessment of behaviors through unobtrusive wearable devices, and the provision of personalized interventions to reduce stress and anxiety. | ||
| emotional | 43 | |
| D18-1005 Our system features the use of domain-specific resources automatically derived from a large unlabeled corpus, and contextual representations of the ***** emotional ***** and semantic content of the user's recent tweets as well as their interactions with other users. | ||
| L10-1351 We describe an experimental Wizard-of-Oz setup for the integration of ***** emotional ***** strategies into spoken dialogue management. | ||
| 2021.clpsych-1.6 These lexicons are useful for various psychology applications such as detecting ***** emotional ***** state, well being, relationship quality in conversation, identifying topics (e.g., family, work) and many more. | ||
| 2020.emnlp-main.426 In this paper, we present the first study on modeling the ***** emotional ***** trajectory of the protagonist in neural storytelling. | ||
| L12-1514 The corpus is actually made up of six different subsets of material: a neutral subcorpus, containing emotionless utterances; a 'dialog' subcorpus, containing typical call center utterances; an '***** emotional *****' corpus, a set of sentences representative of pure ***** emotional ***** states; a 'football' subcorpus, including utterances imitating a football broadcasting situation; an 'SMS' subcorpus, including readings of SMS texts; and a 'paralinguistic elements' corpus, including recordings of interjections and paralinguistic sounds uttered in isolation. | ||
| short answer grading | 43 | |
| 2020.lrec-1.321 In this paper, we introduce AR-ASAG, an Arabic Dataset for automatic ***** short answer grading *****. | ||
| W16-4904 We address the problem of automatic ***** short answer grading *****, evaluating a collection of approaches inspired by recent advances in distributional text representations. | ||
| S17-2001 Applications include machine translation (MT), summarization, generation, question answering (QA), ***** short answer grading *****, semantic search, dialog and conversational systems. | ||
| 2021.emnlp-main.487 Automatic ***** short answer grading ***** (ASAG) is the task of assessing students' short natural language responses to objective questions. | ||
| D19-6119 *****Short Answer Grading***** (SAG) is a task of scoring students' answers in examinations. | ||
| cross-lingual representation | 43 | |
| 2020.lrec-1.330 Our results indicate that *****cross-lingual representations***** for OOV words can indeed be formed from sub-word embeddings, including in the case of a truly low-resource morphologically-rich language. | ||
| 2020.findings-emnlp.84 More specifically, we propose a hybrid emoji-based Masked Language Model (MLM) to leverage the common information conveyed by emojis across different languages and improve the learned *****cross-lingual representation***** of short text messages, with the goal to perform zero- shot abusive language detection. | ||
| 2020.emnlp-main.362 Multilingual BERT (mBERT), XLM-RoBERTa (XLMR) and other unsupervised multilingual encoders can effectively learn *****cross-lingual representation*****. | ||
| 2020.blackboxnlp-1.5 Recent works have demonstrated that multilingual BERT (mBERT) learns rich *****cross-lingual representations*****, that allow for transfer across languages. | ||
| 2021.starsem-1.22 *****Cross-lingual representations***** have the potential to make NLP techniques available to the vast majority of languages in the world. | ||
| tree kernel | 43 | |
| S18-1076 The model builds on the distributed tree embedder also known as distributed *****tree kernel*****. | ||
| W16-4829 We proposed three different systems that involved simplistic features, to name: a Naive-bayes system, a Support Vector Machines-based system and a *****Tree Kernel*****-based system. | ||
| W17-5809 In this work, we introduce a novel feature engineering approach named “algebraic invariance” to identify discriminative patterns for learning relation pair features for the chemical-disease relation (CDR) task of BioCreative V. Our method exploits the existing structural similarity of the key concepts of relation descriptions from the CDR corpus to generate robust linguistic patterns for SVM *****tree kernel*****-based learning. | ||
| C16-2042 For text coherence, we use a measure of agreement between a given and consecutive paragraph by *****tree kernel***** learning of their discourse trees. | ||
| L16-1452 We adapt and investigate the effects of two untyped dependency *****tree kernels*****, which have originally been proposed for relation extraction, to the multi-document summarization problem. | ||
| summarize | 42 | |
| 2020.fnp-1.34 With the constantly growing amount of information, the need arises to automatically ***** summarize ***** this written information. | ||
| 2020.eamt-1.64 In this project description, we ***** summarize ***** the main results of the project and present future work. | ||
| L12-1375 We also ***** summarize ***** the procedure of the rich annotation (incl. | ||
| 2021.naacl-industry.14 In this paper, we conduct a deep analysis of a dialogue corpus and ***** summarize ***** three major issues on dialogue translation, including pronoun dropping (), punctuation dropping (), and typos (). | ||
| 2020.conll-1.16 We also ***** summarize ***** the types of errors that we found, and we revisit several recent results in NER in light of the corrected data | ||
| likelihood | 42 | |
| 2021.naacl-main.189 We evaluate bias within pre-trained transformers using three metrics: WEAT, sequence ***** likelihood *****, and pronoun ranking. | ||
| 2021.naacl-main.395 We then propose a new metric based on ***** likelihood ***** scores from a masked language model pretrained on scientific texts. | ||
| L10-1303 Results are ranked by the estimated ***** likelihood ***** that a citation form could be misheard, mistyped, or mistranscribed for the input given by the user. | ||
| L12-1149 All experiments compare the use of two different retrieval models, i.e. Okapi BM25 and a query ***** likelihood ***** language model. | ||
| L16-1170 The features that we utilized account for: ***** likelihood ***** of stems, prefixes, suffixes, and their combination; presence in lexicons containing valid stems and named entities; and underlying stem templates | ||
| Dialect | 42 | |
| W17-1307 In this paper we are interested in the SA of the Tunisian ***** Dialect *****. | ||
| W17-7907 ***** Dialect ***** identification is studied as a subset of the task of language identification. | ||
| W19-1406 Our team achieved first place in German ***** Dialect ***** identification (GDI) and MRC subtasks 2 and 3, second place in the simplified variant of Discriminating between Mainland and Taiwan variation of Mandarin Chinese (DMT) as well as Cuneiform Language Identification (CLI), and third and fifth place in DMT traditional and MRC subtask 1 respectively. | ||
| 2020.vardial-1.22 *****Dialect***** identification represents a key aspect for improving a series of tasks, for example, opinion mining, considering that the location of the speaker can greatly influence the attitude towards a subject. | ||
| 2020.wanlp-1.10 In this paper, we present the experiments conducted, and the models developed by our competing team, Mawdoo3 AI, along the way to achieving our winning solution to subtask 1 of the Nuanced Arabic *****Dialect***** Identification (NADI) shared task. | ||
| Conversation | 42 | |
| 2021.naacl-main.450 We consider the intrinsic evaluation of neural generative dialog models through the lens of Grice's Maxims of ***** Conversation ***** (1975). | ||
| L14-1341 First, the ***** Conversation ***** Speech (CS) component contains free conversations of one hour length between friends, colleagues, couples, or family members. | ||
| 2021.emnlp-main.181 *****Conversation***** disentanglement aims to separate intermingled messages into detached sessions, which is a fundamental task in understanding multi-party conversations. | ||
| L12-1604 In order to understand and model the non-verbal communicative conduct of humans, it seems fruitful to combine qualitative methods (*****Conversation***** Analysis) and quantitative techniques (motion capturing). | ||
| 2021.alta-1.1 *****Conversation***** disentanglement, the task to identify separate threads in conversations, is an important pre-processing step in multi-party conversational NLP applications such as conversational question answering and conversation summarization. | ||
| ICD | 42 | |
| 2021.naacl-main.318 This paper is the first attempt at learning the label set distribution as a reranking module for ***** ICD ***** coding. | ||
| D19-1638 Rather, inferences are made from the input (symptoms specified in ***** ICD ***** codes) to generate the output (instructions). | ||
| 2021.ranlp-1.130 When considering ***** ICD ***** codes grouped into ten blocks, the KB-BERT was superior to the baseline models, obtaining an F1-micro of 0.80 and an F1-macro of 0.58. | ||
| 2021.acl-long.463 Since manual coding is very laborious and prone to errors, many methods have been proposed for the automatic *****ICD***** coding task. | ||
| 2020.clinicalnlp-1.3 *****ICD***** coding is the task of classifying and coding all diagnoses, symptoms and procedures associated with a patient's visit. | ||
| hyperparameter | 42 | |
| 2020.nlp4convai-1.5 We demonstrate the usefulness and wide applicability of the proposed intent detectors, showing that: 1) they outperform intent detectors based on fine-tuning the full BERT-Large model or using BERT as a fixed black-box encoder on three diverse intent detection data sets; 2) the gains are especially pronounced in few-shot setups (i.e., with only 10 or 30 annotated examples per intent); 3) our intent detectors can be trained in a matter of minutes on a single CPU; and 4) they are stable across different ***** hyperparameter ***** settings. | ||
| 2021.smm4h-1.9 In addition, we perform data cleaning and augmentation, as well as ***** hyperparameter ***** optimization and model ensemble to further boost the BERT performance. | ||
| 2021.wanlp-1.44 We also provide the experiments and the ***** hyperparameter ***** tuning that lead to this result. | ||
| 2021.emnlp-main.525 Commonly, rationales are modeled as stochastic binary masks, requiring sampling-based gradient estimators, which complicates training and requires careful ***** hyperparameter ***** tuning. | ||
| 2021.vardial-1.7 We particularly focus on the role of ***** hyperparameter ***** tuning for Hindi based on recommendations made in previous work (on English) | ||
| Causal | 42 | |
| W17-0903 ***** Causal ***** relations play a key role in information extraction and reasoning. | ||
| 2021.naacl-main.155 ***** Causal ***** inference is the process of capturing cause-effect relationship among variables | ||
| 2020.alta-1.9 *****Causal***** relationships form the basis for reasoning and decision-making in Artificial Intelligence systems. | ||
| D18-1488 *****Causal***** understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. | ||
| 2021.emnlp-main.644 *****Causal***** inference using observational text data is becoming increasingly popular in many research areas. | ||
| spontaneous | 42 | |
| W19-4441 We developed an automated oral proficiency scoring system for non-native English speakers' ***** spontaneous ***** speech. | ||
| L14-1141 In this paper we focus on the multilayer annotation process of left periphery structures by using a small sample of highly ***** spontaneous ***** speech in which the distinct types of topic structures are displayed. | ||
| L16-1235 Our methodology is based on an IQ game-like Wizard of Oz experiment to collect ***** spontaneous ***** and implicitly produced gestures in an ecological context. | ||
| C16-1037 Evaluation proves that inter-annotator agreement reaches satisfactory values, from 0.60 to 0.80 Cohen's kappa, while the prosody tagger achieves acceptable recall and f-measure figures for five ***** spontaneous ***** samples used in the evaluation of monologue and dialogue formats in English and Spanish | ||
| L10-1464 This paper introduces a new corpus of consulting dialogues designed for training a dialogue manager that can handle consulting dialogues through *****spontaneous***** interactions from the tagged dialogue corpus. | ||
| affect | 42 | |
| W17-5235 We delve into our feature selection approach for ***** affect ***** intensity, ***** affect ***** presence, sentiment intensity and sentiment presence lexica alongside pre-trained word embeddings, which are utilized to extract emotion intensity signals from tweets in an ensemble learning approach. | ||
| P17-1059 (Long Short-Term Memory) language model for generation of conversational text, conditioned on ***** affect ***** categories. | ||
| S18-1035 The task is ***** affect ***** intensity prediction in tweets, including five subtasks. | ||
| L16-1345 In this paper we present a multimodal database of ***** affect ***** bursts, which are very short non-verbal expressions with facial, vocal, and gestural components that are highly synchronized and triggered by an identifiable event. | ||
| C16-1283 We describe the method for collecting the heterogeneous longitudinal data, how features are extracted to address missing information and differences in temporal alignment, and how the latter are combined to yield promising predictions of ***** affect ***** and well-being on the basis of widely used psychological scales | ||
| bilingual corpora | 42 | |
| W18-3903 Methods to detect them automatically do exist, however they make use of large aligned ***** bilingual corpora *****, which are hard to find and expensive to build, or encounter problems dealing with infrequent words. | ||
| L08-1567 Statistical Machine Translation (SMT) is based on alignment models which learn from ***** bilingual corpora ***** the word correspondences between source and target language. | ||
| 2002.amta-papers.13 One of the problems facing translation systems that automatically extract transfer mappings (rules or examples) from ***** bilingual corpora ***** is the trade-off between contextual specificity and general applicability of the mappings, which typically results in conflicting mappings without distinguishing context | ||
| 2020.wmt-1.66 In this paper, we first take a step back and look at the commonly used ***** bilingual corpora ***** (WMT), and resurface the existence and importance of implicit structure that existed in it: multi-way alignment across examples (the same sentence in more than two languages). | ||
| 2021.naacl-srw.17 In this paper, we propose an approach based on transfer learning to mine parallel sentences in the unsupervised setting.With the help of ***** bilingual corpora ***** of rich-resource language pairs, we can mine parallel sentences without bilingual supervision of low-resource language pairs. | ||
| people | 42 | |
| L12-1611 We focus specifically on speech and gesture interaction which can enhance the quality of lifestyle of ***** people ***** living in assistive environments, be they seniors or ***** people ***** with physical or cognitive disabilities. | ||
| 2020.iwslt-1.33 Though ***** people ***** rarely speak in complete sentences, punctuation confers many benefits to the readers of transcribed speech. | ||
| 2020.lrec-1.550 Our focus is directed at the de-identification of emails where personally identifying information does not only refer to the sender but also to those ***** people *****, locations, dates, and other identifiers mentioned in greetings, boilerplates and the content-carrying body of emails. | ||
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of pretrained language models for emotion recognition in conversations, which is to consider not only previous utterances, but also conversation-related information such as speakers, speech acts and topics. | ||
| 2020.lrec-1.33 For instance, in “The FBI alleged in court documents that Zazi had admitted having a handwritten recipe for explosives on his computer”, do ***** people ***** believe that Zazi had a handwritten recipe for explosives? | ||
| language resource | 42 | |
| L10-1039 We propose a ***** language resource ***** management system, called WordNet Management System (WNMS), as a distributed management system that allows the server to perform the cross language WordNet retrieval, including the fundamental web service applications for editing, visualizing and language processing. | ||
| L16-1533 This paper describes the named entity ***** language resource *****s developed as part of a development project for the South African languages. | ||
| L10-1187 To address this issue a broad alliance of LRT providers (CLARIN, the Linguist List, DOBES, DELAMAN, DFKI, ELRA) have initiated the Virtual Language Observatory portal to provide a low-barrier, easy-to-follow entry point to ***** language resource *****s and tools; it can be accessed via http://www.clarin.eu/vlo | ||
| L06-1163 Tagging as the most crucial annotation of ***** language resource *****s can still be challenging when the corpus size is big and when the corpus data is not homogeneous. | ||
| P19-1077 One of the key steps in *****language resource***** creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation. | ||
| surface | 42 | |
| C16-1078 The system was evaluated using standard IR metrics on the new benchmark, and we saw that lexical-semantical rerankers improve significantly over a purely ***** surface *****-oriented system, but must be carefully tailored for each individual construction. | ||
| 2021.emnlp-main.50 The probing analysis of the features reveals their sensitivity to the ***** surface ***** and syntactic properties. | ||
| 2020.wmt-1.66 In this paper, we first take a step back and look at the commonly used bilingual corpora (WMT), and re***** surface ***** the existence and importance of implicit structure that existed in it: multi-way alignment across examples (the same sentence in more than two languages). | ||
| 1998.amta-papers.36 It may in fact be superior to other approaches in that it can handle target ***** surface *****-structure constraints, variation of syntactic patterns, discourse-structure constraints, and stylistic preference. | ||
| 2021.naacl-main.242 In this paper, we explore text classification with extremely weak supervision, i.e., only relying on the ***** surface ***** text of class names. | ||
| universal sentence encoder | 42 | |
| 2020.trac-1.16 we have developed a system based on transfer learning technique depending on ***** universal sentence encoder ***** (USE) embedding that will be trained in our developed model using xgboost classifier to identify the aggressive text data from English content. | ||
| 2020.emnlp-main.18 We experimented with KERMIT paired with two state-of-the-art transformer-based ***** universal sentence encoder *****s (BERT and XLNet) and we showed that KERMIT can indeed boost their performance by effectively embedding human-coded universal syntactic representations in neural networks | ||
| 2021.acl-long.72 When used to extend the pretraining of transformer-based language models, our approach closes the performance gap between unsupervised and supervised pretraining for ***** universal sentence encoder *****s. | ||
| W19-5053 For the RQE task, we trained a traditional multilayer perceptron network based on embeddings generated by the ***** universal sentence encoder *****. | ||
| R19-1157 Our approach shows improvement by using a transformer model and deep averaging network-based ***** universal sentence encoder ***** compared to previous solutions. | ||
| word vector space | 42 | |
| 2020.acl-main.337 This paper presents an investigation on the distribution of word vectors belonging to a certain word class in a pre-trained ***** word vector space *****. | ||
| 2020.wanlp-1.17 Recent work has shown that distributional ***** word vector space *****s often encode human biases like sexism or racism. | ||
| K18-1021 Most recent approaches to bilingual dictionary induction find a linear alignment between the ***** word vector space *****s of two languages. | ||
| W19-2001 Distributed ***** word vector space *****s are considered hard to interpret which hinders the understanding of natural language processing (NLP) models. | ||
| E17-2065 We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual ***** word vector space *****. | ||
| GitHub | 41 | |
| 2021.naacl-main.307 We will open all our source code on ***** GitHub *****. | ||
| K18-2017 Based entirely on recurrent neural networks, written in Python, this ready-to-use open source system is freely available on ***** GitHub *****. | ||
| 2021.acl-demo.11 Source code, datasets and pre-trained models are publicly available at ***** GitHub *****, with a short instruction video. | ||
| C18-1312 Source code and trained models for the bot#1337 are available on ***** GitHub *****. | ||
| 2020.msr-1.2 Both components of our system are available on ***** GitHub ***** under an MIT license | ||
| acceptability | 41 | |
| 2021.emnlp-main.74 For ***** acceptability ***** judgments, we find clearer evidence that non-uniformity in information density is predictive of lower ***** acceptability *****. | ||
| L12-1101 Training a supervised lexical substitution system on a smaller version of the resource resulted in well over 90% ***** acceptability ***** for lexical substitutions provided by the system. | ||
| N19-2027 While ***** acceptability ***** includes grammatical correctness and semantic correctness, we focus only on grammaticality classification in this paper, and show that existing datasets for grammatical error correction don't correctly capture the distribution of errors that data-driven generators are likely to make. | ||
| 2020.emnlp-main.389 Motivated by this idea, we design an unsupervised parser by specifying a set of transformations and using an unsupervised neural ***** acceptability ***** model to make grammaticality decisions. | ||
| R19-1078 The implementation formalizes a diverse fragment of NL, with NLC expressions type checking and failing to type check in exactly the same ways that NL expressions pass and fail their ***** acceptability ***** tests | ||
| deriving | 41 | |
| Q16-1010 However, the lack of a type system makes a formal mechanism for ***** deriving ***** logical forms from dependency structures challenging. | ||
| L16-1529 We describe a system for ***** deriving ***** an inferred capitalization value from closely related languages by phonological similarity, and illustrate the system using several related Western Iranian languages. | ||
| 2020.emnlp-main.354 This is not surprising as the learning signal is likely insufficient for ***** deriving ***** all aspects of phrase-structure syntax and gradient estimates are noisy. | ||
| 2020.emnlp-main.82 In this paper, we propose to generate diverse translations by ***** deriving ***** a large number of possible models with Bayesian modelling and sampling models from them for inference. | ||
| L14-1455 An experiment is presented to induce a set of polysemous basic type alternations (such as Animal-Food, or Building-Institution) by ***** deriving ***** them from the sense alternations found in an existing lexical resource | ||
| psycholinguistics | 41 | |
| 2020.coling-main.553 In this survey, we show the trajectory of research towards automatic personality detection from purely psychology approaches, through ***** psycholinguistics *****, to the recent purely natural language processing approaches on large datasets automatically extracted from social media. | ||
| Q13-1010 In addition to whole-sentence F-score, we also evaluate the partial trees that the parser constructs for sentence prefixes; partial trees play an important role in incremental interpretation, language modeling, and ***** psycholinguistics *****. | ||
| E17-2020 We assess the lexical capacity of a network using the lexical decision task common in ***** psycholinguistics *****: the system is required to decide whether or not a string of characters forms a word. | ||
| 2020.acl-main.462 We first define this phenomenon more precisely, drawing on considerable prior work in theoretical cognitive semantics and ***** psycholinguistics *****. | ||
| E17-2090 Inferring the emotional content of words is important for text-based sentiment analysis, dialogue systems and ***** psycholinguistics *****, but word ratings are expensive to collect at scale and across languages or domains | ||
| OOD | 41 | |
| 2020.emnlp-main.102 Our experiments demonstrate that the proposed method outperforms existing calibration methods for text classification in terms of expectation calibration error, misclassification detection, and ***** OOD ***** detection on six datasets. | ||
| 2021.naacl-main.447 Previous unsupervised ***** OOD ***** detection methods only extract discriminative features of different in-domain intents while supervised counterparts can directly distinguish ***** OOD ***** and in-domain intents but require extensive labeled ***** OOD ***** data. | ||
| 2021.acl-long.190 This paper proposes to train a model with only IND data while supporting both IND intent classification and ***** OOD ***** detection. | ||
| 2021.emnlp-main.84 These ***** OOD ***** instances can then be accurately detected using the Mahalanobis distance in the model's penultimate layer. | ||
| 2020.findings-emnlp.225 While we substantially reduce the gap between in-distribution and ***** OOD ***** generalization, performance on ***** OOD ***** compositions is still substantially lower | ||
| LREC | 41 | |
| R19-1089 With recent efforts in drawing attention to the task of replicating and/or reproducing results, for example in the context of COLING 2018 and various ***** LREC ***** workshops, the question arises how the NLP community views the topic of replicability in general. | ||
| 2020.lrec-1.423 This latest in a series of Linguistic Data Consortium (LDC) progress reports to the ***** LREC ***** community does not describe any single language resource, evaluation campaign or technology but sketches the activities, since the last report, of a data center devoted to supporting the work of ***** LREC ***** attendees among other research communities. | ||
| L14-1688 This paper emphasises ELRA's contribution to the HLT field thanks to the consolidation of its services since ***** LREC ***** 2012. | ||
| L06-1145 The STEVIN programme, which will run from 2004 to 2009, resulted from HLT activities in the Dutch language area, which were reported on at previous ***** LREC ***** conferences (2000, 2002, 2004). | ||
| L10-1253 The Map has been developed on the basis of the information provided by ***** LREC ***** authors during the submission of papers to the ***** LREC ***** 2010 conference and the ***** LREC ***** workshops, and contains information about almost 2000 resources | ||
| positional | 41 | |
| 2020.findings-emnlp.49 However, recent works have shown that most attention heads learn simple, and often redundant, ***** positional ***** patterns. | ||
| 2021.acl-short.18 Proof-of-concept experiments show that it improves on regular ALBERT on GLUE tasks, while only adding orders of magnitude less ***** positional ***** parameters. | ||
| P19-1030 Ablation studies to find whether ***** positional ***** information is inherently encoded in the trees and which type of attention is suitable for doing the recursive traversal are provided. | ||
| 2021.eacl-main.136 In this paper, we present KPRank, an unsupervised graph-based algorithm for keyphrase extraction that exploits both ***** positional ***** information and contextual word embeddings into a biased PageRank. | ||
| 2021.acl-long.201 Meanwhile, it also integrates a spatial-aware self-attention mechanism into the Transformer architecture so that the model can fully understand the relative ***** positional ***** relationship among different text blocks | ||
| MEDIQA | 41 | |
| W19-5040 This paper describes the models designated for the ***** MEDIQA ***** 2019 shared tasks by the team PANLP. | ||
| W19-5056 This paper presents the experiments accomplished as a part of our participation in the ***** MEDIQA ***** challenge, an (Abacha et al., 2019) shared task. | ||
| W19-5044 We train our models on the MedNLI dataset, yielding the best performance on the test set of the ***** MEDIQA ***** 2019 Task 1. | ||
| W19-5058 This study describes the model design of the NCUEE system for the *****MEDIQA***** challenge at the ACL-BioNLP 2019 workshop. | ||
| 2021.bionlp-1.30 This study describes the model design of the NCUEE-NLP system for the *****MEDIQA***** challenge at the BioNLP 2021 workshop. | ||
| Natural | 41 | |
| L08-1304 We report about a project which brings together ***** Natural ***** Language Processing and eLearning. | ||
| L16-1684 We locate our research in the context of Digital Humanities where the non-canonical nature of text causes issues facing an ***** Natural ***** Language Processing world in which tools are mainly trained on standard data. | ||
| 2020.lrec-1.454 However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several ***** Natural ***** Language Processing (NLP) tasks. | ||
| 2016.gwc-1.18 Although there are currently several versions of Princeton WordNet for different languages, the lack of development of some of these versions does not make it possible to use them in different ***** Natural ***** Language Processing applications | ||
| 2020.acl-main.599 *****Natural***** Questions is a new challenging machine reading comprehension benchmark with two-grained answers, which are a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer). | ||
| figurative | 41 | |
| 2021.semeval-1.152 In writing, humor is mainly based on ***** figurative ***** language in which words and expressions change their conventional meaning to refer to something without saying it directly. | ||
| L10-1636 Figurative motion events are extracted into the same event structure but are marked as ***** figurative ***** in the corpus. | ||
| 2020.coling-main.61 Our results show that modeling ***** figurative ***** usage can demonstrably improve the model's robustness and reliability for distinguishing the depression symptoms. | ||
| L14-1577 We present an approach to mining online forums for ***** figurative ***** language such as metaphor | ||
| 2020.acl-main.403 Parody is a *****figurative***** device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts. | ||
| factual | 41 | |
| 2021.acl-long.536 We first propose an efficient automatic evaluation metric to measure ***** factual ***** consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. | ||
| 2020.emnlp-main.265 Therefore, we introduce a novel self-supervised contrastive learning mechanism to learn the relationship between original samples, ***** factual ***** samples and counter***** factual ***** samples. | ||
| P19-3026 While the last several years have witnessed a substantial growth in interests and efforts in the area of computational fact-checking, ClaimPortal is a novel infrastructure in that fact-checkers have largely skipped ***** factual ***** claims in tweets. | ||
| 2021.naacl-main.58 Empirical results show that the fact-aware summarization can produce abstractive summaries with higher ***** factual ***** consistency compared with existing systems, and the correction model improves the ***** factual ***** consistency of given summaries via modifying only a few keywords. | ||
| 2020.aacl-main.74 Considering the problem of information ambiguity and incompleteness for short text, two kinds of knowledge, ***** factual ***** knowledge graph and conceptual knowledge graph, are introduced to provide additional knowledge for the semantic matching between candidate entity and mention context | ||
| Data | 41 | |
| 2020.findings-emnlp.130 *****Data***** balancing is a known technique for improving the performance of classification tasks. | ||
| L14-1107 *****Data***** annotation in modern practice often involves multiple, imperfect human annotators. | ||
| L06-1210 *****Data***** sparsity is a large problem in natural language processing that refers to the fact that language is a system of rare events, so varied and complex, that even using an extremely large corpus, we can never accurately model all possible strings of words. | ||
| 2021.acl-srw.1 *****Data***** processing is an important step in various natural language processing tasks. | ||
| 2021.iwslt-1.23 *****Data***** augmentation, which refers to manipulating the inputs (e.g., adding random noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning. | ||
| output | 41 | |
| W18-5415 One method uses a CNN to encode the text, an adversarial objective function to control for confounders, and projects its weights onto its activations to interpret the importance of each phrase towards each ***** output ***** class. | ||
| S18-1006 The last LSTM layer will ***** output ***** the hidden representations of texts, and they will be used in three classification tasks. | ||
| L10-1357 In particular, a reasonable measure should ***** output ***** higher values for 2009 than for 2008. | ||
| N18-1128 We discover a simple hands-on principle: in a multi-layer input embedding model, layers should be tied consecutively bottom-up if reused at ***** output *****. | ||
| L16-1731 Metonymic expressions need to be correctly detected and interpreted because sentences including such expressions have different meanings from literal ones; computer systems may ***** output ***** inappropriate results in natural language processing | ||
| lexical simplification | 41 | |
| 2020.lrec-1.428 We present an unsupervised method for ***** lexical simplification ***** of complex Urdu text. | ||
| W16-4912 Japanese ***** lexical simplification ***** is the task of replacing difficult words in a given sentence to produce a new sentence with simple words without changing the original meaning of the sentence. | ||
| C16-2020 Given an input sentence, the editor performs both syntactic and ***** lexical simplification *****. | ||
| C18-1021 As a first step towards personalized simplification, we propose a framework for adaptive ***** lexical simplification ***** and introduce Lexi, a free open-source and easily extensible tool for adaptive, personalized text simplification. | ||
| 2020.coling-main.118 Moreover, we show consistent gains on 3 benchmarks for ***** lexical simplification *****, a task where knowledge about word-level semantic similarity is paramount, as well as large gains on lexical reasoning probes. | ||
| slot filling | 41 | |
| C18-1305 Compared with naive translation, our proposed method improves domain classification accuracy by relatively 22%, and the ***** slot filling ***** F1 score by relatively more than 71%. | ||
| 2021.emnlp-main.297 Among them, dense phrase retrieval—the most fine-grained retrieval unit—is appealing because phrases can be directly used as the output for question answering and ***** slot filling ***** tasks. | ||
| N18-2093 In this paper, we present a novel approach TypeSQL which formats the problem as a ***** slot filling ***** task in a more reasonable way. | ||
| 2020.lrec-1.852 This paper describes our developing dataset of Japanese ***** slot filling ***** quizzes designed for evaluation of machine reading comprehension. | ||
| 2020.emnlp-main.490 In this paper, we propose a robust adversarial model-agnostic ***** slot filling ***** method that explicitly decouples local semantics inherent in open-vocabulary slot words from the global context. | ||
| university | 41 | |
| 2020.peoples-1.3 This paper explores the relationship between gender, age and Big Five personality traits of 179 ***** university ***** students from Germany and their Instagram images. | ||
| C18-1038 Answering questions from ***** university ***** admission exams (Gaokao in Chinese) is a challenging AI task since it requires effective representation to capture complicated semantic relations between questions and answers. | ||
| L08-1255 We implemented the proposed method into an application system, and are now operating the system at several ***** university ***** libraries in Japan. | ||
| L14-1235 in an academic domain ontology, classes like Professor, Department could be organization (***** university *****) specific, while Conference, Programming languages are organization independent. | ||
| 2020.lrec-1.343 Our results show that the data collected from low-income participants is of comparable quality to the data collected from ***** university ***** students (who are typically employed to do this work) and that crowdsourcing speech data from low-income rural and urban workers is a viable method of gathering speech data. | ||
| structured data | 41 | |
| 2020.ldl-1.3 The increasing recognition of the utility of Linked Data as a means of publishing lexical resource has helped to underline the need for RDF based data models which have the flexibility and expressivity to be able to represent the most salient kinds of information contained in such resources as ***** structured data *****, including, notably, information relating to time and the temporal dimension. | ||
| D19-5601 Second, we describe the results of the two shared tasks 1) efficient neural machine translation (NMT) where participants were tasked with creating NMT systems that are both accurate and efficient, and 2) document generation and translation (DGT) where participants were tasked with developing systems that generate summaries from ***** structured data *****, potentially with assistance from text in another language. | ||
| 2019.icon-1.11 Increased internet bandwidth at low cost is leading to the creation of large volumes of un***** structured data *****. | ||
| W18-6505 Learning to generate fluent natural language from ***** structured data ***** with neural networks has become an common approach for NLG. | ||
| D19-5631 However, generation of long descriptive summaries conditioned on ***** structured data ***** remains an open challenge. | ||
| social networks | 41 | |
| Q14-1024 Such evaluations can be analyzed separately using signed ***** social networks ***** and textual sentiment analysis, but this misses the rich interactions between language and social context. | ||
| 2020.sustainlp-1.17 Thus, there is a significant opportunity to deploy NLP in myriad applications to help web users, ***** social networks *****, and businesses. | ||
| 2021.eacl-main.31 Given the rapid, widespread dissemination of information in ***** social networks *****, manually detecting suspicious news is sub-optimal. | ||
| 2020.coling-main.51 In particular, Arabizi has recently emerged as the Arabic language in online ***** social networks *****, becoming of great interest for opinion mining and sentiment analysis. | ||
| W16-3915 In *****social networks***** services like Twitter, users are overwhelmed with a huge amount of social data, most of which are short, unstructured and highly noisy. | ||
| translation shared | 41 | |
| 2020.wmt-1.22 In this paper, we introduced our joint team SJTU-NICT's participation in the WMT 2020 machine ***** translation shared ***** task. | ||
| 2020.wmt-1.95 This paper describes the machine translation systems developed by the University of Sheffield (UoS) team for the biomedical ***** translation shared ***** task of WMT20. | ||
| 2020.findings-emnlp.375 Recent machine ***** translation shared ***** tasks have shown top-performing systems to tie or in some cases even outperform human translation. | ||
| W18-6428 These systems were used to participate in the WMT18 news ***** translation shared ***** task and more specifically, for the unsupervised learning sub-track. | ||
| 2020.wmt-1.86 This paper describes LIMSI's submissions to the ***** translation shared ***** tasks at WMT'20. | ||
| conditional random | 41 | |
| 2020.lrec-1.361 We implement several baseline approaches of ***** conditional random ***** field (CRF) and recent popular state-of-the-art bi-directional long-short term memory (Bi-LSTM) models. | ||
| L08-1291 At the core of ParsCit is a trained ***** conditional random ***** field (CRF) model used to label the token sequences in the reference string. | ||
| L06-1069 This paper presents a framework for Thai morphological analysis based on the theoretical background of ***** conditional random ***** fields. | ||
| L16-1178 We provide a strong baseline with a linear-chain ***** conditional random ***** field and word-embedding features with a performance of 0.62 for aspect detection and 0.63 for the extraction of subjective phrases. | ||
| 2020.figlang-1.27 In this paper we present a novel resource-inexpensive architecture for metaphor detection based on a residual bidirectional long short-term memory and ***** conditional random ***** fields. | ||
| users | 41 | |
| L14-1708 Workflow languages focus on expressive power of the languages to describe variety of workflow patterns to meet ***** users *****' needs. | ||
| 2021.emnlp-main.143 ***** users *****' implicit feedback). | ||
| 2020.acl-main.5 However, in multi-domain scenarios, ellipsis and reference are frequently adopted by ***** users ***** to express values that have been mentioned by slots from other domains. | ||
| 2020.lrec-1.61 We assessed the ***** users *****' subjective cognitive load and their satisfaction in different questionnaires during the interaction with both PA variants. | ||
| D18-1005 Our system features the use of domain-specific resources automatically derived from a large unlabeled corpus, and contextual representations of the emotional and semantic content of the user's recent tweets as well as their interactions with other ***** users *****. | ||
| event coreference | 41 | |
| L14-1646 The ECB corpus is one of the data sets used for evaluation of the task of ***** event coreference ***** resolution. | ||
| 2021.naacl-main.356 We propose a neural ***** event coreference ***** model in which ***** event coreference ***** is jointly trained with five tasks: trigger detection, entity coreference, anaphoricity determination, realis detection, and argument extraction. | ||
| 2020.aespen-1.7 The multi-task convolutional neural network is shown to be capable of recognizing events and ***** event coreference *****s given the headlines' texts and publication dates. | ||
| N18-2055 Our empirical experiments, using gold ***** event coreference ***** relations, have shown that the central event of a document can be well identified by mining properties of ***** event coreference ***** chains. | ||
| L14-1099 We have made SinoCoreferencer publicly available, in hope to facilitate the development of high-level Chinese natural language applications that can potentially benefit from ***** event coreference ***** information. | ||
| toxic | 41 | |
| 2020.trac-1.4 The contribution of this paper is the design of binary classification and regression-based approaches aiming to predict whether a comment is ***** toxic ***** or not. | ||
| 2020.aacl-main.91 In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated as either ***** toxic ***** or non-***** toxic ***** or in different types of ***** toxic *****ity. | ||
| 2021.semeval-1.26 Semeval-2021, Task 5 - Toxic Spans Detection is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset and is the first language ***** toxic *****ity detection task dedicated to identifying the ***** toxic *****ity-level spans. | ||
| 2021.semeval-1.132 The post-processing steps involve (1) labeling character offsets between consecutive ***** toxic ***** tokens as ***** toxic ***** and (2) assigning a ***** toxic ***** label to words that have at least one token labeled as ***** toxic *****. | ||
| 2021.semeval-1.120 The SemEval 2021 task 5: Toxic Spans Detection is a task of identifying considered-***** toxic ***** spans in text, which provides a valuable, automatic tool for moderating online contents. | ||
| word sense alignment | 41 | |
| 2020.globalex-1.14 (Ahmadi et al., 2020) There are three different angles from which the problem of ***** word sense alignment ***** can be addressed: approaches based on the similarity of textual descriptions of word senses, approaches based on structural properties of lexical-semantic resources, and a combination of both. | ||
| Q13-1013 In this paper, we present Dijkstra-WSA, a novel graph-based algorithm for ***** word sense alignment *****. | ||
| 2020.ldl-1.7 This paper reports on an ongoing task of monolingual ***** word sense alignment ***** in which a comparative study between the Portuguese Academy of Sciences Dictionary and the Dicionärio Aberto is carried out in the context of the ELEXIS (European Lexicographic Infrastructure) project. | ||
| 2020.globalex-1.13 In this paper we describe the system submitted to the ELEXIS Monolingual *****Word Sense Alignment***** Task. | ||
| L14-1458 At this stage of development, we will report on two tasks, namely *****word sense alignment***** with MultiWordNet and automatic acquisition of Verb Shallow Frames from sense annotated data in the MultiSemCor corpus. | ||
| global | 41 | |
| 2020.sdp-1.11 While most previous approaches represent context using solely text surrounding the citation, we propose enhancing context representation with ***** global ***** information. | ||
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one ***** global ***** set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an attention mechanism to capture the interactive semantic relations in-between to enforce our framework to be attribute comprehensive. | ||
| 2020.acl-main.186 We extend local tree-based loss functions with terms that provide ***** global ***** supervision and show how to optimize them end-to-end. | ||
| 2005.mtsummit-swtmt.1 The bottleneck has been the engineering of sufficiently comprehensive bodies of relevant knowledge The Semantic Web offers opportunities for the gradual evolution of a ***** global ***** heterogeneous knowledge base. | ||
| 2020.globalex-1.8 To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with the information from a wordnet. | ||
| paraphrase database | 41 | |
| W17-5603 In this work, we propose to take advantage of large-scaled ***** paraphrase database ***** and present a pairwise-GRU framework to generate compositional phrase representations. | ||
| L14-1520 We release a massive expansion of the ***** paraphrase database ***** (PPDB) that now includes a collection of paraphrases in 23 different languages. | ||
| W18-3912 This paper presents a methodology to extract a ***** paraphrase database ***** for the European and Brazilian varieties of Portuguese, and discusses a set of paraphrastic categories of multiwords and phrasal units, such as the compounds “toda a gente” versus “todo o mundo” `everybody' or the gerundive constructions [estar a + V-Inf] versus [ficar + V-Ger] (e.g., “estive a observar” \| “fiquei observando” `I was observing'), which are extremely relevant to high quality paraphrasing. | ||
| N18-1069 Experiments show that our method can automatically detect various paraphrases that are absent from existing ***** paraphrase database *****s. | ||
| D19-5307 We further show that the majority of the identified paraphrases are domain-specific and thus complement existing ***** paraphrase database *****s. | ||
| phrase representation | 41 | |
| E17-1066 The key to reach this observation lies in phrase detection, *****phrase representation*****, phrase alignment, and more importantly how to connect those aligned phrases of different matching degrees with the final classifier. | ||
| W17-5603 Learning *****phrase representations***** has been widely explored in many Natural Language Processing tasks (e.g., Sentiment Analysis, Machine Translation) and has shown promising improvements. | ||
| E17-1006 We use CBOW word embeddings to represent word meaning and learn a compositionality function that combines the individual constituents into a *****phrase representation*****, thus capturing the compositional attribute meaning. | ||
| C16-1235 We propose an Attention-based Deep Distance Metric Learning (ADDML) method, by considering aspect *****phrase representation***** as well as context representation. | ||
| 2021.acl-long.518 We present an effective method to learn *****phrase representations***** from the supervision of reading comprehension tasks, coupled with novel negative sampling methods. | ||
| raw | 41 | |
| 2018.gwc-1.32 In this paper, we combine methods to estimate sense rankings from *****raw***** text with recent work on word embeddings to provide sense ranking estimates for the entries in the Open Multilingual WordNet (OMW). | ||
| P19-1525 Relation Extraction is the task of identifying entity mention spans in *****raw***** text and then identifying relations between pairs of the entity mentions. | ||
| 2020.msr-1.2 We present a system for mapping Universal Dependency structures to *****raw***** text which learns to restore word order by training an Interpreted Regular Tree Grammar (IRTG) that establishes a mapping between string and graph operations. | ||
| 2021.emnlp-main.810 We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the *****raw***** text with entity types in a knowledge base. | ||
| 2020.acl-main.669 Unsupervised relation extraction (URE) extracts relations between named entities from *****raw***** text without manually-labelled data and existing knowledge bases (KBs). | ||
| Typically | 40 | |
| 2020.lrec-1.252 ***** Typically *****, the movement is captured as precise geographic coordinates and time stamps with Global Positioning Systems (GPS). | ||
| 2020.coling-main.545 ***** Typically *****, they do not consider in what aspects two documents are similar. | ||
| 2020.emnlp-main.105 ***** Typically *****, machine learning systems solve new tasks by training on thousands of examples. | ||
| N19-1249 ***** Typically *****, auxiliary tasks are selected specifically in order to improve the performance of a target task. | ||
| 2021.acl-long.470 ***** Typically ***** these systems are trained by fine-tuning a large pre-trained model to the target task | ||
| equivalence | 40 | |
| P17-2088 We show the ***** equivalence ***** of two state-of-the-art models for link prediction/knowledge graph completion: Nickel et al's holographic embeddings and Trouillon et al.'s complex embeddings. | ||
| J17-2001 Our discriminative model is capable of combining a wide variety of features that individually provide only weak indications of translation ***** equivalence *****. | ||
| D19-1677 Our EQ_Reg model essentially softens the ***** equivalence ***** of two regular expressions when used as a reward function. | ||
| L10-1586 The underlying idea behind the ontology-based lexicon is its organization via two semantic relations - ***** equivalence ***** and subsumption. | ||
| 2020.lrec-1.398 The paper presents a dataset of 11,000 Polish-English translational equivalents in the form of pairs of plWordNet and Princeton WordNet lexical units linked by three types of ***** equivalence ***** links: strong ***** equivalence *****, regular ***** equivalence *****, and weak ***** equivalence ***** | ||
| syllable | 40 | |
| L14-1730 Annotation results at sentence, word, ***** syllable ***** and phoneme levels are stored in XML format. | ||
| 2018.jeptalnrecital-long.4 A rich literature explores unsupervised segmentation algorithms infants could use to parse their input, mainly focusing on English, an analytic language where word, morpheme, and ***** syllable ***** boundaries often coincide. | ||
| L10-1150 Automatic system output and expert's syllabification are in agreement for most of ***** syllable ***** boundaries in our corpus. | ||
| P17-2046 The best result combines logogram features from Chinese and Japanese with ***** syllable ***** features from Korean, providing an additional 3.0 points f-score when added to state-of-the-art generalisation features on the TAC KBP 2015 Event Nugget task. | ||
| L12-1263 However, the main arising challenge is the concatenation of recognised ***** syllable *****s into the originally spoken sentence or phrase, particularly in the presence of ***** syllable ***** recognition mistakes | ||
| BioASQ | 40 | |
| 2020.conll-1.23 We quantitatively and qualitatively demonstrate that our proposed method learns a context sensitive and spatially aware mapping, in both the inter-organ and intra-organ sense, using a large scale medical text dataset from the “Large-scale online biomedical semantic indexing” track of the 2020 ***** BioASQ ***** challenge. | ||
| W17-2344 In addition to the ***** BioASQ ***** evaluation, we compared our system to other on-line biomedical QA systems in terms of the response time and the quality of the answers. | ||
| W17-2309 We pre-trained the model on a large-scale open-domain QA dataset, SQuAD, and then fine-tuned the parameters on the ***** BioASQ ***** training set. | ||
| W16-5105 We evaluate our system on a dataset of 1009 human written summaries provided by ***** BioASQ ***** and on 1974 gene summaries, fetched from the Entrez Gene database. | ||
| W18-5305 In this work, we describe the system used in the ***** BioASQ ***** Challenge task 6b for document retrieval and snippet retrieval (with particular emphasis in this subtask) | ||
| Indic | 40 | |
| P18-3021 A comparative evaluation shows that our model is competitive with the existing spell checking and correction techniques for ***** Indic ***** languages. | ||
| 2021.mrl-1.14 Our results show that the token-level and sentence-level representations from the ***** Indic ***** language models (***** Indic *****BERT and MuRIL) do not capture the syntax in ***** Indic ***** languages as efficiently as the other highly multilingual language models. | ||
| 2020.findings-emnlp.445 We also include publicly available datasets for some ***** Indic ***** languages for tasks like Named Entity Recognition, Cross-lingual Sentence Retrieval, Paraphrase detection, etc | ||
| 2021.wat-1.30 This paper describes ANVITA-1.0 MT system, architected for submission to WAT2021 MultiIndicMT shared task by mcairt team, where the team participated in 20 translation directions: English-Indic and Indic-English; *****Indic***** set comprised of 10 Indian languages. | ||
| 2020.nlposs-1.10 We present iNLTK, an open-source NLP library consisting of pre-trained language models and out-of-the-box support for Data Augmentation, Textual Similarity, Sentence Embeddings, Word Embeddings, Tokenization and Text Generation in 13 *****Indic***** Languages. | ||
| normalized | 40 | |
| L08-1031 Traditional Authorship Attribution models extract ***** normalized ***** counts of lexical elements such as nouns, common words and punctuation and use these ***** normalized ***** counts or ratios as features for author fingerprinting. | ||
| 2020.lrec-1.331 We introduce a dictionary containing ***** normalized ***** forms of common words in various Swiss German dialects into High German. | ||
| 2020.findings-emnlp.406 Structured prediction is often approached by training a locally ***** normalized ***** model with maximum likelihood and decoding approximately with beam search. | ||
| L10-1408 The challenges and methods applied to obtain similar prompts in terms of complexity and semantics across different languages, as well as the ***** normalized ***** recording procedures employed at different locations, are covered. | ||
| I17-1006 The second approach is based on the idea of constrained decoding for three parsers, i.e., a traditional linear graph-based parser (LGPar), a globally ***** normalized ***** neural network transition-based parser (GN3Par) and a traditional linear transition-based parser (LTPar) | ||
| absolute | 40 | |
| 2021.bppf-1.2 This bias, in effect, leads to poor performance on data without this bias: a preference elicitation architecture based on BERT suffers a 5.3% ***** absolute ***** drop in performance, when like is replaced with a synonymous phrase, and a 13.2% drop in performance when evaluated on out-of-sample data. | ||
| D19-6220 On the MIMIC-III dataset we achieve a 2.7% ***** absolute ***** (11% relative) improvement from 0.218 to 0.245 macro-F1 score compared to the previous state of the art across 3,912 codes. | ||
| 2020.emnlp-main.468 We further apply our methodology to SQuAD2.0 and show a 2.8 ***** absolute ***** gain on EM score compared to prior work using synthetic data. | ||
| 2020.acl-main.208 By using semantic scaffolds during inference, we achieve a 10% ***** absolute ***** improvement in top-100 accuracy over the previous state-of-the-art. | ||
| N18-2106 Our approach yields an 8% ***** absolute ***** improvement in performance over a competitive information-retrieval baseline on a novel dataset of plot summaries of 577 movie remakes from Wikipedia | ||
| subjective | 40 | |
| 2021.naacl-main.228 Existing topic models when applied to reviews may extract topics associated with writers' ***** subjective ***** opinions mixed with those related to factual descriptions such as plot summaries in movie and book reviews. | ||
| L08-1589 In this paper, we begin to reconcile the ***** subjective ***** and automated scores that underlie these correlations by explicitly grounding MT output with its Reference Translation (RT) prior to ***** subjective ***** or automated evaluation. | ||
| L12-1330 In both tasks the answers are unique, which eliminates the uncertainty usually present in ***** subjective ***** tasks, where it is not clear whether the unexpected answer is caused by a lack of worker's motivation, the worker's interpretation of the task or genuine ambiguity. | ||
| 2021.emnlp-main.799 In this paper, we address both research questions with real and simulated word-level QE, visualizations, and user studies, where time, ***** subjective ***** ratings, and quality of the final translations are assessed. | ||
| 2021.naacl-main.169 We annotate 17,000 SNS posts with both the writer's ***** subjective ***** emotional intensity and the reader's objective one to construct a Japanese emotion analysis dataset | ||
| graph convolutional | 40 | |
| 2021.wanlp-1.27 In particular, this work exploits the existence of the syntactic connections between the words in the dependency trees as the anchor knowledge to transfer the representation learning across languages for CEAE models (i.e., via ***** graph convolutional ***** neural networks – GCNs). | ||
| P19-1131 To tackle the joint type inference task, we propose a novel ***** graph convolutional ***** network (GCN) running on an entity-relation bipartite graph. | ||
| 2020.acl-main.297 We also propose dependency ***** graph convolutional ***** networks (DEPGCN) to encode parser information at different processing levels. | ||
| 2021.acl-long.440 In order to explore a more effective way of utilizing both multimodal and long-distance contextual information, we propose a new model based on multimodal fused ***** graph convolutional ***** network, MMGCN, in this work. | ||
| D19-6204 We propose a novel ***** graph convolutional ***** networks model that incorporates dependency parsing and contextualized embedding to effectively capture comprehensive contextual information | ||
| Irony | 40 | |
| S18-1102 ***** Irony ***** detection is a key task for many natural language processing works. | ||
| 2020.lrec-1.346 ***** Irony ***** is a linguistic device used to intend an idea while articulating an opposing expression. | ||
| S18-1095 This paper describes the ***** Irony ***** detection system that participates in SemEval-2018 Task 3: ***** Irony ***** detection in English tweets. | ||
| S18-1094 The objective of this paper is to provide a description for a system built as our participation in SemEval-2018 Task 3 on *****Irony***** detection in English tweets. | ||
| S18-1099 This paper describes the KLUEnicorn system submitted to the SemEval-2018 task on *****Irony***** detection in English tweets. | ||
| Suggestion | 40 | |
| S19-2152 This paper presents our system to the SemEval-2019 Task 9, ***** Suggestion ***** Mining from Online Reviews and Forums. | ||
| S19-2222 ***** Suggestion ***** mining task aims to extract tips, advice, and recommendations from unstructured text. | ||
| S19-2221 We present a system for cross-domain suggestion mining, prepared for the SemEval-2019 Task 9: *****Suggestion***** Mining from Online Reviews and Forums (Subtask B). | ||
| S19-2220 This paper describes our system, Joint Encoders for Stable Suggestion Inference (JESSI), for the SemEval 2019 Task 9: *****Suggestion***** Mining from Online Reviews and Forums. | ||
| S19-2151 We present the pilot SemEval task on *****Suggestion***** Mining. | ||
| Aggression | 40 | |
| W18-4406 In this paper, we describe the system submitted for the shared task on ***** Aggression ***** Identification in Facebook posts and comments by the team Nishnik. | ||
| W18-4418 This system description paper presents our submission to the First Shared Task on ***** Aggression ***** Identification. | ||
| W18-4401 In this paper, we present the report and findings of the Shared Task on ***** Aggression ***** Identification organised as part of the First Workshop on Trolling, ***** Aggression ***** and Cyberbullying (TRAC - 1) at COLING 2018. | ||
| W18-4407 This paper describes the work that our team bhanodaig did at Indian Institute of Technology (ISM) towards TRAC-1 Shared Task on ***** Aggression ***** Identification in Social Media for COLING 2018 | ||
| W18-4408 This paper describes our system submitted in the shared task at COLING 2018 TRAC-1: *****Aggression***** Identification. | ||
| object | 40 | |
| I17-1033 Thus we propose to train Faster R-CNN network for ***** object ***** recognition and LSTM for text generation and combine them at run time. | ||
| W17-2801 Grounding is performed using knowledge from the grammar itself, from the linguistic context, from the agents perception, and from an ontology of long-term knowledge about ***** object ***** categories and properties and actions the agent can perform. | ||
| 2020.ai4hi-1.1 Manual verification of the extracted annotations yields an accuracy rate of 97.5%, compared to 70.7% for relations extracted from ***** object ***** detection and 31.5% for automatically generated captions. | ||
| 2020.findings-emnlp.253 We present RationaleVT Transformer, an integrated model that learns to generate free-text rationales by combining pretrained language models with ***** object ***** recognition, grounded visual semantic frames, and visual commonsense graphs | ||
| 2020.lrec-1.710 People choose particular names for ***** object *****s, such as dog or puppy for a given dog. | ||
| linguistic annotation | 40 | |
| R17-1073 It consists of three main steps: (1) the source-language text is linguistically annotated, (2) it is translated to the target language with the Moses system, and (3) translation is post-processed with the help of the transferred ***** linguistic annotation ***** from the source text. | ||
| W17-5005 The result is a database that links learning content, ***** linguistic annotation ***** and open-source resources, on top of which a diverse range of tools for language-learning applications can be built. | ||
| L12-1142 Extracting and normalizing the temporal information in texts through ***** linguistic annotation ***** is an essential step towards attaining this objective. | ||
| D19-1293 Annotation quality control is a critical aspect for building reliable corpora through ***** linguistic annotation *****. | ||
| 2020.acl-main.684 This is an interesting example of pragmatic language acquisition without any ***** linguistic annotation *****. | ||
| benchmark | 40 | |
| 2021.emnlp-main.365 We hope that this study could ***** benchmark ***** Chinese dialogue summarization and benefit further studies. | ||
| D19-1678 We propose a method to model them jointly, achieving considerable improvement across ***** benchmark ***** tasks over baseline time-series model. | ||
| N18-1049 Our method outperforms the state-of-the-art unsupervised models on most ***** benchmark ***** tasks, highlighting the robustness of the produced general-purpose sentence embeddings. | ||
| 2020.wmt-1.105 After performing empirical analyses of the finetuning task, we ***** benchmark ***** our approach by comparing the results with past years' state-of-the-art records. | ||
| 2020.nlp4convai-1.13 We also ***** benchmark ***** a few state of the art dialogue state tracking models on the corrected dataset to facilitate comparison for future work. | ||
| linking | 40 | |
| L08-1605 Vast and consistent electronic lexical resources do exist which can be further enhanced and enriched through their ***** linking ***** and integration. | ||
| W19-6144 The use of a ***** linking ***** element between compound members is a common phenomenon in Germanic languages. | ||
| W19-0424 Specifically, we show that in this way we can create data that can be used to learn and evaluate lexical and compositional grounded semantics, and we show that the “linked to same image” relation tracks a semantic implication relation that is recognisable to annotators even in the absence of the ***** linking ***** image as evidence. | ||
| 2018.gwc-1.6 We present an overview of resources relevant to Polish and a plan for their ***** linking ***** to plWordNet. | ||
| 2020.emnlp-main.564 We find when schema ***** linking ***** is done well, SLSQL demonstrates good performance on Spider despite its structural simplicity. | ||
| matching | 40 | |
| 2021.ranlp-srw.7 They store translations allowing to save time by presenting translations on the database through ***** matching ***** of several types such as fuzzy matches, which are calculated by algorithms like the edit distance. | ||
| D19-1267 We then additionally design deep fusion to propagate the attention information at each ***** matching ***** layer. | ||
| 2021.emnlp-main.715 We obtain strong improvements, ***** matching ***** the current state of the art. | ||
| 2006.jeptalnrecital-long.29 Natural Language Processing (NLP) for IR aims to transform the potentially ambiguous words of queries and documents into unambiguous internal representations on which ***** matching ***** and retrieval can take place. | ||
| R19-1106 Following this classification task, we use a string ***** matching ***** algorithm with a gazetteer to identify the exact index of a toponym within the sentence. | ||
| labels | 40 | |
| 2021.conll-1.44 Moreover, on some standard WiC benchmarks, MirrorWiC is even on-par with supervised models fine-tuned with in-task data and sense ***** labels *****. | ||
| 2021.emnlp-main.481 Multi-label document classification (MLDC) problems can be challenging, especially for long documents with a large label set and a long-tail distribution over ***** labels *****. | ||
| 2021.acl-long.77 to propose an event-based modality detection task where modal expressions can be words of any syntactic class and sense ***** labels ***** are drawn from a comprehensive taxonomy which harmonizes the modal concepts contributed by the different studies. | ||
| 2020.emnlp-main.638 Real world scenarios present a challenge for text classification, since ***** labels ***** are usually expensive and the data is often characterized by class imbalance. | ||
| 2020.acl-main.264 In this paper, we investigate the gender bias amplification issue from the distribution perspective and demonstrate that the bias is amplified in the view of predicted probability distribution over ***** labels *****. | ||
| integer linear programming | 40 | |
| 2020.coling-main.418 We then introduce a total optimization method using ***** integer linear programming ***** to prevent span overlapping and obtain non-monotonic alignments. | ||
| E17-1108 We employ a structured perceptron, together with ***** integer linear programming ***** constraints for document-level inference during training and prediction to exploit relational properties of temporality, together with global learning of the relations at the document level. | ||
| D19-1398 However, it is nontrivial to make use of ***** integer linear programming ***** as a blackbox solver for RE. | ||
| Q15-1003 The algorithm tractably captures a majority of the structural constraints examined by prior work in this area, which has resorted to either approximate methods or off-the-shelf ***** integer linear programming ***** solvers. | ||
| P18-1212 Specifically, we formulate the joint problem as an ***** integer linear programming ***** (ILP) problem, enforcing constraints that are inherent in the nature of time and causality. | ||
| discourse connectives | 40 | |
| 2020.lrec-1.138 We present DiMLex-Bangla, a newly developed lexicon of ***** discourse connectives ***** in Bangla. | ||
| 2020.lrec-1.142 CzeDLex is an electronic lexicon of Czech ***** discourse connectives ***** with its data coming from a large treebank annotated with discourse relations. | ||
| W17-0809 enrichments on 10% of the corpus are described (namely, senses for explicit ***** discourse connectives *****, and new annotations for three discourse relation types - implicit relations, entity relations and alternative lexicalizations). | ||
| 2012.amta-caas14.1 The automatic identification of the Arabic translations of seven English ***** discourse connectives ***** shows how these connectives are differently translated depending on their actual senses. | ||
| 2012.amta-papers.20 The improvement of translation quality is demonstrated using a new semi-automated metric for ***** discourse connectives *****, on the English/French WMT10 data, while BLEU scores remain comparable to non-discourse-aware systems, due to the low frequency of ***** discourse connectives *****. | ||
| emotion classification | 40 | |
| W18-6230 This paper describes an approach to solve implicit ***** emotion classification ***** with the use of pre-trained word embedding models to train multiple neural networks. | ||
| 2021.latechclfl-1.8 We have evaluated multiple traditional machine learning approaches as well as transformer-based models pretrained on historical and contemporary language for a single-label text sequence ***** emotion classification ***** for the different emotion categories. | ||
| S18-1016 This paper presents an ***** emotion classification ***** system for English tweets, submitted for the SemEval shared task on Affect in Tweets, subtask 5: Detecting Emotions. | ||
| 2020.lrec-1.200 In the case of using a deep learning (machine learning) framework for ***** emotion classification *****, one significant difficulty faced is the requirement of building a large, emotion corpus in which each sentence is assigned emotion labels. | ||
| S18-1001 The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. ***** emotion classification *****. | ||
| responses | 40 | |
| 2021.nlp4convai-1.23 Humans make appropriate ***** responses ***** not only based on previous dialogue utterances but also on implicit background knowledge such as common sense. | ||
| 2020.emnlp-main.190 We evaluate models on their generalizability to out-of-domain examples, ***** responses ***** to missing or incorrect data, and ability to handle question variations. | ||
| P18-1103 Human generates ***** responses ***** relying on semantic and functional dependencies, including coreference relation, among dialogue elements and their context. | ||
| P18-1139 Experiments show that our model outperforms state-of-the-art baselines, and it has the ability to generate ***** responses ***** with both controlled sentence function and informative content. | ||
| 2020.ecomnlp-1.5 Managerial ***** responses ***** to such reviews provide businesses with the opportunity to influence the public discourse and to attain improved ratings over time. | ||
| text normalization | 40 | |
| R19-1086 One way to overcome this is to first perform ***** text normalization *****. | ||
| L14-1379 This corpus is intended for development and testing of micro***** text normalization ***** systems. | ||
| 2020.lrec-1.508 This publicly available resource is intended to support research on spelling correction and ***** text normalization ***** for Arabic dialects. | ||
| 2020.emnlp-main.383 Additional ***** text normalization ***** experiments and case studies show that TNT is a new potential approach to misspelling correction. | ||
| L14-1574 In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluating *****text normalization***** approaches. | ||
| discourse analysis | 40 | |
| P17-1144 Drafts are manually aligned at the sentence level, and the writer's purpose for each revision is annotated with categories analogous to those used in argument mining and ***** discourse analysis *****. | ||
| L16-1276 Discourse parsing is a challenging task in NLP and plays a crucial role in ***** discourse analysis *****. | ||
| 2020.peoples-1.10 Based on the results of sentiment analysis and ***** discourse analysis *****, we described the emotions expressed in the forum and the linguistic means the forum participants used to verbalise their attitudes and emotions while discussing the Covid-19 pandemic. | ||
| 2020.emnlp-main.430 The acquired subevent knowledge has been shown useful for ***** discourse analysis ***** and identifying a range of event-event relations. | ||
| 2021.isa-1.5 The paper presents a discourse-based approach to the analysis of argumentative texts departing from the assumption that the coherence of a text should capture argumentation structure as well and, therefore, existing *****discourse analysis***** tools can be successfully applied for argument segmentation and annotation tasks. | ||
| fact checking | 40 | |
| C18-1283 The recently increased focus on misinformation has stimulated research in ***** fact checking *****, the task of assessing the truthfulness of a claim. | ||
| 2020.acl-main.549 We evaluate our system on FEVER, a benchmark dataset for ***** fact checking *****, and find that rich structural information is helpful and both our graph-based mechanisms improve the accuracy. | ||
| 2020.findings-emnlp.43 As the first step of automatic ***** fact checking *****, claim check-worthiness detection is a critical component of ***** fact checking ***** systems. | ||
| 2020.acl-main.403 Beyond research in linguistics and political communication, accurately and automatically detecting parody is important to improving ***** fact checking ***** for journalists and analytics such as sentiment analysis through filtering out parodical utterances. | ||
| 2020.emnlp-main.580 Fact checking at scale is difficult: while the number of active *****fact checking***** websites is growing, it remains too small for the needs of the contemporary media ecosystem. | ||
| text analysis | 40 | |
| W18-0530 We present a novel rule-based system for automatic generation of factual questions from sentences, using semantic role labeling (SRL) as the main form of ***** text analysis *****. | ||
| 2004.amta-papers.20 Named-entities in free text represent a challenge to ***** text analysis ***** in Machine Translation and Cross Language Information Retrieval. | ||
| L10-1633 The development of ***** text analysis ***** systems has been greatly facilitated by modern NLP frameworks, such as the General Architecture for Text Engineering (GATE). | ||
| W16-4001 A promising approach lies in the identification of re-occurring types of analytical subtasks, beyond linguistic standard tasks, which can form building blocks for ***** text analysis ***** across disciplines, and for which corpus-based characterizations (viz. | ||
| 2020.lrec-1.883 In this paper, we introduce ProfilingUD, a new *****text analysis***** tool inspired by the principles of linguistic profiling that can support language variation research from different perspectives. | ||
| supervised machine learning | 40 | |
| C18-1144 Annotated corpora enable ***** supervised machine learning ***** and data analysis. | ||
| W19-5028 However, recent work has demonstrated the potential of ***** supervised machine learning ***** to extract document-level codes directly from the raw text of clinical notes. | ||
| 2021.acl-long.564 Active learning promises to alleviate the massive data needs of ***** supervised machine learning *****: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition. | ||
| W19-5938 Despite recent attempts in the field of explainable AI to go beyond black box prediction models, typically already the training data for ***** supervised machine learning ***** is collected in a manner that treats the annotator as a “black box”, the internal workings of which remains unobserved. | ||
| L08-1223 This paper describes the origins of the corpus, its creation, ways to access it, design criteria, and an analysis with common ***** supervised machine learning ***** methods. | ||
| common | 40 | |
| D19-6008 This paper explores the use of Bidirectional Encoder Representations from Transformers(BERT) along with external relational knowledge from ConceptNet to tackle the problem of ***** common *****sense inference. | ||
| 2021.emnlp-main.777 At the script level, most existing studies only consider a single event sequence corresponding to one ***** common ***** protagonist. | ||
| 2021.nlp4convai-1.23 Humans make appropriate responses not only based on previous dialogue utterances but also on implicit background knowledge such as ***** common ***** sense. | ||
| C16-1177 According to the hearer's ***** common ***** sense knowledge and his comprehension of the preceding text, a discourse entity could be old, mediated or new. | ||
| 2020.clinicalnlp-1.15 We pre-trained several models of ***** common ***** architectures on this dataset and empirically showed that such pre-training leads to improved performance and convergence speed when fine-tuning on downstream medical tasks. | ||
| joint | 40 | |
| D18-1212 Through the ***** joint ***** exploitation of these constraints in an adversarial manner, the underlying cross-language semantics relevant to retrieval tasks are better preserved in the embedding space. | ||
| 2010.jeptalnrecital-long.29 Additionally, training on orthographically normalized (reduced) text then ***** joint *****ly enriching and detokenizing the output outperforms training on enriched text. | ||
| P19-1131 To tackle the ***** joint ***** type inference task, we propose a novel graph convolutional network (GCN) running on an entity-relation bipartite graph. | ||
| 2020.findings-emnlp.98 In the proposed study, we make the first attempt to train the video captioning model on labeled data and unlabeled data ***** joint *****ly, in a semi-supervised learning manner. | ||
| 2020.semeval-1.159 To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to ***** joint *****ly learn latent features for positive, negative and neutral category predictions. | ||
| inflection generation | 40 | |
| 2021.eacl-main.163 morphological ***** inflection generation ***** and historical text normalization, there are few works that outperform recurrent models using the transformer. | ||
| C18-1008 We present a neural transition-based model that uses a simple set of edit actions (copy, delete, insert) for morphological transduction tasks such as ***** inflection generation *****, lemmatization, and reinflection. | ||
| 2020.sigmorphon-1.9 In particular, we experiment with substituting the ***** inflection generation ***** component with an LSTM sequence-to-sequence model and an LSTM pointer-generator network. | ||
| 2021.insights-1.13 The method is directly applicable to morphological ***** inflection generation ***** if unlabeled word forms are available. | ||
| 2020.sigmorphon-1.25 We provide a recipe and evaluation set for the community to use as an extrinsic measure of the performance of ***** inflection generation ***** approaches. | ||
| large-scale | 40 | |
| P18-1191 We present a new *****large-scale***** corpus of Question-Answer driven Semantic Role Labeling (QA-SRL) annotations, and the first high-quality QA-SRL parser. | ||
| 2021.naacl-main.396 Recent years have seen a flourishing of neural keyphrase generation (KPG) works, including the release of several *****large-scale***** datasets and a host of new models to tackle them. | ||
| L12-1504 This paper describes the KnowledgeStore, a *****large-scale***** infrastructure for the combined storage and interlinking of multimedia resources and ontological knowledge. | ||
| 2021.alta-1.3 Visual question answering (VQA) models, in particular modular ones, are commonly trained on *****large-scale***** datasets to achieve state-of-the-art performance. | ||
| E17-2105 We propose a variant of Convolutional Neural Network (CNN) models, the Attention CNN (ACNN); for *****large-scale***** categorization of millions of Japanese items into thirty-five product categories. | ||
| News | 40 | |
| 2020.semeval-1.236 This paper describes our system (Solomon) details and results of participation in the SemEval 2020 Task 11 Detection of Propaganda Techniques in *****News***** Articles. | ||
| 2020.semeval-1.196 This paper presents our systems for SemEval 2020 Shared Task 11: Detection of Propaganda Techniques in *****News***** Articles. | ||
| 2020.wmt-1.36 This paper describes the DeepMind submission to the Chinese-English constrained data track of the WMT2020 Shared Task on *****News***** Translation. | ||
| 2020.semeval-1.192 This paper describes our participation in the SemEval-2020 task Detection of Propaganda Techniques in *****News***** Articles. | ||
| 2020.semeval-1.194 This paper describes our contribution to SemEval-2020 Task 11: Detection Of Propaganda Techniques In *****News***** Articles. | ||
| sequence-to-sequence | 40 | |
| D18-1336 Autoregressive decoding is the only part of *****sequence-to-sequence***** models that prevents them from massive parallelization at inference time. | ||
| 2021.naacl-main.210 Current *****sequence-to-sequence***** models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences. | ||
| W19-4224 We use *****sequence-to-sequence***** networks trained on sequential phonetic encoding tasks to construct compositional phonological representations of words. | ||
| D19-6309 We describe our exploratory system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using *****sequence-to-sequence***** models on serialized trees. | ||
| D17-1040 The standard content-based attention mechanism typically used in *****sequence-to-sequence***** models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step. | ||
| disentanglement | 39 | |
| 2020.emnlp-main.533 With this new formulation, we propose a novel multi-task learning framework that supports efficient encoding through large pretrained models with only two utterances at once to perform dynamic topic ***** disentanglement ***** and response selection. | ||
| 2020.aacl-main.72 However, how to transfer the desired properties of ***** disentanglement ***** to word representations is unclear. | ||
| D18-1420 Our framework explores the pseudo-parallel sentences by modeling their content similarity and outcome differences to enable a better ***** disentanglement ***** of the latent factors, which allows generating an output to better satisfy the desired outcome and keep the content. | ||
| 2021.eacl-main.32 Polarized-VAE outperforms the VAE baseline and is competitive with state-of-the-art approaches, while being more a general framework that is applicable to other attribute ***** disentanglement ***** tasks. | ||
| N18-1164 In this paper, we propose to leverage representation learning for conversation ***** disentanglement *****. | ||
| factorization | 39 | |
| 2020.findings-emnlp.204 We use a number of different ***** factorization ***** techniques, and evaluate the various models using a large set of evaluation metrics, including previously published coherence measures, as well as a number of novel measures that we suggest better correspond to real-world applications of topic models. | ||
| N19-1267 The first baseline assumes a conditional ***** factorization ***** of the utterance into unimodal factors. | ||
| 2021.acl-long.46 In particular, we show the tractability and empirical effectiveness of structural knowledge distillation between sequence labeling and dependency parsing models under four different scenarios: 1) the teacher and student share the same ***** factorization ***** form of the output structure scoring function; 2) the student ***** factorization ***** produces more fine-grained substructures than the teacher ***** factorization *****; 3) the teacher ***** factorization ***** produces more fine-grained substructures than the student ***** factorization *****; 4) the ***** factorization ***** forms from the teacher and the student are incompatible. | ||
| P17-1037 Inspired by previous work on item recommendation, we formalize the task of modeling inter-topic preferences as matrix ***** factorization *****: representing users' preference as a user-topic matrix and mapping both users and topics onto a latent feature space that abstracts the preferences. | ||
| 2021.eacl-main.32 Most previous methods have focused on either supervised approaches which use attribute labels or unsupervised approaches that manipulate the *****factorization***** in the latent space of models such as the variational autoencoder (VAE) by training with task-specific losses. | ||
| provided | 39 | |
| 2020.coling-main.182 Commonsense generation aims at generating plausible everyday scenario description based on a set of ***** provided ***** concepts. | ||
| W18-6404 We do data filtering not only for ***** provided ***** sentences but also for the back translated sentences. | ||
| W17-2309 We focus on factoid and list question, using an extractive QA model, that is, we restrict our system to output substrings of the ***** provided ***** text snippets. | ||
| W17-4107 We show that the ***** provided ***** information offers a significant advantage for both word segmentation and the learning of allomorphy. | ||
| 2020.wnut-1.65 Our best performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test-set. | ||
| decomposed | 39 | |
| D19-1468 User-generated reviews can be ***** decomposed ***** into fine-grained segments (e.g., sentences, clauses), each evaluating a different aspect of the principal entity (e.g., price, quality, appearance). | ||
| 2020.acl-main.440 Many high-level procedural tasks can be ***** decomposed ***** into sequences of instructions that vary in their order and choice of tools. | ||
| P19-1055 Many NLP learning tasks can be ***** decomposed ***** into several distinct sub-tasks, each associated with a partial label. | ||
| P17-2005 We prove that every extractive summarizer can be ***** decomposed ***** into an objective function and an optimization technique. | ||
| 1963.earlymt-1.7 However, it is shown in this paper that letter strings can be ***** decomposed ***** into 3 sets of roughly the same size in the following manner: in the first, strings are never broken in English words; in the second, the strings are always broken in English words; and in the third, both situations occur. | ||
| RBMT | 39 | |
| 2021.nodalida-main.37 To make the best use of the monolingual data in a neural machine translation (NMT) system, we use the backtranslation approach to create synthetic parallel data from it using both NMT and ***** RBMT ***** systems. | ||
| W19-8715 However, improved fluency makes it more difficult for post editors to identify and correct adequacy errors, because unlike ***** RBMT ***** and SMT, in NMT adequacy errors are frequently not anticipated by fluency errors. | ||
| W19-5336 We describe in this article the use of the shared task data as a kind of a test-driven development workflow in ***** RBMT ***** development and show that it suits perfectly to a modern software engineering continuous integration workflow of ***** RBMT ***** and yields big increases to BLEU scores with minimal effort. | ||
| 2011.freeopmt-1.7 This paper proposes to enrich *****RBMT***** dictionaries with Named Entities (NEs) automatically acquired from Wikipedia. | ||
| L10-1522 Recent developments on hybrid systems that combine rule-based machine translation (RBMT) systems with statistical machine translation (SMT) generally neglect the fact that *****RBMT***** systems tend to produce more syntactically well-formed translations than data-driven systems. | ||
| EHR | 39 | |
| K19-1095 The proposed approach achieved a superior predictive performance when benchmarked against the structured ***** EHR ***** data based state-of-the-art model, with an improvement of 11.50% in AUPRC and 1.16% in AUROC. | ||
| W19-1906 In this paper, we present our clinical note processing pipeline, which extends beyond basic medical natural language processing (NLP) with concept recognition and relation detection to also include components specific to ***** EHR ***** data, such as structured data associated with the encounter, sentence-level clinical aspects, and structures of the clinical notes. | ||
| 2021.naacl-main.73 Large Transformers pretrained over clinical notes from Electronic Health Records (***** EHR *****) have afforded substantial gains in performance on predictive clinical tasks. | ||
| 2020.bionlp-1.8 This work presents an open-source, reproducible experimental methodology for assessing the validity of ***** EHR ***** discharge summaries | ||
| N19-5006 We will review NLP techniques in solving clinical problems and facilitating clinical research, the state-of-the-art clinical NLP tools, and share collaboration experience with clinicians, as well as publicly available *****EHR***** data and medical resources, and finally conclude the tutorial with vast opportunities and challenges of clinical NLP. | ||
| authoring | 39 | |
| 2005.mtsummit-papers.9 This paper describes one approach to document ***** authoring ***** and natural language generation being pursued by the Summer Institute of Linguistics in cooperation with the University of Maryland, Baltimore County. | ||
| L08-1271 This paper proposes a workbench with three ***** authoring ***** tools for collaborative multilingual ontological knowledge construction and maintenance, in order to add value and support communities in the field of food and agriculture. | ||
| L16-1213 In this paper, we investigate some language acquisition facets of an auto-adaptative system that can automatically acquire most of the relevant lexical knowledge and ***** authoring ***** practices for an application in a given domain. | ||
| 2020.lrec-1.28 We develop an approach to ***** authoring ***** these schemas using corpus analysis and crowdsourcing, to maximize realism and minimize the amount of expert ***** authoring ***** needed. | ||
| 1998.amta-papers.15 EasyEnglish is an *****authoring***** tool which is part of IBM's internal SGML editing environment, Information Development Workbench. | ||
| sampled | 39 | |
| P18-3002 We show that language models trained on data ***** sampled ***** using our proposed approach outperform models trained over randomly ***** sampled ***** subsets of both the Billion Word (Chelba et al., 2014) and Wikitext-103 (Merity et al., 2016) benchmark corpora. | ||
| K17-1015 We do so by designing an artificial language framework, training a predictive and a count-based model on data ***** sampled ***** from this grammar, and evaluating the resulting word vectors in paradigmatic and syntagmatic tasks defined with respect to the grammar. | ||
| 2020.ngt-1.28 We find that strong systems start with a large amount of generic training data, and then fine-tune with in-domain data, ***** sampled ***** according to our provided learner response frequencies. | ||
| 2021.eacl-srw.23 To ensure that the text is generated conditioned upon the ***** sampled ***** latent code, reconstruction loss is introduced in our objective function. | ||
| 2020.findings-emnlp.376 We provide qualitative samples ***** sampled ***** unconditionally from the generative joint distribution. | ||
| pedagogical | 39 | |
| 2020.bea-1.5 To remedy this, we propose a novel asynchronous method for collecting tutoring dialogue via crowdworkers that is both amenable to the needs of deep learning algorithms and reflective of ***** pedagogical ***** concerns. | ||
| 2019.lilt-18.6 Alongside subject knowledge, teachers need ***** pedagogical ***** knowledge – how to teach grammar effectively and how to integrate this teaching into other kinds of language learning. | ||
| W17-5012 Given the lack of available corpora for our exploration, we create the first annotated corpus of ***** pedagogical ***** roles and use it to test baseline techniques for automatic prediction of such roles. | ||
| L12-1599 The paper defines the notion of *****pedagogical***** stance, viewed as the type of position taken, the role assumed, the image projected and the types of social behaviours performed by a teacher in her teaching interaction with a pupil. | ||
| W19-4433 This paper provides an analytical assessment of student short answer responses with a view to potential benefits in *****pedagogical***** contexts. | ||
| corpora annotated | 39 | |
| L12-1117 Parallel aligned treebanks (PAT) are linguistic ***** corpora annotated ***** with morphological and syntactic structures that are aligned at sentence as well as sub-sentence levels. | ||
| W19-4001 Due to the limited availability of existing ***** corpora annotated ***** for hedging, linguists and other language scientists have been constrained as to the extent they can study this phenomenon. | ||
| L14-1142 Corpus-dictionary linked resources include concordances, dictionaries with word usage examples, and ***** corpora annotated ***** with lemmas or word-senses. | ||
| S18-1105 We report on the results obtained by our system both in a constrained setting and unconstrained setting, where we explored the impact of using additional data in the training phase, such as ***** corpora annotated ***** for the presence of irony or sarcasm from the state of the art. | ||
| 2020.coling-tutorials.1 This is an introductory tutorial to UCCA (Universal Conceptual Cognitive Annotation), a cross-linguistically applicable framework for semantic representation, with ***** corpora annotated ***** in English, German and French, and ongoing annotation in Russian and Hebrew | ||
| spurious | 39 | |
| 2020.findings-emnlp.308 The predictions of text classifiers are often driven by ***** spurious ***** correlations – e.g., the term “Spielberg” correlates with positively reviewed movies, even though the term itself does not semantically convey a positive sentiment. | ||
| 2021.clpsych-1.20 With the integration of human-in-the-loop machine learning in the clinical implementation process, incorporating safeguards such as these into the models will offer patients increased protection from ***** spurious ***** predictions. | ||
| 2020.emnlp-main.665 Natural Language Inference (NLI) datasets contain annotation artefacts resulting in ***** spurious ***** correlations between the natural language utterances and their respective entailment classes. | ||
| D18-1314 Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite ***** spurious ***** ambiguity, or require warm starting with an MLE model. | ||
| 2020.emnlp-main.265 In the task of Visual Question Answering (VQA), most state-of-the-art models tend to learn *****spurious***** correlations in the training set and achieve poor performance in out-of-distribution test data. | ||
| link | 39 | |
| 2021.naacl-main.221 Experimental results on four benchmark datasets demonstrate the robustness and effectiveness of Edge in ***** link ***** prediction and node classification. | ||
| D19-1268 Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths which contain useful information for ***** link ***** prediction. | ||
| N19-1337 In this paper, we propose adversarial modifications for ***** link ***** prediction models: identifying the fact to add into or remove from the knowledge graph that changes the prediction for a target fact after the model is retrained. | ||
| 2021.emnlp-main.648 We study data poisoning attacks against KGE models for ***** link ***** prediction. | ||
| 2021.acl-long.147 Knowledge Graph Embedding (KGE) models for the task of ***** link ***** prediction in knowledge graphs | ||
| ensemble | 39 | |
| S18-1045 We evaluate the effectiveness of our ***** ensemble ***** feature sets on the SemEval-2018 Task 1 datasets and achieve a Pearson correlation of 72% on the task of tweet emotion intensity prediction. | ||
| S19-2205 This is followed by a description of two contrastive solutions based on ***** ensemble ***** methods. | ||
| D18-1191 Experimental results demonstrate that our ***** ensemble ***** model achieves the state-of-the-art results, 87.4 F1 and 87.0 F1 on the CoNLL-2005 and 2012 datasets, respectively. | ||
| 2021.vardial-1.10 The models included in our ***** ensemble ***** range from simple regression techniques, such as Support Vector Regression, to deep neural models, such as a hybrid neural network and a neural transformer. | ||
| S18-1097 This paper describes an *****ensemble***** approach to the SemEval-2018 Task 3. | ||
| metric | 39 | |
| 2021.trustnlp-1.3 The mechanism satisfies an extension of differential privacy to ***** metric ***** spaces. | ||
| 2020.acl-main.448 Together, these findings suggest improvements to the protocols for ***** metric ***** evaluation and system performance evaluation in machine translation. | ||
| 2020.emnlp-main.625 Considering the above aspects, in our work, we automate the optimization of multiple ***** metric ***** rewards simultaneously via a multi-armed bandit approach (DORB), where at each round, the bandit chooses which ***** metric ***** reward to optimize next, based on expected arm gains. | ||
| D18-1214 In this paper, we cast the correspondence problem directly as an optimal transport (OT) problem, building on the idea that word embeddings arise from ***** metric ***** recovery algorithms. | ||
| W19-1311 Through the human ratings that we obtained, we also argue for preference ***** metric ***** to better evaluate the usefulness of an emoji prediction system | ||
| accuracy | 39 | |
| N19-2014 In our company, we are exploring and comparing several toolsets in an effort to determine their strengths and weaknesses in meeting our goals for dialog system development: ***** accuracy *****, time to market, ease of replicating and extending applications, and efficiency and ease of use by developers. | ||
| L06-1272 Sentence alignment is a task that requires not only ***** accuracy *****, as possible errors can affect further processing, but also requires small computation resources and to be language pair independent. | ||
| 2021.naacl-main.165 However, in this paper, we find that it is possible to hack the model in a data-free way by modifying one single word embedding vector, with almost no ***** accuracy ***** sacrificed on clean samples. | ||
| D19-1681 We then empirically study which aspects of a neural architecture are important for the RUN success, and empirically show that entity abstraction, attention over words and worlds, and a constantly updating world-state, significantly contribute to task ***** accuracy *****. | ||
| 2021.naacl-main.458 We attribute this ***** accuracy ***** gap to the lack of dependency modeling among decoder inputs | ||
| imbalanced dataset | 39 | |
| 2021.smm4h-1.24 Furthermore, we address the challenge of the ***** imbalanced dataset ***** and propose techniques such as undersampling, oversampling, and data augmentation to overcome the imbalanced nature of a given health-related dataset. | ||
| 2020.lrec-1.615 In this paper, we propose a neural-based model to address the first task of the DEFT 2013 shared task, with the main challenge of a highly ***** imbalanced dataset *****, using state-of-the-art embedding approaches and deep architectures. | ||
| 2020.smm4h-1.27 This task is specifically challenging due to its highly ***** imbalanced dataset *****, with only 0.2% of the tweets mentioning a drug. | ||
| 2020.findings-emnlp.202 We propose a novel reinforcement learning method with a reconstructor to improve the clinical correctness of generated reports to train the data-to-text module with a highly ***** imbalanced dataset *****. | ||
| C18-1162 Especially, PL-DNB performs well on the ***** imbalanced dataset ***** | ||
| hyperpartisan news detection | 39 | |
| S19-2145 It is an open question how successfully ***** hyperpartisan news detection ***** can be automated, and the goal of this SemEval task was to shed light on the state of the art. | ||
| 2021.woah-1.13 While earlier work on ***** hyperpartisan news detection ***** uses binary classification (i.e., hyperpartisan or not) and English data, we argue for a more fine-grained classification, covering the full political spectrum (i.e., far-left, left, centre, right, far-right) and for extending research to German data. | ||
| S19-2165 2018) for the ***** hyperpartisan news detection ***** task. | ||
| S19-2168 While fake news detection received quite a bit of attention in recent years, ***** hyperpartisan news detection ***** is still an underresearched topic. | ||
| S19-2171 This paper describes our submission to task 4 in SemEval 2019, i.e., ***** hyperpartisan news detection *****. | ||
| modern standard | 39 | |
| 2020.acl-srw.29 We also present the first deep learning-based text classifier widely evaluated on ***** modern standard ***** Arabic, colloquial Arabic, and Classical Arabic. | ||
| I17-1026 We focus on one-layer CNNs (to the exclusion of more complex models) due to their comparative simplicity and strong empirical performance, which makes it a ***** modern standard ***** baseline method akin to Support Vector Machine (SVMs) and logistic regression. | ||
| L16-1080 The data sets used in this experiment are rendered from a new developed corpus-based Arabic wordlist consisting of 5,189 lexical items which represent a variety of ***** modern standard ***** Arabic (MSA) genres and regions, the new wordlist being based on an overlapping frequency based on a comprehensive comparison of four large Arabic corpora with a total size of over 8 billion running words. | ||
| L16-1176 Lemmas and morphological analyses are transferred to a ***** modern standard ***** of encoding by first merging orthographic and morphological information of the lemmas and their entries and then performing a second substitution for the morphs within their morphological analyses. | ||
| 2012.amta-caas14.8 The proposed method was tested using the Baseline system that contains a pronunciation dictionary of 17,236 vocabularies (28,682 words and variants) from 7.57 hours pronunciation corpus of ***** modern standard ***** Arabic (MSA) broadcast news. | ||
| lexical complexity prediction | 39 | |
| 2021.semeval-1.12 In this paper, we present our systems submitted to SemEval-2021 Task 1 on ***** lexical complexity prediction *****. The aim of this shared task was to create systems able to predict the lexical complexity of word tokens and bigram multiword expressions within a given sentence context, a continuous value indicating the difficulty in understanding a respective utterance. | ||
| 2021.semeval-1.86 Visualizations of BERT attention maps offer insight into potential features that Transformers models may learn when fine-tuned for ***** lexical complexity prediction *****. | ||
| 2021.semeval-1.83 The results indicate that information from masked language models and character-level encoders can be combined to improve ***** lexical complexity prediction *****. | ||
| 2021.semeval-1.89 This paper presents our system for the single- and multi-word ***** lexical complexity prediction ***** tasks of SemEval Task 1: Lexical Complexity Prediction. | ||
| 2021.semeval-1.72 ***** lexical complexity prediction ***** (LCP) can not only be used as a part of Lexical Simplification systems, but also as a stand-alone application to help people better reading. | ||
| similar language translation | 39 | |
| 2020.wmt-1.1 In the ***** similar language translation ***** task, participants built machine translation systems for translating between closely related pairs of languages. | ||
| 2021.wmt-1.29 We have participated in the WMT21 shared task of ***** similar language translation ***** on a Tamil-Telugu pair with the team name: CNLP-NITS. | ||
| 2020.wmt-1.45 We have participated in WMT20 shared task of ***** similar language translation ***** on Hindi-Marathi pair. | ||
| 2020.wmt-1.48 This paper describes the participation of team F1toF6 (LTRC, IIIT-Hyderabad) for the WMT 2020 task, ***** similar language translation *****. | ||
| 2020.wmt-1.43 This paper illustrates our approach to the shared task on ***** similar language translation ***** in the fifth conference on machine translation (WMT-20). | ||
| hybrid machine translation | 39 | |
| L12-1129 The taraXÜ project paves the way for wide usage of ***** hybrid machine translation ***** outputs through various feedback loops in system development. | ||
| 2012.amta-government.5 While we have made strides from rule based, to statistical and ***** hybrid machine translation ***** engines, we cannot rely solely on machine translation to overcome the language barrier and accomplish the mission. | ||
| W16-4504 This is useful for instance in ***** hybrid machine translation ***** systems which are usually more dependent on high-quality translation dictionaries. | ||
| 2008.amta-govandcom.21 The recognized utterances are normalized into Modern Standard Arabic and the output of this Modern Standard Arabic interlingua is then translated by a ***** hybrid machine translation ***** system, combining statistical and rule-based features. | ||
| 2010.amta-papers.5 In this paper, we describe an extension to a *****hybrid machine translation***** system for handling dialect Arabic, using a decoding algorithm to normalize non-standard, spontaneous and dialectal Arabic into Modern Standard Arabic. | ||
| lexical substitution | 39 | |
| D19-5552 Herein we propose a method that combines these two approaches to contextualize word embeddings for ***** lexical substitution *****. | ||
| P19-1328 To address these issues, we propose an end-to-end BERT-based ***** lexical substitution ***** approach which can propose and validate substitute candidates without using any annotated data or manually curated resources. | ||
| W17-1403 Furthermore, we apply a recently-proposed, dependency-based ***** lexical substitution ***** model to our dataset. | ||
| 2021.emnlp-main.844 The ***** lexical substitution ***** task aims at generating a list of suitable replacements for a target word in context, ideally keeping the meaning of the modified text unchanged. | ||
| L10-1539 One of the new metrics addresses how effective systems are in ranking substitution candidates, a key ability for ***** lexical substitution ***** systems, and we report some results concerning the assessment of systems produced by this measure as compared to the relevant measure from SemEval-2007. | ||
| challenge | 39 | |
| 2006.bcs-1.1 Processing of Colloquial Arabic is a relatively new area of research, and a number of interesting ***** challenge *****s pertaining to spoken Arabic dialects arise. | ||
| L12-1249 This paper adresses the described ***** challenge ***** of phrase extraction from documents in different domains and languages and proposes an approach, which does not use comprehensive lexica and therefore can be easily transferred to new domains and languages. | ||
| W19-8664 This paper describes our submission to the TL;DR ***** challenge *****. | ||
| 2020.nlpcss-1.9 While this task has been closely associated with emotion prediction, we argue and show that identifying worry needs to be addressed as a separate task given the unique ***** challenge *****s associated with it. | ||
| W17-1606 Speakers' dialect and gender was controlled for by using videos uploaded as part of the “accent tag ***** challenge *****”, where speakers explicitly identify their language background. | ||
| semantic representation | 39 | |
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the ***** semantic representation *****s for attribute and title, and develop an attention mechanism to capture the interactive semantic relations in-between to enforce our framework to be attribute comprehensive. | ||
| P19-1418 Neural semantic parsers utilize the encoder-decoder framework to learn an end-to-end model for semantic parsing that transduces a natural language sentence to the formal ***** semantic representation *****. | ||
| 2020.lrec-1.94 A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract ***** semantic representation ***** for each input sentence and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (KB). | ||
| D18-1263 Beyond SDP, our linearization technique opens the door to integration of graph-based ***** semantic representation *****s as features in neural models for downstream applications. | ||
| L12-1094 We introduce a generic ***** semantic representation ***** of procedures for analysing instructions, using which natural language techniques are applied to automatically extract structured procedures from instructions. | ||
| part of speech | 39 | |
| 2020.sltu-1.22 We also considered the interaction of adjectives with other grammatical means, especially other ***** part of speech *****es, e.g. | ||
| C16-1126 This study shows that gaze and ***** part of speech ***** (PoS) correlations largely transfer across English and French. | ||
| 2010.jeptalnrecital-court.36 A named entity recognizer and a ***** part of speech ***** tagger are applied on each of these sentences to encode necessary information.We classify the sentences based on their subject, verb, object and preposition for determining the possible type of questions to be generated. | ||
| L10-1262 Arabic is a morphologically rich language, which presents a challenge for ***** part of speech ***** tagging. | ||
| L04-1162 More specifically, we consider only ***** part of speech ***** tagging, the voice and the mood of the verb as well as the head word of a noun phrase. | ||
| constituency parsing | 39 | |
| P17-2025 Recent work has proposed several generative neural models for ***** constituency parsing ***** that achieve state-of-the-art results. | ||
| D19-1539 Experiments demonstrate large performance gains on GLUE and new state of the art results on NER as well as ***** constituency parsing ***** benchmarks, consistent with BERT. | ||
| I17-2002 In this paper, we introduce supervised attention to ***** constituency parsing ***** that can be regarded as another translation task. | ||
| P18-1108 In this work, we propose a novel ***** constituency parsing ***** scheme. | ||
| 2020.acl-main.557 We present a *****constituency parsing***** algorithm that, like a supertagger, works by assigning labels to each word in a sentence. | ||
| free word order | 39 | |
| 2004.jeptalnrecital-long.24 Tree Adjoining Grammars (TAG) are known not to be powerful enough to deal with scrambling in ***** free word order ***** languages. | ||
| 2000.iwpt-1.38 This technique is based on a cascade of finite state machines, adding to them a characteristic very crucial in the parsing of words with ***** free word order *****: the simultaneous examination of part of speech and grammatical feature information, which are deemed equally important during the parsing procedure, in contrast with other methodologies. | ||
| N19-3017 Pregroup calculus has been used for the representation of ***** free word order ***** languages (Sanskrit and Hungarian), using a construction called precyclicity. | ||
| L10-1342 This finding is novel, for Czech, with its ***** free word order ***** and rich morphology, is typologically different than languages analyzed with (R)MRS to date. | ||
| L10-1154 Furthermore, a novel feature intending to reflect the relatively ***** free word order ***** scheme of the Latvian language is proposed and successfully applied on the n-best list rescoring step. | ||
| sketch engine | 39 | |
| L10-1007 The *****Sketch Engine***** representing itself a corpus tool which takes as input a corpus of any language and corresponding grammar patterns. | ||
| L16-1445 They were processed by state-of-the-art tools and made available for researchers in the corpus manager *****Sketch Engine*****. | ||
| L10-1044 This forms the corpus, which we then * 'clean' (to remove navigation bars, advertisements etc) * remove duplicates * tokenise and (if tools are available) lemmatise and part-of-speech tag * load into our corpus query tool, the *****Sketch Engine***** | ||
| 2020.wac-1.1 In this paper we discuss some of the current challenges in web corpus building that we faced in the recent years when expanding the corpora in *****Sketch Engine*****. | ||
| L16-1061 The paper describes automatic definition finding implemented within the leading corpus query and management tool, *****Sketch Engine*****. | ||
| arabic wordnet | 39 | |
| L10-1548 We have adapted and extended the automatic Multilingual, Interoperable Named Entity Lexicon approach to Arabic, using *****Arabic WordNet***** (AWN) and Arabic Wikipedia (AWK). | ||
| L08-1211 This presentation focuses on the semi-automatic extension of *****Arabic WordNet***** (AWN) using lexical and morphological rules and applying Bayesian inference. | ||
| 2016.gwc-1.47 The *****Arabic WordNet***** project has provided the Arabic Natural Language Processing (NLP) community with the first WordNet-compliant resource. | ||
| 2020.alvr-1.1 This paper investigates the extension of ImageNet to Arabic using *****Arabic WordNet*****. | ||
| 2018.gwc-1.16 When derivational relations deficiency exists in a wordnet, such as the *****Arabic WordNet*****, it makes it very difficult to exploit in the natural language processing community. | ||
| cross - lingual word embedding | 39 | |
| 2020.acl-main.143 Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised *****cross-lingual word embedding***** transformation, and unsupervised MT. | ||
| 2020.lrec-1.243 (2) Represent each node in the graph structure with a *****cross-lingual word embedding***** so that all sentences in multiple languages can be represented with one shared semantic space. | ||
| P19-1310 In this paper, we present the resource and showcase its utility in experiments with *****cross-lingual word embedding***** induction and multi-source part-of-speech projection. | ||
| N19-1161 Recent approaches to *****cross-lingual word embedding***** have generally been based on linear transformations between the sets of embedding vectors in the two languages. | ||
| 2020.acl-main.329 We present results on all these tasks using *****cross-lingual word embedding***** models and multilingual models. | ||
| object detection | 39 | |
| 2020.ai4hi-1.1 Manual verification of the extracted annotations yields an accuracy rate of 97.5%, compared to 70.7% for relations extracted from *****object detection***** and 31.5% for automatically generated captions. | ||
| N18-1198 We provide an in-depth analysis of end-to-end image captioning by exploring a variety of cues that can be derived from such *****object detections*****. | ||
| D19-1348 We present 1) a work in progress method to visually segment key regions of scientific articles using an *****object detection***** technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. | ||
| 2020.aacl-main.50 Many top-performing image captioning models rely solely on object features computed with an *****object detection***** model to generate image descriptions. | ||
| 2020.coling-main.171 Following the intuition that texts and images are complementary in advertising, we introduce a multimodal ensemble of a state of the art image-based classifier, a classifier based on an *****object detection***** architecture, and a fine-tuned language model applied to texts extracted from ads by OCR. | ||
| Natural language | 39 | |
| L12-1032 *****Natural language***** generation in the medical domain is heavily influenced by domain knowledge and genre-specific text characteristics. | ||
| P17-2013 *****Natural language***** processing has increasingly moved from modeling documents and words toward studying the people behind the language. | ||
| 2020.acl-main.192 *****Natural language***** processing covers a wide variety of tasks predicting syntax, semantics, and information content, and usually each type of output is generated with specially designed architectures. | ||
| 2020.findings-emnlp.253 *****Natural language***** rationales could provide intuitive, higher-level explanations that are easily understandable by humans, complementing the more broadly studied lower-level explanations based on gradients or attention weights. | ||
| 2020.blackboxnlp-1.18 *****Natural language***** numbers are an example of compositional structures, where larger numbers are composed of operations on smaller numbers. | ||
| Machine Translation (MT) | 39 | |
| 2021.mtsummit-research.13 *****Machine Translation (MT)***** systems often fail to preserve different stylistic and pragmatic properties of the source text (e.g. | ||
| 2021.acl-demo.9 We present MT-Telescope, a visualization platform designed to facilitate comparative analysis of the output quality of two *****Machine Translation (MT)***** systems. | ||
| W19-5427 With the extensive use of *****Machine Translation (MT)***** technology, there is progressively interest in directly translating between pairs of similar languages. | ||
| 2017.iwslt-1.8 We describe here our *****Machine Translation (MT)***** model and the results we obtained for the IWSLT 2017 Multilingual Shared Task. | ||
| 2010.amta-government.1 We describe a case study that presents a framework for examining whether *****Machine Translation (MT)***** output enables translation professionals to translate faster while at the same time producing better quality translations than without MT output. | ||
| Named Entity Recognition (NER) | 39 | |
| C18-1185 In this paper, we propose to use a sequence to sequence model for *****Named Entity Recognition (NER*****) and we explore the effectiveness of such model in a progressive NER setting, a Transfer Learning (TL) setting. | ||
| 2020.lrec-1.37 *****Named Entity Recognition (NER*****) is an essential component of many Natural Language Processing pipelines. | ||
| 2020.findings-emnlp.430 *****Named Entity Recognition (NER*****) is deeply explored and widely used in various tasks. | ||
| W18-2405 *****Named Entity Recognition (NER*****) is a major task in the field of Natural Language Processing (NLP), and also is a sub-task of Information Extraction. | ||
| 2021.acl-long.61 Neural methods have been shown to achieve high performance in *****Named Entity Recognition (NER*****), but rely on costly high-quality labeled data for training, which is not always available across languages. | ||
| external | 39 | |
| R19-1121 Recurrent Neural Network Language Models composed of LSTM units, especially those augmented with an *****external***** memory, have achieved state-of-the-art results in Language Modeling. | ||
| I17-2020 Word embeddings learned from text corpus can be improved by injecting knowledge from *****external***** resources, while at the same time also specializing them for similarity or relatedness. | ||
| D19-1463 How to incorporate *****external***** knowledge into a neural dialogue model is critically important for dialogue systems to behave like real humans. | ||
| 2020.insights-1.18 Previous work has shown how to effectively use *****external***** resources such as dictionaries to improve English-language word embeddings, either by manipulating the training process or by applying post-hoc adjustments to the embedding space. | ||
| 2021.tacl-1.6 Various machine learning tasks can benefit from access to *****external***** information of different modalities, such as text and images. | ||
| pragmatics | 38 | |
| S17-1028 As such, we examine the writings of schizophrenia patients analyzing their syntax, semantics and ***** pragmatics *****. | ||
| P18-5001 The purpose of this tutorial is to present a selection of useful information about semantics and ***** pragmatics *****, as understood in linguistics, in a way that's accessible to and useful for NLP practitioners with minimal (or even no) prior training in linguistics. | ||
| L10-1037 The paper presents a solution consisting of a data model and an annotation tool that tries to fill this gap between “annotation science” and the practice of transcribing spoken language in the area of discourse analysis and ***** pragmatics *****, where the lack of ready-to-use annotation solutions is especially remarkable. | ||
| 2021.conll-1.29 In this paper, we target pre-trained LMs' competence in ***** pragmatics *****, with a focus on ***** pragmatics ***** relating to discourse connectives. | ||
| L06-1036 It is also our goal to relate both prosody and ***** pragmatics ***** to emotion, style and attitude | ||
| constructs | 38 | |
| 2021.spnlp-1.2 In this work, we propose an alternative approach: a Semi-autoregressive Bottom-up Parser (SmBoP) that ***** constructs ***** at decoding step t the top-K sub-trees of height t. | ||
| 2021.acl-long.141 Given a mention in a sentence, our approach ***** constructs ***** an input for the BERT MLM so that it predicts context dependent hypernyms of the mention, which can be used as type labels. | ||
| D19-1381 The search process ***** constructs ***** events in a bottom-up manner while modelling the global properties for nested and overlapping structures simultaneously using neural networks. | ||
| L16-1244 The previous error correction method ***** constructs ***** a pseudo parallel corpus where incorrect partial parse trees are paired with correct ones, and extracts error correction rules from the parallel corpus. | ||
| L04-1236 While most current clustering-based summarization systems base their summaries only on the common information contained in a collection of highly-related sentences, our system ***** constructs ***** more informative summaries that incorporate both the redundant and unique contributions of the sentences in the cluster | ||
| additionally | 38 | |
| P19-3016 It ***** additionally ***** provides a flexible architecture in which modules can be arbitrarily combined or exchanged - allowing for easy switching between rules-based and neural network based implementations. | ||
| L10-1594 These methods allow language processing applications to take advantage of much larger language models than previously was possible using the same hardware and we ***** additionally ***** describe how they can be used in a distributed environment to store even larger models. | ||
| N19-1107 For this purpose, we construct a dataset called WIKI-TIME which ***** additionally ***** includes the valid period of a certain relation of two entities in the knowledge base. | ||
| W19-5435 We then use the representations directly to score and filter the noisy parallel sentences without ***** additionally ***** training a scoring function. | ||
| 2009.iwslt-evaluation.13 For all of the tasks, system performance is improved with some special methods as follows: 1) combining different results of Chinese word segmentation, 2) combining different results of word alignments, 3) adding reliable bilingual words with high probabilities to the training data, 4) handling named entities including person names, location names, organization names, temporal and numerical expressions ***** additionally *****, 5) combining and selecting translations from the outputs of multiple translation engines, 6) replacing Chinese character with Chinese Pinyin to train the translation model for Chinese-to-English ASR challenge task | ||
| backtranslation | 38 | |
| 2021.wat-1.27 We combined a variety of techniques: transliteration, filtering, ***** backtranslation *****, domain adaptation, knowledge-distillation and finally ensembling of NMT models. | ||
| W18-6424 We use an improved technique of ***** backtranslation *****, where we iterate the process of translating monolingual data in one direction and training an NMT model for the opposite direction using synthetic parallel data. | ||
| 2021.insights-1.13 Our core finding is that ***** backtranslation ***** can offer modest improvements in low-resource scenarios, but only if the unlabeled data is very clean and has been filtered by the same annotation standards as the labeled data. | ||
| 2021.calcs-1.6 We find that, although simple, our synthetic code-mixing method is competitive with (and in some cases is even superior to) several standard methods (***** backtranslation *****, method based on equivalence constraint theory) under a diverse set of conditions. | ||
| 2021.wmt-1.7 We use the latter for experiments with various ***** backtranslation ***** techniques | ||
| underspecified | 38 | |
| P19-1080 The semantic representations used, however, are often ***** underspecified *****, which places a higher burden on the generation model for sentence planning, and also limits the extent to which generated responses can be controlled in a live system. | ||
| 2021.unimplicit-1.8 In this report, we describe our transformers for text classification baseline (TTCB) submissions to the 2021 shared task on implicit and ***** underspecified ***** language. | ||
| D18-1233 It is further complicated due to the fact that, in practice, most questions are ***** underspecified *****, and a human assistant will regularly have to ask clarification questions such as “How long have you been working abroad?” | ||
| 1995.iwpt-1.21 We extend the items involved to include the relevant ***** underspecified ***** information using it in the completion steps to ensure the acceptability of the resulting structure. | ||
| 2020.findings-emnlp.311 We present UNQOVER, a general framework to probe and quantify biases through ***** underspecified ***** questions | ||
| prepositional | 38 | |
| 1991.mtsummit-papers.14 This paper describes some techniques used in the Chinese system to solve problems in word ordering, language equivalency, Chinese verb constituent and ***** prepositional ***** phrase attachment. | ||
| N18-1082 In addition, recent studies aiming at solving ***** prepositional ***** attachment and preposition selection problems depend heavily on external linguistic resources and use dataset-specific word representations. | ||
| N19-3017 The replicability of these methods is explained in the representation of adverbs and ***** prepositional ***** phrases in English. | ||
| W17-6305 We present a low-rank multi-linear model for the task of solving ***** prepositional ***** phrase attachment ambiguity (PP task). | ||
| L10-1456 Our paper presents the details of a pilot study in which we tagged portions of the American National Corpus (ANC) for idioms composed of verb-noun constructions, *****prepositional***** phrases, and subordinate clauses. | ||
| rewriting | 38 | |
| 2020.emnlp-main.537 For multi-turn dialogue ***** rewriting *****, the capacity of effectively modeling the linguistic knowledge in dialog context and getting rid of the noise is essential to improve its performance. | ||
| D19-3003 In this paper, we describe ALTER, an auxiliary text ***** rewriting ***** tool that facilitates the ***** rewriting ***** process for natural language generation tasks, such as paraphrasing, text simplification, fairness-aware text ***** rewriting *****, and text style transfer. | ||
| D19-1322 In this work we introduce the Generative Style Transformer (GST) - a new approach to ***** rewriting ***** sentences to a target style in the absence of parallel style corpora. | ||
| P19-2036 Our approach can control both the lexical and syntactic complexity and achieve an aggressive ***** rewriting *****. | ||
| 2021.naacl-main.44 QReCC provides annotations that allow us to train and evaluate individual subtasks of question ***** rewriting *****, passage retrieval and reading comprehension required for the end-to-end conversational question answering (QA) task | ||
| compositional semantics | 38 | |
| 2020.acl-srw.35 In this paper, we present a ***** compositional semantics ***** that maps various comparative constructions in English to semantic representations via Combinatory Categorial Grammar (CCG) parsers and combine it with an inference system based on automated theorem proving. | ||
| L14-1021 Especially, we focus on the distinction between references to representational content and structural components of images, and the utility of such a distinction within a ***** compositional semantics *****. | ||
| L08-1047 (2003), the paper argues that in contrast crucially involves discourse anaphora and, thus, resembles other discourse adverbials such as then, otherwise, and nevertheless. The ***** compositional semantics ***** proposed for other discourse connectives, however, does not straightforwardly generalize to in contrast, for which the notions of contrast pairs and contrast properties are essential. | ||
| W19-0403 The QuantML scheme consists of (1) an abstract syntax which defines 'annotation structures' as triples and other set-theoretic constructs; (2) a ***** compositional semantics ***** of annotation structures; (3) an XML representation of annotation structures. | ||
| 2021.cmcl-1.3 CCG has well-defined incremental parsing algorithms, surface ***** compositional semantics *****, and can explain long-range dependencies as well as complicated cases of coordination. | ||
| constraints | 38 | |
| 2017.jeptalnrecital-court.12 These ***** constraints ***** have the form of type ***** constraints ***** and specify which arguments in the frame of the verbal base are compatible with the referential argument of the derivative. | ||
| P17-1181 We propose methods for using global ***** constraints ***** by performing rescoring of the score matrices produced by state of the art cognates detection systems. | ||
| D17-1160 These results suggest that type ***** constraints ***** and entity linking are valuable components to incorporate in neural semantic parsers. | ||
| 2021.ecnlp-1.1 Finally, we provide guidelines to practitioners for training embeddings under a variety of computational and data ***** constraints *****. | ||
| P19-1637 Informed prior-based methods provide better control than ***** constraints *****, but ***** constraints ***** yield higher quality topics | ||
| noun phrases | 38 | |
| L12-1614 ***** noun phrases *****, verb phrases, adjectival phrases, etc.) | ||
| P18-1009 This formulation allows us to use a new type of distant supervision at large scale: head words, which indicate the type of the ***** noun phrases ***** they appear in. | ||
| L16-1555 Verbs and ***** noun phrases ***** are annotated with event and participant types, respectively. | ||
| P18-2011 In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or ***** noun phrases ***** with common noun headword. | ||
| 2020.emnlp-main.585 Finally, we show that Generationary benefits from training on data from multiple inventories, with strong gains on various zero-shot benchmarks, including a novel dataset of definitions for free adjective-***** noun phrases *****. | ||
| surface realization | 38 | |
| W18-6503 In this work, we present our system for Natural Language Generation where we control various aspects of the ***** surface realization ***** in order to increase the lexical variability of the utterances, such that they sound more diverse and interesting. | ||
| P19-1197 The training of key fact prediction needs much fewer annotated data, while ***** surface realization ***** can be trained with pseudo parallel corpus. | ||
| 2021.ranlp-1.92 Generating an utterance from a Meaning representation (MR) usually passes two steps: sentence planning and ***** surface realization *****. | ||
| D19-6309 We describe our exploratory system for the shallow ***** surface realization ***** task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on serialized trees. | ||
| 2020.msr-1.7 In the context of Natural Language Generation, ***** surface realization ***** is the task of generating the linear form of a text following a given grammar. | ||
| annotated data | 38 | |
| 2020.lrec-1.856 We first draw on a small set of ***** annotated data ***** to compute spelling error statistics. | ||
| 2020.coling-main.16 Experiments show that our framework using sentiment-related discourse augmentations for sentiment prediction enhances the overall performance for long documents, even beyond previous approaches using well-established discourse parsers trained on human ***** annotated data *****. | ||
| 2021.eacl-main.88 We also demonstrate that the performance of SynPG is competitive or even better than supervised models when the un*****annotated data***** is large. | ||
| 2021.humeval-1.9 Our contributions include the *****annotated data*****set that we make publicly available and the proposal of Success Rate @k as an evaluation metric that is more appropriate than the traditional QA's and information retrieval's metrics. | ||
| 2020.semeval-1.18 We use existing semantically *****annotated data*****sets, and propose to approximate similarity through automatically generated lexical substitutes in context. | ||
| gender bias | 38 | |
| N19-1061 Several recent works tackle this problem, and propose methods for significantly reducing this ***** gender bias ***** in word embeddings, demonstrating convincing results. | ||
| W19-3812 In this work, contribution of transfer learning technique to pronoun resolution systems is investigated and the ***** gender bias ***** contained in classification models is evaluated. | ||
| 2021.wmt-1.61 In this set measurement of ***** gender bias ***** is solely based on the translation of occupations. | ||
| 2020.emnlp-demos.15 We include case studies for a diverse set of workflows, including exploring counterfactuals for sentiment analysis, measuring ***** gender bias ***** in coreference systems, and exploring local behavior in text generation. | ||
| W19-3820 Injecting evidence from the coreference models compliments the base architecture, and analysis shows that the model is not hindered by their weaknesses, specifically ***** gender bias *****. | ||
| source language | 38 | |
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-re*****source language***** (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. | ||
| 2005.mtsummit-papers.31 We show that the parts of a sentence that are automatically identified as nonmachine-translatable provide useful information for paraphrasing or revising the sentence in the ***** source language *****, thus improving the quality of the final translation. | ||
| 1999.mtsummit-1.47 The system is meant for a ***** source language ***** (SL) speaker who does not know the target language (TL). | ||
| Q16-1022 We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-re*****source language*****s. | ||
| 2020.repl4nlp-1.16 We find that better models for low-re*****source language*****s require more efficient pretraining techniques or more data. | ||
| weak supervision | 38 | |
| 2021.semeval-1.127 In the second approach, we perform ***** weak supervision ***** with soft attention to learn token level labels from sentence labels. | ||
| 2021.naacl-main.242 In this paper, we explore text classification with extremely ***** weak supervision *****, i.e., only relying on the surface text of class names. | ||
| 2021.emnlp-main.46 Instead of asking for new fine-grained human annotations, we opt to leverage label surface names as the only human guidance and weave in rich pre-trained generative language models into the iterative ***** weak supervision ***** strategy. | ||
| 2020.emnlp-main.546 An essay scorer is first pre-trained on a large essay dataset covering diverse topics and with coarse ratings, i.e., good and poor, which are used as a kind of ***** weak supervision *****. | ||
| 2021.naacl-main.66 In this work, we develop a ***** weak supervision ***** framework (ASTRA) that leverages all the available data for a given task. | ||
| story generation | 38 | |
| C16-2053 Although related studies generally use one or more scripts for ***** story generation *****, this research synthetically uses many scripts to flexibly generate a diverse narrative. | ||
| 2021.alta-1.13 Generating long and coherent text is an important and challenging task encompassing many application areas such as summarization, document level machine translation and ***** story generation *****. | ||
| W19-3405 Additionally, we provide results for a full end-to-end automated ***** story generation ***** system, demonstrating how our model works with existing systems designed for the event-to-event problem. | ||
| 2020.emnlp-main.349 We propose the task of outline-conditioned ***** story generation *****: given an outline as a set of phrases that describe key characters and events to appear in a story, the task is to generate a coherent narrative that is consistent with the provided outline. | ||
| 2020.emnlp-main.525 We release both the STORIUM dataset and evaluation platform to spur more principled research into ***** story generation *****. | ||
| digital humanities | 38 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in ***** digital humanities ***** and computational social science. | ||
| L16-1300 Text analysis methods widely used in ***** digital humanities ***** often involve word co-occurrence, e.g. | ||
| 2021.eval4nlp-1.11 in historical linguistics and ***** digital humanities *****, is challenging due to a lack of statistical power. | ||
| W16-4007 We demonstrate its utility with selected case-studies in which we show its application to the ***** digital humanities *****. | ||
| D19-1661 However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within computational social science and ***** digital humanities *****. | ||
| sentence simplification | 38 | |
| P19-1331 Our model outperforms previous state-of-the-art neural ***** sentence simplification ***** models (without external knowledge) by large margins on three benchmark text simplification corpora in terms of SARI (+0.95 WikiLarge, +1.89 WikiSmall, +1.41 Newsela), and is judged by humans to produce overall better and simpler output sentences. | ||
| 2020.coling-main.121 In this paper, we are the first to investigate the helpfulness of document context on ***** sentence simplification ***** and apply it to the sequence-to-sequence model. | ||
| R19-1033 The paper begins with our observation of challenges in the intrinsic evaluation of ***** sentence simplification ***** systems, which motivates the use of extrinsic evaluation of these systems with respect to other NLP tasks. | ||
| 2020.acl-main.707 We present a novel iterative, edit-based approach to unsupervised ***** sentence simplification *****. | ||
| Q16-1029 Most recent *****sentence simplification***** systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. | ||
| generating natural language | 38 | |
| 2020.inlg-1.20 In this work, we introduce a new dataset and present a neural model for automatically ***** generating natural language ***** summaries for charts. | ||
| W19-4113 In our work, we contribute to the under-explored area of ***** generating natural language ***** explanations for general phenomena. | ||
| N18-1139 In this work, we focus on the task of ***** generating natural language ***** descriptions from a structured table of facts containing fields (such as nationality, occupation, etc) and values (such as Indian, actor, director, etc). | ||
| 2020.tacl-1.2 Abstract meaning representation (AMR)-to-text generation is the challenging task of ***** generating natural language ***** texts from AMR graphs, where nodes represent concepts and edges denote relations. | ||
| W18-6520 In this paper, we propose a self-learning architecture for ***** generating natural language ***** templates for conversational assistants. | ||
| semantic relation | 38 | |
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an attention mechanism to capture the interactive ***** semantic relation *****s in-between to enforce our framework to be attribute comprehensive. | ||
| L12-1040 Polaris is a supervised semantic parser that given text extracts ***** semantic relation *****s. | ||
| L16-1425 In this paper we describe VerbCROcean, a broad-coverage repository of fine-grained ***** semantic relation *****s between Croatian verbs. | ||
| D19-1357 While the meanings of defining words are important in dictionary definitions, it is crucial to capture the lexical ***** semantic relation *****s between defined words and defining words. | ||
| L06-1368 The overview consists of presentation of an architecture of the ontology extraction system, description of methods used for mining of ***** semantic relation *****s and analysis of selected results and examples. | ||
| voice | 38 | |
| 2021.acl-long.3 It offers an easy way to hear the ***** voice ***** from the public and learn from their feelings to important social topics. | ||
| 2020.rail-1.1 The ǂKhomani San, Hugh Brody Collection features the ***** voice *****s and history of indigenous hunter gatherer descendants in three endangered languages namely, N|uu, Kora and Khoekhoe as well as a regional dialect of Afrikaans. | ||
| L10-1249 The synthetic ***** voice *****s for Viennese varieties, implemented with the open domain unit selection speech synthesis engine Multisyn of Festival will also be released within Festival. | ||
| 2020.ecnlp-1.6 In this work, we improve the intent classification in an English based e-commerce ***** voice ***** assistant by using inter-utterance context. | ||
| L10-1498 The toolkit can be easily employed to create ***** voice *****s in the languages already supported by MARY TTS. | ||
| formality style transfer | 38 | |
| P19-1609 We conduct experiments across various seq2seq text generation tasks including machine translation, ***** formality style transfer *****, sentence compression and simplification. | ||
| 2020.coling-main.203 In this paper, we present a new approach, Sequence-to-Sequence with Shared Latent Space (S2S-SLS), for ***** formality style transfer *****, where we propose two auxiliary losses and adopt joint training of bi-directional transfer and auto-encoding. | ||
| 2021.emnlp-main.100 In this paper, we evaluate leading automatic metrics on the oft-researched task of ***** formality style transfer *****. | ||
| 2020.findings-emnlp.212 *****Formality style transfer***** is the task of converting informal sentences to grammatically-correct formal sentences, which can be used to improve performance of many downstream NLP tasks. | ||
| 2020.acl-main.294 The main barrier to progress in the task of *****Formality Style Transfer***** is the inadequacy of training data. | ||
| negation cue | 38 | |
| 2020.acl-main.429 We apply this methodology to test BERT and RoBERTa on a hypothesis that some attention heads will consistently attend from a word in negation scope to the *****negation cue*****. | ||
| W19-1306 We define rules to identify true *****negation cues***** and scope more suited to conversational data than existing general review data. | ||
| 2021.conll-1.19 We use the benchmark to probe the negation-awareness of multilingual language models and find that models that correctly predict examples with *****negation cues*****, often fail to correctly predict their counter-examples without *****negation cues*****, even when the cues are irrelevant for semantic inference. | ||
| L10-1229 In this paper we present a description of *****negation cues***** and their scope in biomedical texts, based on the cues that occur in the BioScope corpus. | ||
| W17-1809 *****Negation cue***** detection involves identifying the span inherently expressing negation in a negative sentence. | ||
| sentence prediction | 38 | |
| 2020.coling-main.118 To this end, we generalize the standard BERT model to a multi-task learning setting where we couple BERT's masked language modeling and next *****sentence prediction***** objectives with an auxiliary task of binary word relation classification. | ||
| 2020.acl-main.247 BERT is pretrained on two auxiliary tasks: Masked Language Model and Next *****Sentence Prediction*****. | ||
| 2021.naacl-main.218 We also introduce two simple auxiliary tasks: next *****sentence prediction***** and task-id prediction, for learning better generic and specific representation spaces. | ||
| 2020.aacl-srw.15 BERT is able to understand sentence relationships since BERT is pre-trained using the next *****sentence prediction***** task. | ||
| 2020.acl-main.666 We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task and with promising results on larger-scale next *****sentence prediction***** tasks. | ||
| customer review | 38 | |
| 2021.calcs-1.13 Sentiment analysis is an important task in understanding social media content like *****customer reviews*****, Twitter and Facebook feeds etc. | ||
| D18-1384 Many existing systems for analyzing and summarizing *****customer reviews***** about products or service are based on a number of prominent review aspects. | ||
| 2021.wnut-1.27 In this paper we propose to tackle multilingual sequence tagging with a new span alignment method and apply it to opinion target extraction from *****customer reviews*****. | ||
| 2020.emnlp-main.442 We release an English QA dataset (SubjQA) based on *****customer reviews*****, containing subjectivity annotations for questions and answer spans across 6 domains. | ||
| 2021.hcinlp-1.9 *****Customer reviews***** are useful in providing an indirect, secondhand experience of a product. | ||
| statistical machine translation (SMT) | 38 | |
| 2008.amta-papers.10 Phrase-based translation models are widely studied in *****statistical machine translation (SMT*****). | ||
| L16-1101 Out-of-vocabulary (OOV) word is a crucial problem in *****statistical machine translation (SMT*****) with low resources. | ||
| L16-1721 This paper discusses the role that *****statistical machine translation (SMT*****) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. | ||
| 2016.amta-researchers.4 The utilization of *****statistical machine translation (SMT*****) has grown enormously over the last decade, many using open-source software developed by the NLP community. | ||
| I17-1038 The recent technological shift in machine translation from *****statistical machine translation (SMT*****) to neural machine translation (NMT) raises the question of the strengths and weaknesses of NMT. | ||
| WMT 2021 | 37 | |
| 2021.wmt-1.55 This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the ***** WMT 2021 ***** | ||
| 2021.wmt-1.111 In this paper, we present the joint contribution of Unbabel and IST to the ***** WMT 2021 ***** | ||
| 2021.wmt-1.93 In this paper, we discuss our submission to the ***** WMT 2021 ***** QE Shared Task. | ||
| 2021.wmt-1.11 We use parallel corpora provided by ***** WMT 2021 ***** organizers for training, and development and test data from WMT 2020 for evaluation of different experiment models | ||
| 2021.wmt-1.30 This paper describes the participation of team oneNLP (LTRC, IIIT-Hyderabad) for the *****WMT 2021***** task, similar language translation. | ||
| scalability | 37 | |
| 2020.acl-main.19 We further show our model's ***** scalability ***** by conducting tests on the CoQA dataset. | ||
| 2020.findings-emnlp.152 While recent work seeks to address these ***** scalability ***** issues at pre-training, these issues are also prominent in fine-tuning especially for long sequence tasks like document classification. | ||
| P19-1434 We consider a novel question answering (QA) task where the machine needs to read from large streaming data (long documents or videos) without knowing when the questions will be given, which is difficult to solve with existing QA methods due to their lack of ***** scalability *****. | ||
| D18-1052 It additionally leads to a significant ***** scalability ***** advantage since the encoding of the answer candidate phrases in the document can be pre-computed and indexed offline for efficient retrieval. | ||
| 2021.ecnlp-1.2 This has limited their ***** scalability ***** and generalization for large scale real world e-commerce applications | ||
| autoencoder | 37 | |
| 2020.coling-main.224 In previous work, the ***** autoencoder ***** framework is a prevalent approach for the utilization of unlabelled data. | ||
| W18-4410 Using a log-normalized, weighted word-count vector at input dimensions, the ***** autoencoder ***** simulates a competition between neurons in the hidden layer to minimize the reconstruction loss between the input and final output layers. | ||
| D18-1413 To capture this type of information, we propose an ***** autoencoder ***** model with a latent space defined by a hierarchy of categorical variables. | ||
| D17-1214 We evaluate models that can reuse ***** autoencoder ***** states and outputs without fine-tuning their weights, allowing for more efficient training and inference. | ||
| 2021.acl-short.127 In this paper, we solve the task of unsupervised cross-domain concept prerequisite chain learning, using an optimized variational graph ***** autoencoder ***** | ||
| LT | 37 | |
| 2021.emnlp-main.90 In this work, we examine the feasibility of ***** LT ***** for incremental NLU in English. | ||
| 1998.amta-papers.21 This is feasible because an ***** LT ***** product consists of a software part and a lingware part. | ||
| 2020.lrec-1.413 The ELG will boost the Multilingual Digital Single Market towards a thriving European ***** LT ***** community, creating new jobs and opportunities. | ||
| 2020.lrec-1.407 We present an overview of the European ***** LT ***** landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to ***** LT *****, including the current state of play in industry and the ***** LT ***** market. | ||
| L10-1493 *****LT***** World (www.lt-world.org) is an ontology-driven web portal aimed at serving the global language technology community. | ||
| connective | 37 | |
| 2020.coling-main.505 We successively augment a purely-empirical approach based on contextualised embeddings with linguistic knowledge encoded in a ***** connective ***** lexicon. | ||
| W16-5110 Prior work has shown that the difference in usage of ***** connective *****s across corpora affects the cross domain ***** connective ***** identification task negatively. | ||
| 2021.codi-main.9 We here assess the performance on explicit ***** connective ***** identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. | ||
| 2001.jeptalnrecital-long.12 More sentences were aggregated without than with the use of an explicit sign, such as a ***** connective ***** or a (semi-)colon | ||
| 2021.codi-main.8 Cross-linguistic research on discourse structure and coherence marking requires discourse-annotated corpora and *****connective***** lexicons in a large number of languages. | ||
| polysynthetic | 37 | |
| W19-4222 In addition, many of these ***** polysynthetic ***** languages are low-resource. | ||
| 2020.lrec-1.333 It is ***** polysynthetic ***** and low-resource. | ||
| N19-4021 Because Yupik is a ***** polysynthetic ***** language, handling of multimorphemic word forms is critical. | ||
| 2021.americasnlp-1.11 Yine is a low-resource indigenous ***** polysynthetic ***** Peruvian language spoken by approximately 3,000 people and is classified as 'definitely endangered' by UNESCO. | ||
| N18-1005 Morphological segmentation for *****polysynthetic***** languages is challenging, because a word may consist of many individual morphemes and training data can be extremely scarce. | ||
| pointer | 37 | |
| 2020.findings-emnlp.335 Through evaluation by doctors, we show that our approach is preferred on twice the number of summaries to the baseline ***** pointer ***** generator model and captures most or all of the information in 80% of the conversations making it a realistic alternative to costly manual summarization by medical experts. | ||
| 2021.naacl-main.453 To explore entity pairs that may be implicitly connected by relations, we propose a binary ***** pointer ***** network to extract overlapping relational triples relevant to each word sequentially and retain the information of previously extracted triples in an external memory. | ||
| W17-4606 We propose an E2E model based on ***** pointer ***** networks, which can be trained directly on pairs of raw input and output text | ||
| D19-1093 Transition-based top-down parsing with *****pointer***** networks has achieved state-of-the-art results in multiple parsing tasks, while having a linear time complexity. | ||
| S17-2157 We present a neural encoder-decoder AMR parser that extends an attention-based model by predicting the alignment between graph nodes and sentence tokens explicitly with a *****pointer***** mechanism. | ||
| morphological analyser | 37 | |
| L04-1259 A possible solution to this problem is to apply a comprehensive ***** morphological analyser *****, which is able to analyse almost all wordforms alleviating the problem of unseen tokens. | ||
| 2020.lrec-1.439 The idea is to train recurrent neural networks on the output that the ***** morphological analyser ***** produces for unambiguous words. | ||
| L12-1153 The system, and the ***** morphological analyser ***** built for it, are both the first resources of their kind for Aragonese. | ||
| 2020.lrec-1.314 The aim of this paper is to create a new ***** morphological analyser ***** for Evenki. | ||
| 2020.acl-srw.28 In this paper, we describe the challenges encountered when modelling a language exhibiting distributed exponence and present the first ***** morphological analyser ***** for Nen, with an overall accuracy of 80.3% | ||
| topics | 37 | |
| 2020.emnlp-main.141 It covers a diverse set of ***** topics ***** and speakers, and carries supervision of 20 labels including sentiment (and subjectivity), emotions, and attributes. | ||
| P17-1165 The coherence between ***** topics ***** is ensured through a copula, binding the ***** topics ***** associated to the words of a segment. | ||
| K19-1073 Content of text data are often influenced by contextual factors which often evolve over time (e.g., content of social media are often influenced by ***** topics ***** covered in the major news streams). | ||
| L12-1147 For this preliminary study, we focused on 7 ***** topics ***** which have been relatively important in France. | ||
| W18-1706 This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (which are interpretable like ***** topics *****) and clusters the basis indexes in the ego network of each polysemous word | ||
| frame | 37 | |
| W19-0425 We find that for ***** frame ***** identification, generalization and task-adaptive categorization both yield substantial benefits. | ||
| S19-2004 New Hearst-like patterns for verbs are introduced that prove to be effective for ***** frame ***** induction. | ||
| 2021.eacl-main.206 We present a new model for ***** frame ***** identification that uses a pre-trained transformer model to generate representations for ***** frame *****s and lexical units (senses) using their formal definitions in FrameNet. | ||
| L08-1218 However, the absolute time values can't be used for alignment, since the timing is usually specified by ***** frame ***** numbers and not by real time, and converting it to real time values is not always possible; hence we use normalized subtitle duration instead. | ||
| 2021.ranlp-1.165 Despite previous efforts, there does not exist a well-thought-out automatic/semi-automatic methodology for ***** frame ***** construction | ||
| categories | 37 | |
| 2020.ccl-1.100 Aspect-category sentiment classification (ACSC) aims to identify the sentiment polarities towards the aspect ***** categories ***** mentioned in a sentence. | ||
| 2020.emnlp-main.287 Given a sentence and the aspect ***** categories ***** mentioned in the sentence, AC-MIMLLN first predicts the sentiments of the instances, then finds the key instances for the aspect ***** categories *****, finally obtains the sentiments of the sentence toward the aspect ***** categories ***** by aggregating the key instance sentiments. | ||
| E17-2107 Experimental results on 5 ***** categories ***** of Amazon.com products show that both common aspects of parent category and the individual aspects of sub-***** categories ***** can be extracted to align well with the common sense. | ||
| 2020.ccl-1.103 Although these joint models obtain promising performances, they train separate parameters for each aspect category and therefore suffer from data deficiency of some aspect ***** categories *****. | ||
| P19-1597 We conduct experiments on 5 ***** categories ***** in a benchmark Chess Commentary dataset and achieve inspiring results in both automatic and human evaluations | ||
| semantic interpretation | 37 | |
| 1998.amta-papers.25 All parse trees are converted to this format prior to ***** semantic interpretation *****. | ||
| 2020.crac-1.16 Reflexive anaphora present a challenge for ***** semantic interpretation *****: their meaning varies depending on context in a way that appears to require abstract variables. | ||
| L08-1328 Developing a full coreference system able to run all the way from raw text to ***** semantic interpretation ***** is a considerable engineering effort. | ||
| 2014.lilt-10.1 The agents, modeled within the OntoAgent environment, are tasked to compute a full context-sensitive ***** semantic interpretation ***** of each compound using a battery of engines that rely on a high-quality computational lexicon and ontology. | ||
| W16-5305 The identification of semantic relations between terms within texts is a fundamental task in Natural Language Processing which can support applications requiring a lightweight *****semantic interpretation***** model. | ||
| common sense | 37 | |
| 2021.nlp4convai-1.23 Humans make appropriate responses not only based on previous dialogue utterances but also on implicit background knowledge such as ***** common sense *****. | ||
| C16-1177 According to the hearer's ***** common sense ***** knowledge and his comprehension of the preceding text, a discourse entity could be old, mediated or new. | ||
| 2021.acl-long.102 We propose Mickey Probe, a language-general probing task for fairly evaluating the ***** common sense ***** of popular ML-LMs across different languages. | ||
| 2020.findings-emnlp.44 We exploit ConceptNet KG for encoding the ***** common sense ***** knowledge and evaluate our methodology on the Outside Knowledge-VQA (OK-VQA) and VQA datasets. | ||
| 2020.semeval-1.71 In this paper, we explore solutions to a ***** common sense ***** making task in which a model must discern which of two sentences is against ***** common sense *****. | ||
| automated essay scoring | 37 | |
| N18-1021 This may affect ***** automated essay scoring ***** models in many ways, as these models are typically designed to model (potentially biased) essay raters. | ||
| P18-1058 While argument persuasiveness is one of the most important dimensions of argumentative essay quality, it is relatively little studied in ***** automated essay scoring ***** research. | ||
| 2020.bea-1.15 Here we investigate whether, in ***** automated essay scoring ***** (AES) research, deep neural models are an appropriate technological choice. | ||
| 2020.lrec-1.157 In this study, we created an ***** automated essay scoring ***** (AES) system for nonnative Japanese learners using an essay dataset with annotations for a holistic score and multiple trait scores, including content, organization, and language scores. | ||
| 2020.acl-demos.17 The method involves extracting grammar patterns, training models for ***** automated essay scoring ***** (AES) and grammatical error detection (GED), and finally retrieving plausible corrections from a n-gram search engine. | ||
| linguistic analysis | 37 | |
| 2020.emnlp-main.424 An intermediate step in the ***** linguistic analysis ***** of an under-documented language is to find and organize inflected forms that are attested in natural speech. | ||
| 2021.acl-long.326 The graph network injects structural psycholinguistic knowledge in LIWC, a computerized instrument for psycho***** linguistic analysis *****, by constructing a heterogeneous tripartite graph. | ||
| L06-1174 The ***** linguistic analysis ***** processes both documents to be indexed and queries to extract concepts representing their content. | ||
| C16-1044 This work focuses on the development of *****linguistic analysis***** tools for resource-poor languages. | ||
| L14-1522 This paper presents the *****linguistic analysis***** tools and its infrastructure developed within the XLike project. | ||
| entity typing | 37 | |
| 2021.acl-long.141 To remedy this problem, in this paper, we propose to obtain training data for ultra-fine ***** entity typing ***** by using a BERT Masked Language Model (MLM). | ||
| P18-1010 This paper presents new methods using real and complex bilinear mappings for integrating hierarchical information, yielding substantial improvement over flat predictions in entity linking and fine-grained ***** entity typing *****, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset. | ||
| 2021.acl-demo.11 First, CogIE is a versatile toolkit with a rich set of functional modules, including named entity recognition, ***** entity typing *****, entity linking, relation extraction, event extraction and frame-semantic parsing. | ||
| W19-4319 How can we represent hierarchical information present in large type inventories for ***** entity typing *****? | ||
| 2021.acl-long.420 We evaluate the model on three downstream tasks, including relation classification, ***** entity typing *****, and question answering. | ||
| dialogue act | 37 | |
| 2020.lrec-1.74 We highlight how thinking aloud affects interpretation of ***** dialogue act *****s in our setting and how to best capture that information. | ||
| L14-1180 The second approach is a DiAML-oriented querying of ***** dialogue act ***** annotated data, for which we designed an interface. | ||
| 2020.acl-main.638 To address these issues, we propose a neural co-generation model that generates ***** dialogue act *****s and responses concurrently. | ||
| D17-1232 We present an unsupervised model of ***** dialogue act ***** sequences in conversation. | ||
| 2020.lrec-1.80 More specifically, we describe the method used to annotate ***** dialogue act *****s in the corpus, including the evaluation of the annotations. | ||
| order | 37 | |
| 2020.coling-main.519 This is ***** order *****s of magnitude larger than previous speech corpora used for search and summarization. | ||
| 1998.amta-papers.33 The approach is based on pattern matching, morphological rules, and word ***** order ***** inversion. | ||
| 2020.udw-1.4 We use Universal Dependencies treebanks to test whether a well-known typological trade-off between word ***** order ***** freedom and richness of morphological marking of core arguments holds within individual languages. | ||
| 2001.mtsummit-eval.12 During the experiment, MT output of three different systems is compared in ***** order ***** to establish which MT system best serves the organisation's multilingual communication and information needs. | ||
| L14-1645 Along with the methodology for coping with this diversity in the speech data, we also describe a set of experiments performed in ***** order ***** to investigate the efficiency of different approaches for automatic data pruning. | ||
| abusive | 37 | |
| 2020.lrec-1.765 In this study, we explore the phenomenon of swearing in Twitter conversations, taking the possibility of predicting the ***** abusive *****ness of a swear word in a tweet context as the main investigation perspective. | ||
| 2021.winlp-1.2 The complete freedom of expression in social media has its costs especially in spreading harmful and ***** abusive ***** content that may induce people to act accordingly. | ||
| 2020.lrec-1.191 The main contribution of this paper is to evaluate the quality of the recently developed “Spanish Database for cyberbullying prevention” for the purpose of training classifiers on detecting ***** abusive ***** short texts. | ||
| 2020.acl-main.380 For example, texts containing some demographic identity-terms (e.g., “gay”, “black”) are more likely to be ***** abusive ***** in existing ***** abusive ***** language detection datasets. | ||
| P19-2051 A hybrid approach with deep learning and a multilingual lexicon to cross-domain and cross-lingual detection of ***** abusive ***** content is proposed and compared with other simpler models. | ||
| temporal relation classification | 37 | |
| D17-1190 We present a sequential model for ***** temporal relation classification ***** between intra-sentence events. | ||
| 2021.emnlp-main.815 In this paper, we propose a joint model for event-event ***** temporal relation classification ***** and an auxiliary task, relative event time prediction, which predicts the event time as real numbers. | ||
| 2021.acl-short.67 We present TIMERS - a TIME, Rhetorical and Syntactic-aware model for document-level ***** temporal relation classification ***** in the English language. | ||
| W19-5929 Prior work on ***** temporal relation classification ***** has focused extensively on event pairs in the same or adjacent sentences (local), paying scant attention to discourse-level (global) pairs. | ||
| 2020.findings-emnlp.121 *****Temporal relation classification***** is the pair-wise task for identifying the relation of a temporal link (TLINKs) between two mentions, i.e. event, time and document creation time (DCT). | ||
| ibm model | 37 | |
| L06-1184 This paper describes a word alignment training procedure for statistical machine translation that uses a simple and clear statistical model, different from the *****IBM models*****. | ||
| 2021.insights-1.12 This is due to both the *****IBM model***** losing its advantage over the implicitly learned neural alignment, and issues with subword segmentation of unseen words. | ||
| 2003.mtsummit-papers.6 While the total number of translations that preserved meaning were the same for (YK02) and the syntax-based system (and both higher than the IBM-model-4-based system), the syntax based system had 45% more translations that also had good syntax than did (YK02) (and approximately 70% more than *****IBM Model***** 4). | ||
| 2011.iwslt-papers.1 We study five types of lexicon models: a model which is extracted from word-aligned training data and—given the word alignment matrix—relies on pure relative frequencies [1]; the *****IBM model***** 1 lexicon | ||
| 2004.amta-papers.6 Using Giza++ as a reference implementation of the *****IBM Model***** 1, an HMMbased alignment and *****IBM Model***** 4, we measure the impact of normalizing inflectional morphology on German-English statistical word alignment. | ||
| minimum risk training | 37 | |
| 2021.acl-long.380 In this paper, we propose a novel unified framework for zero-shot sequence labeling with ***** minimum risk training ***** and design a new decomposable risk function that models the relations between the predicted labels from the source models and the true labels. | ||
| P19-1560 More specifically, we introduce the explainable factor and the ***** minimum risk training ***** approach that learn to generate more reasonable explanations. | ||
| C18-1008 We successfully leverage ***** minimum risk training ***** to compensate for the weaknesses of MLE parameter learning and neutralize the negative effects of training a pipeline with a separate character aligner. | ||
| 2020.inlg-1.7 To alleviate these issues, we propose a novel strategy for training REG models, using ***** minimum risk training ***** (MRT) with maximum likelihood estimation (MLE) and we show that our approach outperforms RL w.r.t naturalness and diversity of the output. | ||
| D18-1249 Unlike prior efforts, we propose a new lightweight joint learning paradigm based on ***** minimum risk training ***** (MRT). | ||
| language grid | 37 | |
| L12-1477 The *****Language Grid*****, a service-oriented collective intelligent platform, allows in-domain resources to be wrapped into language services. | ||
| L10-1495 This paper extends our previous work on integrating two different platforms, i.e. Heart of Gold and *****Language Grid*****. | ||
| L08-1436 This paper discusses ontologization of lexicon access functions in the context of a service-oriented language infrastructure, such as the *****Language Grid*****. | ||
| L06-1364 The *****Language Grid*****, recently proposed by one of the authors, is a language infrastructure available on the Internet. | ||
| L14-1708 We then present a case study wherein two representative frameworks, the *****Language Grid***** and UIMA, are integrated. | ||
| neural architecture search | 37 | |
| 2021.acl-srw.4 In this work, we design a comprehensive search space for BERT based RC models and employ a modified version of efficient ***** neural architecture search ***** (ENAS) method to automatically discover the design choices mentioned above. | ||
| 2021.acl-long.206 In this paper, we propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks, based on a formulation inspired by recent progress on ***** neural architecture search *****. | ||
| 2021.naacl-industry.29 Thus we experiment with automatically optimizing the model architectures on the task at hand via ***** neural architecture search ***** (NAS). | ||
| D19-1367 In this paper, we study differentiable ***** neural architecture search ***** (NAS) methods for natural language processing. | ||
| 2020.acl-main.686 To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with ***** neural architecture search *****. | ||
| verbal | 37 | |
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur machine translation system which consists of ***** verbal ***** suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| P19-1038 Nowadays, firm CEOs communicate information not only ***** verbal *****ly through press releases and financial reports, but also non***** verbal *****ly through investor meetings and earnings conference calls. | ||
| 2020.emnlp-main.375 First, we assess few-shot learning capabilities by developing controlled experiments that probe models' syntactic nominal number and ***** verbal ***** argument structure generalizations for tokens seen as few as two times during training. | ||
| W18-6247 Our inference results indicate the feasibility of using deep learning based ***** verbal ***** content representation in inferring hirability scores from online conversational video resumes. | ||
| L10-1388 This work reports the evaluation and selection of annotation tools to assign wh-question labels to *****verbal***** arguments in a sentence. | ||
| end-to-end | 37 | |
| 2020.emnlp-main.439 We propose an *****end-to-end***** approach for synthetic QA data generation. | ||
| 2020.clssts-1.2 The Machine Translation for English Retrieval of Information in Any Language (MATERIAL) research program, sponsored by the Intelligence Advanced Research Projects Activity (IARPA), focuses on rapid development of *****end-to-end***** systems capable of retrieving foreign language speech and text documents relevant to different types of English queries that may be further restricted by domain. | ||
| 2019.iwslt-1.14 Our synthetic corpus and SpecAugment resulted in an improvement of 5 BLEU points over our baseline model on the test set of MuST-C En-De, reaching the score of 22.3 with a single *****end-to-end***** system. | ||
| 2020.lrec-1.365 Embedding commonsense knowledge is crucial for *****end-to-end***** models to generalize inference beyond training corpora. | ||
| W19-3219 This paper describes the system that team MYTOMORROWS-TU DELFT developed for the 2019 Social Media Mining for Health Applications (SMM4H) Shared Task 3, for the *****end-to-end***** normalization of ADR tweet mentions to their corresponding MEDDRA codes. | ||
| Human | 37 | |
| P17-1142 *****Human***** trafficking is a global epidemic affecting millions of people across the planet. | ||
| 2005.mtsummit-swtmt.1 *****Human***** translation is based on linguistic and extralinguistic knowledge. | ||
| 2021.emnlp-main.157 *****Human***** expertise and the participation of speech communities are essential factors in the success of technologies for low-resource languages. | ||
| L14-1347 *****Human***** translators are the key to evaluating machine translation (MT) quality and also to addressing the so far unanswered question when and how to use MT in professional translation workflows. | ||
| 2021.alta-1.2 *****Human***** annotation for establishing the training data is often a very costly process in natural language processing (NLP) tasks, which has led to frugal NLP approaches becoming an important research topic. | ||
| part-of-speech | 37 | |
| 2020.lt4hala-1.18 We describe the JHUBC submission to the EvaLatin Shared task on lemmatization and *****part-of-speech***** tagging for Latin. | ||
| P18-1251 Weighted finite state transducers (FSTs) are frequently used in language processing to handle tasks such as *****part-of-speech***** tagging and speech recognition. | ||
| L14-1402 Several works in Natural Language Processing have recently looked into *****part-of-speech***** annotation of Twitter data and typically used their own data sets. | ||
| R17-1016 Decision trees have been previously employed in many machine-learning tasks such as *****part-of-speech***** tagging, lemmatization, morphological-attribute resolution, letter-to-sound conversion and statistical-parametric speech synthesis. | ||
| 2021.emnlp-demo.6 We introduce N-LTP, an open-source neural language technology platform supporting six fundamental Chinese NLP tasks: lexical analysis (Chinese word segmentation, *****part-of-speech***** tagging, and named entity recognition), syntactic parsing (dependency parsing), and semantic parsing (semantic dependency parsing and semantic role labeling). | ||
| artificial | 37 | |
| 2020.coling-main.541 Corporate mergers and acquisitions (M&A) account for billions of dollars of investment globally every year and offer an interesting and challenging domain for *****artificial***** intelligence. | ||
| 2021.naacl-main.76 Understanding and executing natural language instructions in a grounded domain is one of the hallmarks of *****artificial***** intelligence. | ||
| 2020.acl-main.466 Legal Artificial Intelligence (LegalAI) focuses on applying the technology of *****artificial***** intelligence, especially natural language processing, to benefit tasks in the legal domain. | ||
| 2020.findings-emnlp.397 In previous work, *****artificial***** agents were shown to achieve almost perfect accuracy in referential games where they have to communicate to identify images. | ||
| 1991.iwpt-1.2 In the domain of *****artificial***** intelligence, the pattern of information flow varies drastically from one context to another. | ||
| hate | 37 | |
| 2021.eacl-demos.14 This system demonstration paper describes ASAD: Arabic Social media Analysis and unDerstanding, a suite of seven individual modules that allows users to determine dialects, sentiment, news category, offensiveness, *****hate***** speech, adult content, and spam in Arabic tweets. | ||
| N19-1305 Existing computational models to understand *****hate***** speech typically frame the problem as a simple classification task, bypassing the understanding of hate symbols (e.g., 14 words, kigy) and their secret connotations. | ||
| 2020.alw-1.3 Distinguishing *****hate***** speech from non-hate offensive language is challenging, as hate speech not always includes offensive slurs and offensive language not always express hate. | ||
| 2020.trac-1.14 In the last few years, *****hate***** speech and aggressive comments have covered almost all the social media platforms like facebook, twitter etc. | ||
| 2021.acl-long.556 The wanton spread of *****hate***** speech on the internet brings great harm to society and families. | ||
| augment | 36 | |
| 2013.iwslt-evaluation.19 We ***** augment ***** training data with content words extracted from itself and experiment with reverse word order for source languages. | ||
| 2012.amta-papers.8 In this paper, we investigate large-scale lightly-supervised training with a pivot language: We ***** augment ***** a baseline statistical machine translation (SMT) system that has been trained on human-generated parallel training corpora with large amounts of additional unsupervised parallel data; but instead of creating this synthetic data from monolingual source language data with the baseline system itself, or from target language data with a reverse system, we employ a parallel corpus of target language data and data in a pivot language. | ||
| 2020.conll-1.50 We ***** augment ***** our models with embeddings represent-ing language ID, part of speech, and other features such as word embeddings. | ||
| 2020.bionlp-1.7 We ***** augment ***** this framework by introducing global embeddings to help with long-distance relation inference, and by multi-task learning to increase model performance and generalizability. | ||
| 2021.acl-long.59 In contrast to prior work, we ***** augment ***** our text representations by leveraging a complementary source of document context: the citation graph of referential links between citing and cited papers | ||
| TimeML | 36 | |
| L10-1375 In this paper, we present the reporting part of CAVaT, and then its error-checking ability, including the workings of several novel ***** TimeML ***** document verification methods. | ||
| L10-1109 Along with that this paper also suggests some additions to ***** TimeML ***** language by adding new event features (ontology type), some more SLINKs and also relations between events with their arguments, which we call RLINK (relation link). | ||
| L08-1563 The evaluation performed on TimeBank reveals an F-measure of 86.43% achieved for the identification of verbal events, and an accuracy of 85.25% in the task of classifying them into ***** TimeML ***** event classes. | ||
| L14-1439 Our work introduces an annotation project for Estonian, where temporal annotations in ***** TimeML ***** framework were manually added to a corpus containing gold standard morphological and dependency syntactic annotations. | ||
| Q18-1025 To compare predictions of systems that follow both SCATE and ***** TimeML *****, we present a new scoring metric for time intervals | ||
| elicitation | 36 | |
| 2020.nl4xai-1.11 This survey reviews studies focused on cognitive bias mitigation of recommender system users during two processes: 1) item selection and 2) preference ***** elicitation *****. | ||
| L10-1289 These data are pre-processed, synchronized, and enriched by text annotations of signed language ***** elicitation ***** sessions. | ||
| L14-1672 Then, we present the outline of an ***** elicitation ***** engine based on an inference engine using schemes like deduction, induction and abduction which will be referenced and briefly presented and we will especially highlight the new scheme (Relation Inference Scheme with Refinements) added to our system. | ||
| 2021.nlpmc-1.1 The ***** elicitation ***** of the dialogues is achieved through textual stimuli presented to dialogue writers | ||
| 2005.mtsummit-posters.10 In this document we will describe a semi-automated process for creating *****elicitation***** corpora. | ||
| disentangled | 36 | |
| 2021.cinlp-1.4 Second, since such global explanations do not justify causal interpretations, we propose a methodology for detecting confounding effects in natural language and generating explanations, ***** disentangled ***** from textual confounders, in the form of lexicons. | ||
| D18-1497 We propose a method for learning ***** disentangled ***** representations of texts that code for distinct and complementary aspects, with the aim of affording efficient model transfer and interpretability. | ||
| 2021.acl-long.511 Additionally, we provide new insights illustrating various trade-offs in style transfer when attempting to learn ***** disentangled ***** representations and quality of the generated sentence. | ||
| D18-1420 In this paper, the proposed framework contains two latent factors, namely, outcome factor and content factor, ***** disentangled ***** from the input sentence to allow convenient editing to change the outcome and keep the content. | ||
| 2021.emnlp-main.164 In this paper, we propose a semi-automatic framework for generating ***** disentangled ***** shifts by introducing a controllable visual question-answer generation (VQAG) module that is capable of generating highly-relevant and diverse question-answer pairs with the desired dataset style | ||
| BM25 | 36 | |
| 2021.emnlp-main.305 We analyze COUGH by testing different FAQ retrieval models built on top of ***** BM25 ***** and BERT, among which the best model achieves 48.8 under P@5, indicating a great challenge presented by COUGH and encouraging future research for further improvement. | ||
| 2020.sdp-1.29 Contrary to most previous works, we frame Task 1A as a search relevance problem, and introduce a 2-step re-ranking approach, which consists of a preselection based on ***** BM25 ***** in addition to positional document features, and a top-k re-ranking with BERT. | ||
| 2021.wanlp-1.24 SERAG is shown to significantly outperform the popular ***** BM25 ***** model thanks to its multi-hop reasoning. | ||
| W17-1214 Based on a SVM with character and POStag n-grams as features and the ***** BM25 ***** weighting scheme, it achieved 92.7% accuracy in the Discriminating between Similar Languages (DSL) task, ranking first among eleven systems but with a lead over the next three teams of only 0.2%. | ||
| 2021.bionlp-1.27 Using this probabilistic transformation of ***** BM25 ***** scores we show an improved performance on the PubMed Click dataset developed and presented in this study, as well as the 2007 TREC Genomics collection | ||
| ADR | 36 | |
| 2020.smm4h-1.20 Extracting ***** ADR ***** mentions is treated as sequence labeling and normalizing ***** ADR ***** mentions is treated as multi-class classification. | ||
| 2020.louhi-1.6 The automatic mapping of Adverse Drug Reaction (***** ADR *****) reports from user-generated content to concepts in a controlled medical vocabulary provides valuable insights for monitoring public health. | ||
| W19-3220 In this study, we describe our methods to automatically classify Twitter posts conveying events of adverse drug reaction (***** ADR *****). | ||
| E17-1014 Recognizing mentions of Adverse Drug Reactions (***** ADR *****) in social media is challenging: ***** ADR ***** mentions are context-dependent and include long, varied and unconventional descriptions as compared to more formal medical symptom terminology. | ||
| W19-3207 The goals of the first two tasks are to classify whether a tweet contains mentions of adverse drug reactions (***** ADR *****) and extract these mentions, respectively | ||
| denoising | 36 | |
| 2021.emnlp-main.533 In this paper we propose instead to use _***** denoising ***** adapters_, adapter layers with a ***** denoising ***** objective, on top of pre-trained mBART-50. | ||
| D18-1101 We also analyze the effect of vocabulary size and ***** denoising ***** type on the translation performance, which provides better understanding of learning the cross-lingual word embedding and its usage in translation. | ||
| D19-5537 We propose a new contextual text ***** denoising ***** algorithm based on the ready-to-use masked language model. | ||
| 2021.wat-1.9 Multilingual approaches such as mBART (Liu et al., 2020) are capable of pre-training a complete, multilingual sequence-to-sequence model through ***** denoising ***** objectives, making it a great starting point for building multilingual translation systems. | ||
| 2021.emnlp-main.259 We develop a ***** denoising ***** training approach | ||
| idiom | 36 | |
| C18-1132 We show that in contrast to simply joining the data of multiple tasks, multi-task learning consistently improves upon four metaphor and ***** idiom ***** detection tasks in two languages, English and German. | ||
| L10-1456 Based on the total tokens found for each ***** idiom ***** class, we suggest that future research on ***** idiom ***** detection and ***** idiom ***** annotation include prepositional phrases as this class of ***** idiom *****s occurred frequently in the nonfiction and spoken samples of our corpus | ||
| 2020.lrec-1.544 This is probably one of the reasons why many studies that investigated ***** idiom *****atic expressions collected limited information about ***** idiom ***** properties for very small numbers of ***** idiom *****s only. | ||
| 2020.lrec-1.35 Analysis of the resulting corpus revealed strong effects of genre on ***** idiom ***** distribution, providing new evidence for existing theories on what influences ***** idiom ***** usage | ||
| 2020.lrec-1.496 Human translators often resort to different non-literal translation techniques besides the literal translation, such as *****idiom***** equivalence, generalization, particularization, semantic modulation, etc., especially when the source and target languages have different and distant origins. | ||
| pipelined | 36 | |
| 2021.eacl-main.55 Non-neural approaches to argument mining (AM) are often ***** pipelined ***** and require heavy feature-engineering. | ||
| D19-5822 In this work, we propose Interrogative-Word-Aware Question Generation (IWAQG), a ***** pipelined ***** system composed of two modules: an interrogative word classifier and a QG model. | ||
| L14-1708 It focuses on using ***** pipelined ***** execution and parallel execution to improve throughput of pipelines. | ||
| 2021.sdp-1.3 However, since each task has been studied and evaluated using data that has been independently developed, it is currently impossible to verify whether such tasks can be successfully ***** pipelined ***** to effective use in scientific-document writing. | ||
| 2021.naacl-main.5 In this work, we present a simple ***** pipelined ***** approach for entity and relation extraction, and establish the new state-of-the-art on standard benchmarks (ACE04, ACE05 and SciERC), obtaining a 1.7%-2.8% absolute improvement in relation F1 over previous joint models with the same pre-trained encoders | ||
| F1 | 36 | |
| 2020.acl-main.247 Specifically, our proposed model has strong empirical evidence as it obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 ***** F1 ***** points on short answer prediction. | ||
| D17-1009 For English, it achieves a 4.9 ***** F1 ***** point improvement over the state-of-the-art on GraphQuestions. | ||
| W19-2103 For example, a tweet-level stance detection model using only 13 user-level attributes (i.e. features that did not depend on the specific tweet) was able to obtain a higher ***** F1 ***** than the top-performing SemEval participant. | ||
| D18-1241 Our best model underperforms humans by 20 ***** F1 *****, suggesting that there is significant room for future work on this data. | ||
| Q19-1011 This is a significant improvement of 29.5 points ***** F1 ***** over state-of-the-art CNN classifiers with baseline segmentation. | ||
| attribute | 36 | |
| 2021.ecnlp-1.2 In this paper we have presented a generative approach to the ***** attribute ***** value extraction problem using language models. | ||
| 2021.acl-long.511 The existent dominant approaches in the context of text data either rely on training an adversary (discriminator) that aims at making ***** attribute ***** values difficult to be inferred from the latent code or rely on minimising variational bounds of the mutual information between latent code and the value ***** attribute *****. | ||
| W19-8604 Based on these pivot words, we propose a lexical analysis framework, the Pivot Analysis, to quantitatively analyze the effects of these words in text ***** attribute ***** classification and transfer. | ||
| N18-1169 Based on human evaluation, our best method generates grammatical and appropriate responses on 22% more inputs than the best previous system, averaged over three ***** attribute ***** transfer datasets: altering sentiment of reviews on Yelp, altering sentiment of reviews on Amazon, and altering image captions to be more romantic or humorous. | ||
| E17-1006 The resulting embedding model, while being fully interpretable, outperforms count-based distributional vector space models that are tailored to ***** attribute ***** meaning in the two tasks of ***** attribute ***** selection and phrase similarity prediction. | ||
| morphological disambiguation | 36 | |
| 2020.sltu-1.36 This paper presents an approach of voted perceptron for ***** morphological disambiguation ***** for the case of Kazakh language. | ||
| D19-3044 The sub-optimal performance is mainly due to errors in early ***** morphological disambiguation ***** decisions, that cannot be recovered later on in the pipeline, yielding incoherent annotations on the whole. | ||
| C18-1177 However, these taggers require external ***** morphological disambiguation ***** (MD) tools to function which are hard to obtain or non-existent for many languages. | ||
| N18-1130 Furthermore, we show that our model learns to exploit morphological knowledge encoded in the analyzer, and, as a byproduct, it can perform effective unsupervised ***** morphological disambiguation *****. | ||
| D17-1073 We make use of the resulting morphological models for scoring and ranking the analyses of the morphological analyzer for ***** morphological disambiguation *****. | ||
| graph embeddings | 36 | |
| P19-1026 This new model learns knowledge ***** graph embeddings ***** that can capture relation compositions by nature. | ||
| 2020.acl-main.241 Distance-based knowledge ***** graph embeddings ***** have shown substantial improvement on the knowledge graph link prediction task, from TransE to the latest state-of-the-art RotatE. | ||
| P19-2044 In this work, we propose a method that uses ***** graph embeddings ***** for integrating structured information from the knowledge base with unstructured information from text-based representations. | ||
| E17-1014 We use the CADEC corpus to train a recurrent neural network (RNN) transducer, integrated with knowledge ***** graph embeddings ***** of DBpedia, and show the resulting model to be highly accurate (93.4 F1). | ||
| P18-1186 We then build a deep zeroshot multimodal network for MNED that 1) extracts contexts from both text and image, and 2) predicts correct entity in the knowledge ***** graph embeddings ***** space, allowing for zeroshot disambiguation of entities unseen in training set as well. | ||
| speaker | 36 | |
| 2020.sigdial-1.22 By fitting a latent variable model to the corpus, we can exhibit utterances that give systematic evidence of the diverse kinds of reasoning ***** speaker *****s employ, and build integrated models that recognize not only ***** speaker ***** reference but also ***** speaker ***** reasoning. | ||
| 2020.sltu-1.51 We describe our experiments with available Cree data to improve automatic transcription both in ***** speaker *****-independent and dependent scenarios. | ||
| L10-1543 The Greybeard Project was designed so as to enable research in ***** speaker ***** recognition using data that have been collected over a long period of time. | ||
| L08-1351 The Maximum Likelihood Linear Regression (MLLR) technique has commonly been used in ***** speaker ***** adaptation; however we have used MLLR in language adaptation. | ||
| W17-1606 Speakers' dialect and gender was controlled for by using videos uploaded as part of the “accent tag challenge”, where ***** speaker *****s explicitly identify their language background. | ||
| related languages | 36 | |
| C16-1095 In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely ***** related languages *****, and utilizing human linguist judgments. | ||
| K17-3010 To allow transfer learning for low-resource treebanks and surprise languages, we train several multilingual models for ***** related languages *****, grouped by their genus and language families. | ||
| L08-1576 We can show an improvement in alignment quality even for ***** related languages ***** compared to the cognate-based approach. | ||
| L16-1524 Based on the assumption, we propose a constraint-based bilingual lexicon induction for closely ***** related languages ***** by extending constraints and translation pair candidates from recent pivot language approach. | ||
| L16-1684 Although methods using resources from ***** related languages ***** outperform weakly supervised methods using just a few training examples, we can still reach a promising accuracy with methods abstaining additional resources. | ||
| speech detection | 36 | |
| 2021.ltedi-1.22 This paper proposes a bidirectional long short-term memory (BiLSTM) with the attention-based approach, in solving the hope ***** speech detection ***** problem. | ||
| 2021.woah-1.10 In hate ***** speech detection *****, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. | ||
| 2021.acl-long.556 In other words, getting more affective features from other affective resources will significantly affect the performance of hate ***** speech detection *****. | ||
| 2020.wanlp-1.2 It also evaluates the recent language representation model BERT on the task of Arabic hate ***** speech detection *****. | ||
| W18-1105 While relevant research has been done independently on code-mixed social media texts and hate ***** speech detection *****, our work is the first attempt in detecting hate speech in Hindi-English code-mixed social media text. | ||
| verbal multiword expressions | 36 | |
| 2020.mwe-1.20 For our approach, we interpret detecting ***** verbal multiword expressions ***** as a token classification task aiming to decide whether a token is part of a verbal multiword expression or not. | ||
| 2020.mwe-1.17 This paper describes the ERMI system submitted to the closed track of the PARSEME shared task 2020 on automatic identification of ***** verbal multiword expressions ***** (VMWEs). | ||
| W18-4929 In this paper, we describe Mumpitz, the system we submitted to the PARSEME Shared task on automatic identification of ***** verbal multiword expressions ***** (VMWEs). | ||
| W18-4931 This paper describes a system submitted to the closed track of the PARSEME shared task (edition 1.1) on automatic identification of ***** verbal multiword expressions ***** (VMWEs). | ||
| 2020.mwe-1.19 This paper describes a semi-supervised system that jointly learns ***** verbal multiword expressions ***** (VMWEs) and dependency parse trees as an auxiliary task. | ||
| rhetorical structure | 36 | |
| L10-1079 The final parallel CODA corpus consists of 1000 dialogue turns that are tagged with dialogue acts and aligned with monologue that expresses the same information and has been annotated with ***** rhetorical structure ***** relations. | ||
| U19-1010 In this paper, we propose to use neural discourse representations obtained from a ***** rhetorical structure ***** theory (RST) parser to enhance document representations. | ||
| L10-1440 Like AZ, AZ-II follows the ***** rhetorical structure ***** of a scientific paper and the knowledge claims made by the authors. | ||
| C16-1312 In a pilot study with German newspaper commentary texts, we asked students to rate the degree of argumentativeness, and then looked for correlations with features of the annotated argumentation structure and the ***** rhetorical structure ***** (in terms of RST). | ||
| L08-1332 This paper describes a study of the levels at which different rhetorical relations occur in *****rhetorical structure***** trees. | ||
| information structure | 36 | |
| W16-4620 Machine translation systems should consider the ***** information structure ***** to improve the coherence of the output by using several topicalization techniques such as passivization. | ||
| L06-1157 To support this claim we present four linguistic phenomena for the study and relevant description of which in grammar a deep layer of corpus annotation as introduced in the Prague Dependency Treebank has brought important observations, namely the ***** information structure ***** of the sentence, condition of projectivity and word order, types of dependency relations and textual coreference. | ||
| Q15-1010 Inferring the ***** information structure ***** of scientific documents is useful for many NLP applications. | ||
| L14-1685 Along with selected schemes for ***** information structure ***** and coreference, discourse relations are discussed with special emphasis on the Penn Discourse Treebank and the RST Discourse Treebank. | ||
| C18-1191 Our error analysis indicates that an approach that takes the ***** information structure ***** into account (i.e. | ||
| unsupervised neural machine | 36 | |
| 2020.loresmt-1.10 In this work, we devise an ***** unsupervised neural machine ***** translation (UNMT) system consisting of a transformer based shared encoder and language specific decoders using denoising autoencoder and backtranslation with an additional Manipuri side multiple test reference. | ||
| P19-1119 Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped ***** unsupervised neural machine ***** translation (UNMT) achieve remarkable results in several language pairs. | ||
| K19-1027 In this paper, we alleviate the local optimality of back-translation by learning a policy (takes the form of an encoder-decoder and is defined by its parameters) with future rewarding under the reinforcement learning framework, which aims to optimize the global word predictions for ***** unsupervised neural machine ***** translation. | ||
| 2020.wmt-1.128 Our core ***** unsupervised neural machine ***** translation (UNMT) system follows the strategy of Chronopoulou et al. | ||
| 2020.findings-emnlp.371 The rise of ***** unsupervised neural machine ***** translation (UNMT) almost completely relieves the parallel corpus curse, though UNMT is still subject to unsatisfactory performance due to the vagueness of the clues available for its core back-translation training. | ||
| neural network language | 36 | |
| C16-1130 Recently, researchers have shown promising results using word vectors extracted from a ***** neural network language ***** model as features in WSD algorithms. | ||
| W19-1706 We found that despite recognition word error rates of 7-16%, our ensemble of N-gram and recurrent ***** neural network language ***** models made predictions nearly as good as when they used the reference transcripts. | ||
| P18-2111 Recently, the use of a recurrent ***** neural network language ***** model was suggested as a method of generating query completions. | ||
| W18-4920 In this paper, we propose the first model for multiword expression (MWE) compositionality prediction based on character-level ***** neural network language ***** models. | ||
| 2020.acl-demos.10 Targeted syntactic evaluations have yielded insights into the generalizations learned by *****neural network language***** models. | ||
| automatic summarization | 36 | |
| 2020.coling-main.15 Empirical results show that our review-centric model can make better use of user-written summaries for review sentiment analysis, and is also more effective compared to existing methods when the user summary is replaced with summary generated by an ***** automatic summarization ***** system. | ||
| L10-1062 Our model summaries performed similar to the ones reported in Dang (2005) and thus are suitable for evaluating ***** automatic summarization ***** systems on the task of generating image descriptions for location related images. | ||
| 2020.lrec-1.822 So far work on ***** automatic summarization ***** has dealt primarily with English data. | ||
| 2021.nllp-1.19 In this paper, we propose the task of ***** automatic summarization ***** of German court rulings. | ||
| 2021.sdp-1.12 In this paper, we propose a session-based ***** automatic summarization ***** model (SBAS) which uses a session and ensemble mechanism to generate long summaries. | ||
| system combination | 36 | |
| L12-1592 We describe the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT) which aims to foster research on improved ***** system combination ***** approaches for machine translation (MT). | ||
| 2013.iwslt-papers.6 Through progressive advances and ***** system combination ***** we reach a word error rate (WER) of 16.5% on the 2012 Quaero evaluation data. | ||
| L12-1665 That IWSLT 2011 evaluation focused on the automatic translation of public talks and included tracks for speech recognition, speech translation, text translation, and ***** system combination *****. | ||
| 2011.iwslt-evaluation.1 This year, the IWSLT evaluation focused on the automatic translation of public talks and included tracks for speech recognition, speech translation, text translation, and ***** system combination *****. | ||
| 2010.amta-papers.9 Combining the new properties of the TERp, we also propose a two-pass decoding strategy for the lattice-based phrase-level confusion network (CN) to generate the final result. The experiments conducted on the NIST2008 Chinese-to-English test set show that our TERp-based augmented ***** system combination ***** framework achieves significant improvements in terms of BLEU and TERp scores compared to the state-of-the-art word-level ***** system combination ***** framework and a TER-based combination strategy. | ||
| automatically generating | 36 | |
| 2020.acl-main.227 Further, ***** automatically generating ***** words with similar semantics is challenging, and hand-crafted linguistic rules are difficult to apply. | ||
| N19-1012 Finally, we develop baseline classifiers that can predict whether or not an edited headline is funny, which is a first step toward ***** automatically generating ***** humorous headlines as an approach to creating topical humor. | ||
| 2021.nlp4prog-1.7 We take the first step to address the task of ***** automatically generating ***** shellcodes, i.e., small pieces of code used as a payload in the exploitation of a software vulnerability, starting from natural language comments. | ||
| 2021.ranlp-1.17 Definition modelling is the task of ***** automatically generating ***** a dictionary-style definition given a target word. | ||
| 2021.eacl-main.229 Opinion summarization is the task of *****automatically generating***** summaries for a set of reviews about a specific target (e.g., a movie or a product). | ||
| data sparsity | 36 | |
| 2021.codi-main.3 Dealing with human-human dialogues makes for a realistic situation, but it calls for strategies to represent the context and face ***** data sparsity *****. | ||
| P18-1075 Bilingual tasks, such as bilingual lexicon induction and cross-lingual classification, are crucial for overcoming ***** data sparsity ***** in the target language. | ||
| R17-1065 To alleviate ***** data sparsity ***** in spoken Uyghur machine translation, we proposed a log-linear based morphological segmentation approach. | ||
| W18-5806 Morphologically rich languages are challenging for natural language processing tasks due to ***** data sparsity *****. | ||
| Q15-1038 Given a large corpus of definitions we leverage syntactic dependencies to reduce ***** data sparsity *****, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. | ||
| online social | 36 | |
| C18-1135 We investigate the birth and diffusion of lexical innovations in a large dataset of ***** online social ***** communities. | ||
| 2020.coling-main.51 In particular, Arabizi has recently emerged as the Arabic language in ***** online social ***** networks, becoming of great interest for opinion mining and sentiment analysis. | ||
| L16-1322 This corpus will be a valuable resource to investigate a variety of computational sociolinguistics research questions regarding ***** online social ***** interactions. | ||
| C16-1314 In this paper, we propose a systematic method to leverage user ***** online social ***** media content for predicting offline restaurant consumption level. | ||
| D19-5022 In recent years, the need for communication has increased in *****online social***** media. | ||
| character | 36 | |
| L12-1283 This work is part of a project for MWE extraction and ***** character *****ization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| E17-1102 The results show that word- and ***** character *****-level representations each improve state-of-the-art results for BLI, and the best results are obtained by exploiting the synergy between these word- and ***** character *****-level representations in the classification model. | ||
| 2021.sigdial-1.30 We hypothesize that a multi-task model that trains on ***** character ***** dialogue plus ***** character ***** relationship information improves transformer-based story continuation. | ||
| 2020.wanlp-1.4 We propose a novel architecture for labelling ***** character ***** sequences that achieves state-of-the-art results on the Tashkeela Arabic diacritization benchmark. | ||
| 2021.acl-long.121 This paper presents a novel Multi-metadata Embedding based Cross-Transformer (MECT) to improve the performance of Chinese NER by fusing the structural information of Chinese ***** character *****s. | ||
| personal | 36 | |
| N18-3019 Extensive experimentation over a dataset of 10 domains drawn from data relevant to our commercial ***** personal ***** digital assistant shows that our BoE models outperform the baseline models with a statistically significant average margin of 5.06% in absolute F1-score when training with 2000 instances per domain, and achieve an even higher improvement of 12.16% when only 25% of the training data is used. | ||
| 2021.wassa-1.26 We explicitly examine the impact of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and ***** personal *****ity detection. | ||
| P18-1205 Chit-chat models are known to have several problems: they lack specificity, do not display a consistent ***** personal *****ity and are often not very captivating. | ||
| C18-1156 Also, since the sarcastic nature and form of expression can vary from person to person, CASCADE utilizes user embeddings that encode stylometric and ***** personal *****ity features of users. | ||
| 2020.lrec-1.550 Our focus is directed at the de-identification of emails where ***** personal *****ly identifying information does not only refer to the sender but also to those people, locations, dates, and other identifiers mentioned in greetings, boilerplates and the content-carrying body of emails. | ||
| english resource grammar | 36 | |
| L08-1251 We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the *****English Resource Grammar***** (ERG). | ||
| K18-1054 To improve the parsing performance on cross-domain texts, we propose a data-oriented method to explore the linguistic generality encoded in *****English Resource Grammar*****, which is a precisionoriented, hand-crafted HPSG grammar, in an implicit way. | ||
| P18-1038 This accuracy is equivalent to that of *****English Resource Grammar***** guided models, suggesting that (recurrent) neural network models are able to effectively learn deep linguistic knowledge from annotations. | ||
| L10-1343 Our work is carried out within DELPH-IN (http://www.delph-in.net), using the LinGo Redwoods and the WeScience corpora, parsed with the *****English Resource Grammar***** and the PET parser. | ||
| L06-1277 An experiment with the British National Corpus shows about 70% of the sentences contain unknownword(s) for the *****English Resource Grammar*****. | ||
| mwe identification | 36 | |
| 2020.readi-1.3 In this paper, we propose a text complexity assessment system for English, which incorporates *****MWE identification*****. | ||
| W19-5110 On this basis, we claim that, in order to make strong headway in *****MWE identification*****, the community should bend its mind into coupling identification of MWEs with their discovery, via syntactic MWE lexicons. | ||
| Q14-1016 Experiments on a new dataset of English web text offer the first linguistically-driven evaluation of *****MWE identification***** with truly heterogeneous expression types. | ||
| C16-1046 Much previous research on multiword expressions (MWEs) has focused on the token- and type-level tasks of *****MWE identification***** and extraction, respectively. | ||
| W19-5121 Recent initiatives such as the PARSEME shared task allowed the rapid development of *****MWE identification***** systems. | ||
| zero-shot cross-lingual transfer | 36 | |
| 2021.nodalida-main.16 This article studies register classification of documents from the unrestricted web, such as news articles or opinion blogs, in a multilingual setting, exploring both the benefit of training on multiple languages and the capabilities for *****zero-shot cross-lingual transfer*****. | ||
| 2021.eacl-srw.24 Specifically, we show 1) that *****zero-shot cross-lingual transfer***** from the large English CORE corpus can match or surpass previously published monolingual models, and 2) that lightweight monolingual classification requiring very little training data can reach or surpass our zero-shot performance. | ||
| 2020.emnlp-main.584 We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages from varied language families and with different degrees of resource availability, opening room for evaluation scenarios such as *****zero-shot cross-lingual transfer*****. | ||
| 2021.insights-1.7 While *****zero-shot cross-lingual transfer***** relies on multilingual word embeddings to apply a model trained on one language to another, Yarowski and Ngai (2001) propose the method of annotation projection to generate training data without manual annotation. | ||
| D19-1077 A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for *****zero-shot cross-lingual transfer***** on a natural language inference task. | ||
| Universal | 36 | |
| K18-2022 We present SParse, our Graph-Based Parsing model submitted for the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to *****Universal***** Dependencies (Zeman et al., 2018). | ||
| 2002.amta-systems.3 Any-Language Communications has developed a novel semantics-oriented pre-market prototype system, based on the Theory of *****Universal***** Grammar, that uses the innate relationships of the words in a sensible sentence (the natural intelligence) to determine the true contextual meaning of all the words. | ||
| 2020.lrec-1.696 With this paper, we provide an overview of ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: the CLARIN Concept Registry, LexInfo, Universal Parts of Speech, *****Universal***** Dependencies and UniMorph are linked with the Ontologies of Linguistic Annotation and through it with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes. | ||
| R17-1077 In this paper, we introduce a cross-lingual Semantic Role Labeling (SRL) system with language-independent features based upon *****Universal***** Dependencies. | ||
| W18-6013 In this paper, for the purpose of enhancing *****Universal***** Dependencies for the Korean language, we propose a modified method for mapping the Korean Part-of-Speech (POS) tagset to the Universal Part-of-Speech (UPOS) tagset. | ||
| new | 36 | |
| Q18-1046 We introduce a novel framework for delexicalized dependency parsing in a *****new***** language. | ||
| 2021.naacl-main.33 The importance of building semantic parsers which can be applied to *****new***** domains and generate programs unseen at training has long been acknowledged, and datasets testing out-of-domain performance are becoming increasingly available. | ||
| 2021.naacl-main.147 Domain divergence plays a significant role in estimating the performance of a model in *****new***** domains. | ||
| 2021.naacl-main.237 A key challenge of dialog systems research is to effectively and efficiently adapt to *****new***** domains. | ||
| 1998.amta-papers.3 The MT engine of the JANUS speech-to-speech translation system is designed around four main principles: 1) an interlingua approach that allows the efficient addition of *****new***** languages, 2) the use of semantic grammars that yield low-cost high-quality translations for limited domains, 3) modular grammars that support easy expansion into new domains, and 4) efficient integration of multiple grammars using multi-domain parse lattices and domain re-scoring. | ||
| aggregating | 35 | |
| 2020.aacl-main.87 Next, we learn domain-attention scores over the sources for ***** aggregating ***** the predictions of the source-specific models. | ||
| P19-1077 Markable identification is typically carried out semi-automatically, by running a markable identifier and correcting its output by hand–which is increasingly done via annotators recruited through crowdsourcing and ***** aggregating ***** their responses. | ||
| 2021.acl-short.46 However, modeling cardinality based on ***** aggregating ***** a set of transformations with the same topology has been proven more effective than going deeper or wider when increasing capacity. | ||
| 2020.iwslt-1.33 Finally, we find that ***** aggregating ***** predictions across multiple context windows improves accuracy even further. | ||
| 2020.coling-main.476 In this paper, we propose a novel detection model based on tree transformer to better utilize user interactions in the dialogue where post-level self-attention plays the key role for ***** aggregating ***** the intra-/inter-subtree stances. | ||
| firstly | 35 | |
| P19-1129 This multi-turn QA formalization comes with several key advantages: ***** firstly *****, the question query encodes important information for the entity/relation class we want to identify; secondly, QA provides a natural way of jointly modeling entity and relation; and thirdly, it allows us to exploit the well developed machine reading comprehension (MRC) models. | ||
| L10-1322 In our model, we ***** firstly ***** learn a model from training data and then we use the learned model to discover knowledge in a specific domain. | ||
| L10-1012 Our motivation to move towards this direction is twofold: ***** firstly *****, extending binary relation instances with time leads to a massive proliferation of useless objects (independently of the encoding); secondly, reasoning and querying with such extended relations is extremely complex, expensive, and error-prone. | ||
| 2020.emnlp-main.738 It ***** firstly ***** estimates the input data's supportiveness for each target word with an estimator and then applies a supportiveness adaptor and a rebalanced beam search to harness the over-generation problem in the training and generation phases respectively. | ||
| 2008.iwslt-evaluation.18 For the pivot task, we combined the translations generated by a pivot based statistical translation model and a statistical transfer translation model (***** firstly *****, translating from Chinese to English, and then from English to Spanish). | ||
| obtaining | 35 | |
| C16-1082 Firstly, a sample of frequent VNCs are analysed in-depth and tagged along lexico-semantic and morphosyntactic dimensions, ***** obtaining ***** satisfactory inter-annotator agreement scores. | ||
| P19-1316 Unlike traditional approaches that solely use an unsupervised setting, we have also framed the problem as a supervised task, ***** obtaining ***** comparable improvements. | ||
| D19-1414 We empirically evaluate the proposed approach for biomedical relation extraction tasks, ***** obtaining ***** significant accuracy improvements w.r.t. | ||
| 2020.semeval-1.290 We show that our model performs competitively on all five languages, ***** obtaining ***** the fourth position in the English task with an F1-score of 0.919 and eighth position in the Turkish task with an F1-score of 0.781. | ||
| 2020.sustainlp-1.12 Its characteristics allow us to use it in a low-resource scenario, where only a small amount of training data are available, ***** obtaining ***** an efficient Generator. | ||
| FastText | 35 | |
| 2020.trac-1.18 We have found that the LSTM model with ***** FastText ***** embedding is performing better than other models for Hindi and Bangla datasets but for the English dataset, the CNN model with ***** FastText ***** embedding has performed better. | ||
| 2020.semeval-1.173 The first approach uses cross-lingual embeddings resulting from projecting Hinglish and pre-trained English ***** FastText ***** word embeddings in the same space. | ||
| 2021.codi-main.2 Transfer learning and non-transfer learning techniques are implemented by utilizing pre-trained models such as ***** FastText ***** word embeddings, BERT language models and Text GCN, which learns the word and document embeddings simultaneously of the corpus given. | ||
| W18-1205 We propose CNN- and RNN-based subword-level composition functions for learning word embeddings, and systematically compare them with popular word-level and subword-level models (Skip-Gram and ***** FastText *****) | ||
| 2020.wnut-1.74 This paper presents Iswara's participation in WNUT-2020 Task 2, Identification of Informative COVID-19 English Tweets, using BERT and *****FastText***** Embeddings, which tries to classify whether a certain tweet is considered informative or not. | ||
| recognizer | 35 | |
| 1991.iwpt-1.15 In the second part a fast parallel ***** recognizer ***** is given for general CFG's. | ||
| W19-6142 We report on work in progress which consists of annotating an Icelandic corpus for named entities (NEs) and using it for training a named entity ***** recognizer ***** based on a Bidirectional Long Short-Term Memory model. | ||
| 1995.iwpt-1.24 Spelling correction using a ***** recognizer ***** constructed from a large German word list that simulates compounding, also indicates that the approach is applicable in such cases. | ||
| 2000.iwpt-1.38 This paper describes a rule based method for partial parsing, particularly for noun phrase recognition, which has been used in the development of a noun phrase ***** recognizer ***** for Modern Greek. | ||
| L06-1440 In case of factoid questions, we can use a question classifier (trained according to a target taxonomy) and a named entity ***** recognizer ***** | ||
| saliency | 35 | |
| L14-1105 For example, we construct six ***** saliency ***** classes, and for the words in each of these classes we compare the simulation results with the human data. | ||
| L12-1600 Additionally, we compare the gaze behavior of the human subjects to evaluate ***** saliency ***** regions in the multimodal and visual only conditions. | ||
| 2020.blackboxnlp-1.14 For this goal and user, we argue that input ***** saliency ***** methods are better suited, and that there are no compelling reasons to use attention, despite the coincidence that it provides a weight for each input. | ||
| 2021.acl-short.19 Our results indicate that ***** saliency ***** could be a cognitively more plausible metric for interpreting neural language models | ||
| 2021.naacl-main.399 Saliency methods are widely used to interpret neural network predictions, but different variants of ***** saliency ***** methods often disagree even on the interpretations of the same prediction made by the same model. | ||
| retrieving | 35 | |
| 2020.acl-main.398 However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to ***** retrieving ***** the denotation. | ||
| 2021.iwpt-1.21 To avoid sparsity issues resulting from lexicalized dependency labels, we replace lexical items in relations with placeholders at training and prediction time, later ***** retrieving ***** them from the parse via a hybrid rule-based/machine-learning system. | ||
| P19-1221 In this work, we present RE^3QA, a unified question answering model that combines context ***** retrieving *****, reading comprehension, and answer reranking to predict the final answer. | ||
| 2021.mrqa-1.8 Open-domain extractive question answering works well on textual data by first ***** retrieving ***** candidate texts and then extracting the answer from those candidates. | ||
| 2021.emnlp-main.560 We study multi-answer retrieval, an under-explored problem that requires ***** retrieving ***** passages to cover multiple distinct answers for a given question | ||
| presented | 35 | |
| W17-1416 Results show that gender differences in the language use remain in professional environment not only in usage of function words, preferred linguistic constructions, but in the ***** presented ***** topics as well. | ||
| 2020.sustainlp-1.20 On GPU, we also achieve up to 12.4x speed-up with the ***** presented ***** methods. | ||
| 2020.lt4hala-1.6 Lastly, the paper envisages the advantages of an inclusion of LatInfLexi into the LiLa knowledge base, both for the ***** presented ***** resource and for the knowledge base itself. | ||
| L06-1368 Extracted relations and ontologies are crucial for the structuring of the information at the portal pages, automatic classification of the ***** presented ***** documents as well as for personalisation at the presentation level. | ||
| W19-4402 Owing to its generic nature, the ***** presented ***** approach has the potential to generalize over other exams containing MCQs | ||
| formalize | 35 | |
| L10-1493 Ontology-driven means, that the system is driven by an ontological schema to manage the research information and knowledge life-cycles: identify relevant concepts of information, structure and ***** formalize ***** them, assign relationships, functions and views, add states and rules, modify them. | ||
| 2020.acl-main.174 We ***** formalize ***** narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays (i.e., extract an optimal sequence of scenes). | ||
| S18-2026 We ***** formalize ***** this as two NLP tasks: predicting judgments of (i) individuals and (ii) groups based on the text of the assertion and previous judgments. | ||
| D18-1052 We ***** formalize ***** a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder. | ||
| 2021.naacl-main.160 We ***** formalize ***** the problem of selecting a set of questions as an integer linear programming problem and use standard solvers to get a solution | ||
| NAT | 35 | |
| 2021.naacl-main.313 Therefore, we propose to adopt multi-task learning to transfer the AT knowledge to ***** NAT ***** models through encoder sharing. | ||
| 2020.acl-main.138 Extensive experiments on English and German named entity recognition benchmarks confirmed that ***** NAT ***** consistently improved robustness of popular sequence labeling models, preserving accuracy on the original input. | ||
| 2020.acl-main.36 Specifically, we first empirically study the functionalities of the encoder and the decoder in ***** NAT ***** models, and find that the encoder takes a more important role than the decoder regarding the translation quality. | ||
| 2021.cl-4.29 Finally, we apply a three-stage training strategy to combine these two methods to train the ***** NAT ***** model. | ||
| P19-1125 A potential issue of the existing ***** NAT ***** algorithms, however, is that the decoding is conducted in parallel, without directly considering previous context | ||
| edits | 35 | |
| 2020.sigmorphon-1.19 We adapt the model to use substitution ***** edits ***** and train it with a weighted finite-state transducer acting as the expert policy. | ||
| N19-1012 Our publicly available data consists of regular English news headlines paired with versions of the same headlines that contain simple replacement ***** edits ***** designed to make them funny. | ||
| 2021.wnut-1.46 Recently, per-word classification of correction ***** edits ***** has proven an efficient, parallelizable alternative to current encoder-decoder GEC systems. | ||
| 2021.emnlp-main.482 However, we show that the popular reconstruction-based RoBERTa model is sensitive to source code ***** edits *****, even when the ***** edits ***** preserve semantics. | ||
| D17-1213 In this work, we develop in collaboration with Wikipedia editors a 13-category taxonomy of the semantic intention behind ***** edits ***** in Wikipedia articles | ||
| synonymy | 35 | |
| 2020.clssts-1.4 We discuss potential advantages of the approach in handling polysemy and ***** synonymy *****. | ||
| W18-3018 Word vector space specialisation models offer a portable, light-weight approach to fine-tuning arbitrary distributional vector spaces to discern between ***** synonymy ***** and antonymy. | ||
| Q13-1023 Moreover, our approach can incorporate external ***** synonymy ***** information (increasing its pairwise accuracy to 78%) and extends easily to new languages. | ||
| L10-1405 The models can handle different types of word relations: ***** synonymy ***** in the SA and co-occurrence in ACOM. | ||
| 2021.eacl-main.208 Ideally, the ***** synonymy ***** and semantic relatedness of names should be consistently reflected by their closeness in an embedding space | ||
| curated | 35 | |
| W17-3525 For these languages we present an automatic post-editing approach which learns how to post-edit the rule-based titles into ***** curated ***** titles. | ||
| L14-1404 First, every text in the corpus is a story, which is in contrast to other language resources that may contain stories or story-like texts, but are not specifically ***** curated ***** to contain only stories. | ||
| 2020.coling-main.579 We propose two classes of techniques to mitigate these errors: wordlist-based tunable-precision filters (for which we release ***** curated ***** lists in about 500 languages) and transformer-based semi-supervised LangID models, which increase median dataset precision from 5.5% to 71.2%. | ||
| 2020.findings-emnlp.66 In this paper, we present PolicyQA, a dataset that contains 25,017 reading comprehension style examples ***** curated ***** from an existing corpus of 115 website privacy policies | ||
| W17-3209 Parallel corpora are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in ***** curated ***** corpora routinely used for training and evaluation. | ||
| vowels | 35 | |
| W19-3627 Given that Singapore adopts British English as the institutional standard, one might expect Singaporean children to follow British pronunciation patterns, but we observe that Singaporean children also present similar patterns to Americans for TRAP-BATH split ***** vowels *****: (1) British and Singaporean children both produce these ***** vowels ***** with a relatively lowered tongue height. | ||
| D19-3037 Short ***** vowels *****, aka diacritics, are more often omitted when writing different varieties of Arabic including Modern Standard Arabic (MSA), Classical Arabic (CA), and Dialectal Arabic (DA). | ||
| W17-4112 Given an undeciphered alphabetic writing system or mono-alphabetic cipher, determine: (1) which of its letters are ***** vowels ***** and which are consonants; and (2) whether the writing system is a vocalic alphabet or an abjad. | ||
| W19-4219 They are also able to capture co-occurrence restrictions among ***** vowels ***** such as those observed in languages with vowel harmony | ||
| 2020.acl-main.415 We present VoxClamantis v1.0, the first large-scale corpus for phonetic typology, with aligned segments and estimated phoneme-level labels in 690 readings spanning 635 languages, along with acoustic-phonetic measures of ***** vowels ***** and sibilants. | ||
| humanities | 35 | |
| 2020.lrec-1.112 The annotation of texts and other material in the field of digital ***** humanities ***** and Natural Language Processing (NLP) is a common task of research projects. | ||
| 2021.latechclfl-1.17 We explore Boccaccio's Decameron to see how digital ***** humanities ***** tools can be used for tasks that have limited data in a language no longer in contemporary use: medieval Italian. | ||
| L16-1488 It can also be used by ***** humanities ***** scholars to analyse photographic style changes, the representation of people and societal issues, and new tools for exploring photograph reuse via image-similarity-based search. | ||
| W16-4016 It can benefit particularly digital ***** humanities ***** researchers working in the field of pragmatics, conversational analysis and discourse analysis. | ||
| 2021.eval4nlp-1.11 Detecting lexical semantic change in smaller data sets, e.g. in historical linguistics and digital ***** humanities *****, is challenging due to a lack of statistical power | ||
| analogical | 35 | |
| K19-1085 Using ***** analogical ***** inference as our use case, we propose a framework and a neural network architecture for learning dedicated sentence embeddings that preserve ***** analogical ***** properties in the semantic space. | ||
| 2020.bionlp-1.4 In this paper, we demonstrate the utility of encoding dependency structure in word embeddings in a model we call Embedding of Structural Dependencies (ESD) as a way to represent biomedical relationships in two ***** analogical ***** retrieval tasks: a relationship retrieval (RR) task, and a literature-based discovery (LBD) task meant to hypothesize plausible relationships between pairs of entities unseen in training. | ||
| K18-1045 These findings demonstrate the importance of order-based information in ***** analogical ***** retrieval tasks, and the utility of random permutations as a means to augment neural embeddings. | ||
| S17-1001 Previous work have shown that this strategy works more reliably for certain types of ***** analogical ***** word relationships than for others, but these studies have not offered a convincing account for why this is the case | ||
| S17-1017 This paper explores the possibilities of ***** analogical ***** reasoning with vector space models. | ||
| Biomedical | 35 | |
| 2021.bionlp-1.16 We introduce BioELECTRA, a biomedical domain-specific language encoder model that adapts ELECTRA for the ***** Biomedical ***** domain. | ||
| 2020.emnlp-main.431 We introduce ***** Biomedical ***** Event Extraction as Sequence Labeling (BeeSL), a joint end-to-end neural information extraction model | ||
| W18-6444 For the WMT 2018 shared task of translating documents pertaining to the ***** Biomedical ***** domain, we developed a scoring formula that uses an unsophisticated and effective method of weighting term frequencies and was integrated in a data selection pipeline. | ||
| P19-1317 ***** Biomedical ***** concepts are often mentioned in medical documents under different name variations (synonyms). | ||
| 2020.louhi-1.9 Detecting negation and speculation in language has been a task of considerable interest to the biomedical community, as it is a key component of Information Extraction systems from ***** Biomedical ***** documents. | ||
| Sanskrit | 35 | |
| D18-1295 The models discussed in this paper clearly improve over previous approaches to ***** Sanskrit ***** word segmentation. | ||
| W17-2409 Derivational nouns are widely used in ***** Sanskrit ***** corpora and represent an important cornerstone of productivity in the language. | ||
| P19-1111 The word ordering in a ***** Sanskrit ***** verse is often not aligned with its corresponding prose order. | ||
| W18-5817 ***** Sanskrit ***** /n/-retroflexion is one of the most complex segmental processes in phonology. | ||
| 2019.icon-1.12 Computationally analyzing ***** Sanskrit ***** texts requires proper segmentation in the initial stages. | ||
| labelled | 35 | |
| 2020.lrec-1.613 These metrics include two novel metrics for evaluating domain adaptability to help source domain selection of ***** labelled ***** data and utilize word and sentence-based embeddings as metrics for un***** labelled ***** data. | ||
| D18-1133 In this paper, instead of focusing on architecture engineering, we take advantage of small amounts of ***** labelled ***** data that model semantic phenomena in text to encode matching features directly in the word representations. | ||
| 2021.ranlp-1.3 Since most of the NLP techniques either require linguistic knowledge that can only be developed by experts and native speakers of that language or they require a lot of ***** labelled ***** data which is again expensive to generate, the task of text classification becomes challenging for most of the Indian languages. | ||
| 2020.coling-main.224 To compensate for the scarcity of ***** labelled ***** data, semi-supervised dependency parsing methods are developed to utilize un***** labelled ***** data in the training procedure of dependency parsers. | ||
| Q14-1026 Current supervised parsers are limited by the size of their ***** labelled ***** training data, making improving them with un***** labelled ***** data an important goal | ||
| embedding vectors | 35 | |
| D18-1294 The one-to-one correspondence between these “syntax-embedding” vectors and the words (hence their ***** embedding vectors *****) in the sentence makes it easy to integrate such a representation with all word-level NLP models. | ||
| L16-1189 This paper presents some experiments for specialising Paragraph Vectors, a new technique for creating text fragment (phrase, sentence, paragraph, text, ...) ***** embedding vectors *****, for text polarity detection. | ||
| P19-2046 In order to remove the lexicon dependency without decreasing the performance, we replace bag-of-words model word features by word ***** embedding vectors *****. | ||
| W16-4004 Two types of similarity measures are used: the first applies co-occurrence statistics, while the second exploits cosine similarity on different types of word ***** embedding vectors *****. | ||
| 2021.acl-long.426 During training, DNE forms virtual sentences by sampling ***** embedding vectors ***** for each word in an input sentence from a convex hull spanned by the word and its synonyms, and it augments them with the training data | ||
| interpreting | 35 | |
| K18-1007 They are useful for understanding the shortcomings of machine learning models, ***** interpreting ***** their results, and for regularisation. | ||
| 2021.blackboxnlp-1.34 Pre-trained language models (PLMs) like BERT are being used for almost all language-related tasks, but ***** interpreting ***** their behavior still remains a significant challenge and many important questions remain largely unanswered. | ||
| 2020.lrec-1.712 Metaphor comprehension and understanding is a complex cognitive task that requires ***** interpreting ***** metaphors by grasping the interaction between the meaning of their target and source concepts. | ||
| D19-1465 We further investigate how words distribute in global and local context, and find that aspect and non-aspect words do exhibit different context, ***** interpreting ***** our superiority in unsupervised aspect extraction. | ||
| 2021.acl-short.19 Our results indicate that saliency could be a cognitively more plausible metric for ***** interpreting ***** neural language models. | ||
| mapping | 35 | |
| D17-1264 Leveraging zero-shot learning to learn ***** mapping ***** functions between vector spaces of different languages is a promising approach to bilingual dictionary induction. | ||
| D18-1082 To realize this ***** mapping *****, existing works tend to design intuitive but complex models. | ||
| N19-1391 We incorporate context in the transformation matrix by directly ***** mapping ***** the averaged embeddings of aligned sentences in a parallel corpus. | ||
| D19-1450 In this paper, we propose a weakly-supervised adversarial training method to overcome this limitation, based on the intuition that ***** mapping ***** across languages is better done at the concept level than at the word level. | ||
| 2020.smm4h-1.20 Task 3 involves extracting ADR mentions and then ***** mapping ***** them to MedDRA codes | ||
| feedback | 35 | |
| 2020.coling-demos.15 Also, it provides a facility to grade text in reference to given grade-level and gives users ***** feedback ***** about the complexity or difficulty of words used in a text. | ||
| 2020.lrec-1.42 A part of the annotation results is now available on the web, which will facilitate research in ***** feedback ***** comment generation | ||
| P18-2052 Our results show that chunk-level ***** feedback ***** outperforms sentence based ***** feedback ***** by up to 2.61% BLEU absolute. | ||
| C18-1059 We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and provide an online evaluation method as ***** feedback *****. | ||
| L12-1587 In addition, the post-editing of automatic translations can help understand problems in such translations and this can be used as ***** feedback ***** for researchers and developers to improve MT systems | ||
| downstream NLP | 35 | |
| 2020.emnlp-main.660 We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting, and show its effectiveness as a unified solver for several ***** downstream NLP ***** tasks such as question answering and coreference resolution when the end-task annotations are limited. | ||
| D19-1059 We evaluate our approach on 11 ***** downstream NLP ***** tasks. | ||
| 2020.coling-main.272 Learning semantic correspondences between structured input data (e.g., slot-value pairs) and associated texts is a core problem for many ***** downstream NLP ***** applications, e.g., data-to-text generation. | ||
| D19-1587 Rhetorical Structure Theory (RST) parsing is crucial for many ***** downstream NLP ***** tasks that require a discourse structure for a text. | ||
| 2020.findings-emnlp.212 Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences, which can be used to improve performance of many ***** downstream NLP ***** tasks | ||
| Event | 35 | |
| 2021.emnlp-main.815 ***** Event ***** time is one of the most important features for event-event temporal relation extraction. | ||
| W17-2703 Recent methods for ***** Event ***** Detection focus on Deep Learning for automatic feature generation and feature ranking. | ||
| 2021.emnlp-main.637 ***** Event ***** detection has long been troubled by the trigger curse: overfitting the trigger will harm the generalization ability while underfitting it will hurt the detection performance. | ||
| L14-1513 ***** Event ***** coreference is an important task for full text analysis. | ||
| 2021.acl-long.357 ***** Event ***** forecasting is a challenging, yet important task, as humans seek to constantly plan for the future. | ||
| scientific literature | 35 | |
| 2020.acl-demos.41 We train a model to automatically extract supplement information and identify such interactions from the ***** scientific literature *****. | ||
| D19-5320 We assembled a knowledge graph by mining the available biomedical ***** scientific literature ***** and extracted a set of high frequency paths to use for validation. | ||
| 2020.lrec-1.178 Tackling these issues is challenging since labelled corpora involving multiple domains and compiled in more than one language are few in the ***** scientific literature *****. | ||
| W19-2602 In this paper, we propose a novel, scalable, semi-supervised method for extracting relevant structured information from the vast available raw ***** scientific literature *****. | ||
| Q18-1018 “Based on theoretical reasoning it has been suggested that the reliability of findings published in the ***** scientific literature ***** decreases with the popularity of a research field” (Pfeiffer and Hoffmann, 2009). | ||
| semeval | 35 | |
| 2020.***** semeval *****-1.30 It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and orthogonal transformation; and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus. | ||
| 2021.***** semeval *****-1.21 For subtask-III, we achieve accuracies of 65.64% and 64.27%. | ||
| 2021.***** semeval *****-1.7 The evaluation results for the third subtask confirmed the importance of both modalities, the text and the image. | ||
| 2020.***** semeval *****-1.107 This paper describes our contribution to SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. | ||
| 2020.***** semeval *****-1.159 To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for positive, negative and neutral category predictions. | ||
| vector representations | 35 | |
| C16-1289 Upon the generated source and target phrase structures, we stack a convolutional neural network to integrate ***** vector representations ***** of linguistic units on the structures into bilingual phrase embeddings. | ||
| I17-5002 While methods such as word2vec and GloVe are well-known, this tutorial focuses on multilingual and cross-lingual ***** vector representations *****, of words, but also of sentences and documents as well. | ||
| R19-1150 The current state of the art for First Story Detection (FSD) are nearest neighbour-based models with traditional term ***** vector representations *****; however, one challenge faced by FSD models is that the document representation is usually defined by the vocabulary and term frequency from a background corpus. | ||
| C18-1290 Standard word embedding algorithms learn ***** vector representations ***** from large corpora of text documents in an unsupervised fashion. | ||
| P19-1398 There exist few text-specific methods for unsupervised anomaly detection, and for those that do exist, none utilize pre-trained models for distributed ***** vector representations ***** of words. | ||
| syntactic parsing | 35 | |
| L04-1193 This paper presents EASY (Evaluation of Analyzers of SYntax), an ongoing evaluation campaign of ***** syntactic parsing ***** of French, a subproject of EVALDA in the French TECHNOLANGUE program. | ||
| D19-1160 It can serve as a first step towards architectures that can better leverage eye-tracking data or other complementary information available only for training sentences, possibly leading to improvements in ***** syntactic parsing *****. | ||
| 2020.emnlp-main.196 In this paper, we focus on sequence-to-sequence (seq2seq) AMR parsing and propose a seq2seq pre-training approach to build pre-trained models in both single and joint way on three relevant tasks, i.e., machine translation, ***** syntactic parsing *****, and AMR parsing itself. | ||
| 2021.acl-long.452 Several previous works on ***** syntactic parsing ***** propose to annotate shallow word-internal structures for better utilizing character-level information. | ||
| Q15-1026 The lattice parser predicts a dependency tree over a path in the lattice and thus solves the joint task of segmentation, morphological analysis, and ***** syntactic parsing *****. | ||
| conversion | 35 | |
| 2020.findings-emnlp.100 By collecting comparative adjectives from existing dictionaries and utilizing a semantic framework to catch comparative quantifiers, the semantics of clues concerning comparison structures are better understood, ensuring ***** conversion ***** to correct logic representation. | ||
| L10-1260 The paper also describes the ***** conversion ***** of ten treebanks into a common XML-based format used by the system, touching the question of standards and formats. | ||
| 2021.louhi-1.9 Since FuzzyBIO improves performance for some data sets and the ***** conversion ***** from BIOHD to FuzzyBIO is straightforward, we recommend investigating which is more effective for any data set containing discontinuous entities. | ||
| L08-1315 This paper describes a syllabification based ***** conversion ***** method for converting romanized Persian text to the traditional Arabic-based writing system. | ||
| L08-1209 The lexicon schemas are introduced and compared to each other in terms of ***** conversion ***** and usability for this particular user group, using a common lexicon entry and providing examples for each schema under consideration. | ||
| fact verification | 35 | |
| 2020.coling-main.165 However, the challenging problem of fake news detection has not benefited from the improvement of ***** fact verification ***** models, which is closely related to fake news detection. | ||
| D19-1258 The system is evaluated on both ***** fact verification ***** and open-domain multihop QA, achieving state-of-the-art results on the leaderboard test sets of both FEVER and HOTPOTQA. | ||
| 2021.acl-short.51 This work explores a framework for ***** fact verification ***** that leverages pretrained sequence-to-sequence transformer models for sentence selection and label prediction, two key sub-tasks in ***** fact verification *****. | ||
| 2020.findings-emnlp.216 We come up with SciKGAT to combine the advantages of open-domain literature search, state-of-the-art ***** fact verification ***** systems and in-domain medical knowledge through language modeling. | ||
| D19-1292 Automated ***** fact verification ***** has been progressing owing to advancements in modeling and availability of large datasets. | ||
| structures | 35 | |
| 2020.lrec-1.143 Our corpus can be used as a resource for analyzing persuasiveness and training an argument mining system to identify and extract argument ***** structures *****. | ||
| 2020.findings-emnlp.100 By collecting comparative adjectives from existing dictionaries and utilizing a semantic framework to catch comparative quantifiers, the semantics of clues concerning comparison ***** structures ***** are better understood, ensuring conversion to correct logic representation. | ||
| C16-1289 Upon the generated source and target phrase ***** structures *****, we stack a convolutional neural network to integrate vector representations of linguistic units on the ***** structures ***** into bilingual phrase embeddings. | ||
| W19-4819 Here we present a suite of experiments probing whether neural language models trained on linguistic data induce these stack-like data ***** structures ***** and deploy them while incrementally predicting words. | ||
| 2021.isa-1.3 Literary texts feature a rich variety in expressing quantification, including a broad range of lexemes to express quantifiers and complex sentence ***** structures ***** to express the restrictor and the nuclear scope of a quantification. | ||
| linked data | 35 | |
| 2020.ldl-1.5 In recent years, there has been increasing interest in publishing lexicographic and terminological resources as ***** linked data *****. | ||
| 2021.repl4nlp-1.25 While existing methods require entity-***** linked data ***** for pre-training, we train using a mention-span masking objective and a candidate ranking objective – which doesn't require any entity-links and only assumes access to an alias table for retrieving candidates, enabling large-scale pre-training. | ||
| L14-1126 assisted by machine translation and text analytics services, to explain how ***** linked data ***** can support such active curation. | ||
| L14-1668 To that end, in this paper we propose a model for representing translations as ***** linked data *****, as an extension of the lemon model. | ||
| L14-1232 Multilingual and cross-lingual information access can be facilitated by the availability of such lexica, e.g., allowing for an easy mapping of natural language expressions in different languages to ***** linked data ***** resources from LOD. | ||
| automatic translation | 35 | |
| 2020.peoples-1.15 In this paper, we present emotion lexicons of Croatian, Dutch and Slovene, based on manually corrected ***** automatic translation *****s of the English NRC Emotion lexicon. | ||
| L12-1665 That IWSLT 2011 evaluation focused on the ***** automatic translation ***** of public talks and included tracks for speech recognition, speech translation, text translation, and system combination. | ||
| 2011.iwslt-evaluation.1 This year, the IWSLT evaluation focused on the ***** automatic translation ***** of public talks and included tracks for speech recognition, speech translation, text translation, and system combination. | ||
| W18-5451 They have become the standard approach for ***** automatic translation ***** of text, at the cost of increased model complexity and uncertainty. | ||
| W18-2712 Neural machine translation (NMT) has significantly improved the quality of ***** automatic translation ***** models. | ||
| temporal information | 35 | |
| 2020.lrec-1.247 Most use cases that require medication information also generally require the associated ***** temporal information ***** (e.g. | ||
| W19-1907 This paper details the development of a linguistic resource designed to improve ***** temporal information ***** extraction systems and to integrate aspectual values. | ||
| D18-2013 It involves two basic tasks: (1) Understanding time expressions that are mentioned explicitly in text (e.g., February 27, 1998 or tomorrow), and (2) Understanding ***** temporal information ***** that is conveyed implicitly via relations. | ||
| 2020.acl-main.680 We empirically demonstrate the effect of temporal drift on performance, and how the ***** temporal information ***** of documents can be used to obtain better models compared to those that disregard ***** temporal information *****. | ||
| L16-1557 This paper describes two sets of crowdsourcing experiments on ***** temporal information ***** annotation conducted on two languages, i.e., English and Italian. | ||
| social network | 35 | |
| Q14-1024 Such evaluations can be analyzed separately using signed ***** social network *****s and textual sentiment analysis, but this misses the rich interactions between language and social context. | ||
| 2020.sustainlp-1.17 Thus, there is a significant opportunity to deploy NLP in myriad applications to help web users, ***** social network *****s, and businesses. | ||
| L16-1008 Our goal is to identify the sentiments of the users in the ***** social network ***** through their conversations. | ||
| W19-3502 Interactions among users on ***** social network ***** platforms are usually positive, constructive and insightful. | ||
| 2020.semeval-1.268 In recent years, with the development of *****social network***** services and video distribution services, there has been a sharp increase in offensive posts. | ||
| abstractive sentence summarization | 35 | |
| D19-1301 Experiments on benchmark datasets show that, the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the ***** abstractive sentence summarization ***** task. | ||
| P17-1101 We evaluate our model on the English Gigaword, DUC 2004 and MSR ***** abstractive sentence summarization ***** datasets. | ||
| 2021.newsum-1.3 In this paper, we study the ***** abstractive sentence summarization *****. | ||
| 2020.coling-main.497 In this paper, we propose a controllable *****abstractive sentence summarization***** model which generates summaries with guiding entities. | ||
| P19-1305 *****Abstractive Sentence Summarization***** (ASSUM) targets at grasping the core idea of the source sentence and presenting it as the summary. | ||
| shallow discourse | 35 | |
| K19-1072 This paper describes a novel approach for the task of end-to-end argument labeling in ***** shallow discourse ***** parsing. | ||
| 2021.codi-main.12 This paper demonstrates discopy, a novel framework that makes it easy to design components for end-to-end ***** shallow discourse ***** parsing. | ||
| E17-4004 Sense classification of discourse relations is a sub-task of ***** shallow discourse ***** parsing. | ||
| 2020.lrec-1.139 This paper describes a novel application of semi-supervision for ***** shallow discourse ***** parsing. | ||
| 2020.lrec-1.131 *****Shallow Discourse***** Parsing (SDP), the identification of coherence relations between text spans, relies on large amounts of training data, which so far exists only for English - any other language is in this respect an under-resourced one. | ||
| named entity recognition (NER) | 35 | |
| L10-1284 In this paper, we present Second HAREM, the second edition of an evaluation campaign for Portuguese, addressing *****named entity recognition (NER*****). | ||
| 2021.eacl-demos.7 Language model (LM) pretraining has led to consistent improvements in many NLP downstream tasks, including *****named entity recognition (NER*****). | ||
| 2021.acl-demo.12 We present fastHan, an open-source toolkit for four basic tasks in Chinese natural language processing: Chinese word segmentation (CWS), Part-of-Speech (POS) tagging, *****named entity recognition (NER*****), and dependency parsing. | ||
| 2020.conll-1.35 This paper tackles the task of *****named entity recognition (NER*****) applied to digitized historical texts obtained from processing digital images of newspapers using optical character recognition (OCR) techniques. | ||
| R17-1101 We propose a neural reranking system for *****named entity recognition (NER*****), which leverages recurrent neural network models to learn sentence-level patterns that involve named entity mentions. | ||
| pre-trained | 35 | |
| 2021.acl-long.420 We study the problem of leveraging the syntactic structure of text to enhance *****pre-trained***** models such as BERT and RoBERTa. | ||
| 2021.emnlp-main.96 Recent development in NLP shows a strong trend towards refining *****pre-trained***** models with a domain-specific dataset. | ||
| 2020.sdp-1.13 We introduce SciWING, an open-source software toolkit which provides access to state-of-the-art *****pre-trained***** models for scientific document processing (SDP) tasks, such as citation string parsing, logical structure recovery and citation intent classification. | ||
| 2021.acl-long.49 Recent studies on neural networks with *****pre-trained***** weights (i.e., BERT) have mainly focused on a low-dimensional subspace, where the embedding vectors computed from input words (or their contexts) are located. | ||
| 2021.emnlp-main.503 Although recent developments in neural architectures and *****pre-trained***** representations have greatly increased state-of-the-art model performance on fully-supervised semantic role labeling (SRL), the task remains challenging for languages where supervised SRL training data are not abundant. | ||
| German | 35 | |
| L10-1347 We present a flexible toolkit-based approach to automatic coreference resolution on *****German***** text. | ||
| L16-1531 This paper presents a *****German***** corpus for Named Entity Linking (NEL) and Knowledge Base Population (KBP) tasks. | ||
| N18-2024 We present a computational model to detect and distinguish analogies in meaning shifts between *****German***** base and complex verbs. | ||
| 2020.lrec-1.566 We present fine-grained NER annotations with 30 labels and apply them to *****German***** data. | ||
| L08-1449 In this paper we discuss an approach to the semi-automatic extraction and classification of the compounds extracted from *****German***** corpora. | ||
| variability | 34 | |
| L08-1101 First the features properties in terms of language and session ***** variability ***** are studied, predicting an increase in the language robustness when frame-wise intonation and energy values are combined with traditional MFCC features. | ||
| 2021.emnlp-main.525 We further provide a comparative study of stochastic and deterministic methods for rationale extraction for classification and natural language inference tasks, jointly assessing their predictive power, quality of the explanations, and model ***** variability *****. | ||
| 2020.coling-main.105 We 1) employ the language distances to infer and evaluate language trees, finding that they are close to the reference family tree in terms of quartet tree distance, 2) perform distance matrix regression analysis, finding that the language distances can be best explained by phylogenetic and worst by structural factors and 3) present a novel measure for measuring diachronic meaning stability (based on cross-lingual representation ***** variability *****) which correlates significantly with published ranked lists based on linguistic approaches. | ||
| 2020.lrec-1.688 We further go into some points related to confirmation of research findings through reproduction, including the choice of the dataset, reporting and accounting for ***** variability *****, use of appropriate evaluation metrics, and making code and data available. | ||
| 2020.privatenlp-1.2 While this allows the perturbation to admit the required metric differential privacy, often the utility of downstream tasks modeled on this perturbed data is low because the spherical noise does not account for the ***** variability ***** in the density around different words in the embedding space | ||
| evaluated | 34 | |
| 2021.ranlp-1.29 It leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements ***** evaluated ***** only for English. | ||
| L14-1707 The system is trained and its performance ***** evaluated ***** on a 57k-token corpus, including different varieties of French spoken in three countries (Belgium, France and Switzerland). | ||
| 2021.naacl-main.252 Machine translation (MT) is currently ***** evaluated ***** in one of two ways: in a monolingual fashion, by comparison with the system output to one or more human reference translations, or in a trained crosslingual fashion, by building a supervised model to predict quality scores from human-labeled data. | ||
| 2021.gem-1.1 Sentence-level text simplification is currently ***** evaluated ***** using both automated metrics and human evaluation. | ||
| 2020.eamt-1.14 We also note that two of the systems ***** evaluated ***** do not produce any error for a category that was relevant for this translation direction prior to the advent of NMT systems: Chinese classifiers | ||
| meanings | 34 | |
| 2021.emnlp-main.199 The proposed paradigm offers merits over existing paraphrase generation methods: (1) using the context regularizer on ***** meanings *****, the model is able to generate massive amounts of high-quality paraphrase pairs; (2) the combination of the huge amount of paraphrase candidates and further diversity-promoting filtering yields paraphrases with more lexical and syntactic diversity; and (3) using human-interpretable scoring functions to select paraphrase pairs from candidates, the proposed framework provides a channel for developers to intervene with the data generation process, leading to a more controllable model. | ||
| 2018.gwc-1.12 A large enough corpus, such as the Corpus Of Contemporary American English, provides the data needed to enumerate all common uses or ***** meanings *****. | ||
| D19-1469 Our dataset offers a new resource for the study of the rich ***** meanings ***** that result from pairing text and image. | ||
| 2020.emnlp-main.195 It is agnostic about how to derive ***** meanings ***** from strings and for this reason it lends itself well to the encoding of semantics across languages. | ||
| C18-1134 We rank ***** meanings ***** by stability, infer phylogenetic trees using first the most stable meaning, then the two most stable ***** meanings *****, and so on, computing the quartet distance of the resulting tree to the tree proposed by language family experts at each step of datasize increase | ||
| abbreviations | 34 | |
| L10-1510 For our second approach, an SVM-based classifier was used on the preprocessed data sets, leading to an average F-score of 0.93 for the ***** abbreviations *****; for the definitions an average F-score of 0.82 was obtained. | ||
| L12-1333 In order to achieve compatibility with annotation rules designed for standard written Portuguese, transcribed words were orthographically normalized, and the parsing lexicon augmented with speech-specific material, phonetically spelled ***** abbreviations ***** etc. | ||
| L08-1365 This kind of clinical text presents many language challenges such as fragmented sentences and heavy use of ***** abbreviations ***** and acronyms. | ||
| W16-3915 Our third contribution involves segmentation of hashtags and a semantic enrichment using a combination of relations from WordNet, which helps the performance of our classification system, including disambiguation of named entities, ***** abbreviations ***** and acronyms. | ||
| S19-2063 We analyze the syntax, ***** abbreviations *****, and informal-writing of Twitter; and perform perfect data preprocessing on the data to convert them to normative text | ||
| ILP | 34 | |
| C16-1161 To incorporate the valid time of facts, we propose a joint time-aware inference model based on Integer Linear Programming (***** ILP *****) using temporal consistency information as constraints. | ||
| D19-1642 The proposed system is trained on the state-of-the-art MATRES dataset and applies contextualized word embeddings, a Siamese encoder of a temporal common sense knowledge base, and global inference via integer linear programming (***** ILP *****). | ||
| P18-1212 Specifically, we formulate the joint problem as an integer linear programming (***** ILP *****) problem, enforcing constraints that are inherent in the nature of time and causality. | ||
| C16-1226 To address this, we map relational phrases to KB predicates and textual relations simultaneously, and further develop an integer linear program (***** ILP *****) model to infer on these candidates and provide a globally optimal solution. | ||
| L16-1695 Joint inference approaches such as Integer Linear Programming (***** ILP *****) and Markov Logic Networks (MLNs) have recently been successfully applied to many natural language processing (NLP) tasks, often outperforming their pipeline counterparts | ||
| exploiting | 34 | |
| 2020.semeval-1.202 We then fine-tuned the model on the provided training data and, in some configurations, implement transfer learning approach ***** exploiting ***** the typological relatedness between English and Danish. | ||
| 2021.hcinlp-1.4 We also enable the user to gather insights into the causative factors that drive the model's behavior, ***** exploiting ***** the self-attention mechanism. | ||
| E17-2083 Results on WordNet link prediction show that leveraging cross-lingual information yields significant gains over ***** exploiting ***** only monolingual triples. | ||
| W18-3930 Existing dialect identification models ***** exploiting ***** the dataset pre-date the recent boost deep learning brought to NLP, and hence the data are not benchmarked for use with deep learning, nor is it clear how much neural networks can help tease the categories in the data apart. | ||
| 2020.wmt-1.58 We observe that contrarily to expectations, ***** exploiting ***** context degrades the results (and on analysis the data is not highly contextual) | ||
| transduction | 34 | |
| P19-1505 Using a neural ***** transduction ***** model, we estimate this quantity for the forms in 28 languages. | ||
| E17-3021 Autobank thus enables deep treebank conversions (and subsequent modifications) without the need for complex ***** transduction ***** algorithms accompanied by cascades of ad hoc rules; instead, the locus of human effort falls directly on the task of grammar construction itself. | ||
| C18-1115 We show that the general problem of string ***** transduction ***** can be reduced to the problem of sequence labeling. | ||
| N19-1333 However, unlike sequence ***** transduction ***** problems such as machine translation, GEC suffers from the lack of plentiful parallel data. | ||
| P18-1179 By augmenting a DAG automaton with ***** transduction ***** rules, a DAG transducer has potential applications in fundamental NLP tasks | ||
| conceptually | 34 | |
| N19-4010 The core idea of the framework is to present a simple, unified interface for ***** conceptually ***** very different types of word and document embeddings. | ||
| 2020.coling-main.509 Questions under Discussion (QUD; Roberts, 2012) are emerging as a ***** conceptually ***** fruitful approach to spelling out the connection between the information structure of a sentence and the nature of the discourse in which the sentence can function. | ||
| P19-1383 We report on the stability in performance of 11 ***** conceptually ***** diverse algorithms on a selection of 8 typologically distinct languages. | ||
| W17-0808 In this study, we seek to ***** conceptually ***** align three representations for common types of morpho-syntactic analysis, pinpoint what in our view constitute contentful differences, and reflect on the underlying principles and specific requirements that led to individual choices. | ||
| 2021.acl-long.101 Despite being ***** conceptually ***** attractive, it often suffers from low output quality | ||
| Combinatory Categorial | 34 | |
| 2014.lilt-9.3 In this article, with the help of an RTE system based on ***** Combinatory Categorial ***** Grammar, Discourse Representation Theory, and first-order theorem proving, we make an empirical assessment of the logic-based approach. | ||
| 2020.udw-1.10 We revisit the problem of extracting dependency structures from the derivation structures of ***** Combinatory Categorial ***** Grammar (CCG). | ||
| W19-1103 We use ***** Combinatory Categorial ***** Grammar (CCG) as a syntactic component of DTS and implement our compositional semantics for interrogative sentences using ccg2lambda, a semantic parsing platform based on CCG. | ||
| 1995.iwpt-1.7 Vijay-Shanker and Weir have shown in [17] that Tree Adjoining Grammars and *****Combinatory Categorial***** Grammars can be transformed into equivalent Linear Indexed Grammars (LIGs) which can be recognized in O(n^6) time using a Cocke-Kasami-Younger style algorithm. | ||
| 1997.iwpt-1.17 A type of 'non-traditional constituents' motivates an extended class of *****Combinatory Categorial***** Grammars, CCGs with Generalized Type-Raised Categories (CCG-GTRC) involving variables. | ||
| prefix | 34 | |
| 2021.acl-long.353 Prefix-tuning draws inspiration from prompting for language models, allowing subsequent tokens to attend to this ***** prefix ***** as if it were “virtual tokens”. | ||
| W17-5702 We show that ***** prefix ***** constraints are more flexible than side constraints and can be used to control the behavior of neural machine translation, in terms of output length, bidirectional decoding, domain adaptation, and unaligned target word generation. | ||
| 1991.iwpt-1.4 Earley's parser for CFGs (Earley, 1968; Earley, 1970) maintains the valid ***** prefix ***** property and obtains an O(n^3)-time worst case complexity, as good as parsers that do not maintain it, such as the CKY parser (Younger, 1967; Kasami, 1965). | ||
| 2021.sigtyp-1.8 In this paper we explore two machine-driven approaches for ***** prefix ***** and suffix statistics which are crude approximations, but have advantages in terms of time and replicability. | ||
| 2020.iwpt-1.6 Syntactic surprisal has been shown to have an effect on human sentence processing, and can be predicted from *****prefix***** probabilities of generative incremental parsers. | ||
| Galician | 34 | |
| L14-1579 CORILGA is a large high-quality corpus of spoken ***** Galician ***** from the 1960s up to present-day, including both formal and informal spoken language from both standard and non-standard varieties, and across different generations and social levels. | ||
| L10-1271 This evaluation, designed according to the criteria and methodology applied in the NIST Language Recognition Evaluations, involved four target languages: Basque, Catalan, ***** Galician ***** and Spanish (official languages in Spain), and included speech signals in other (unknown) languages to allow open-set verification trials. | ||
| L12-1264 The database features 6 target languages: Basque, Catalan, English, ***** Galician *****, Portuguese and Spanish, and includes segments in other (Out-Of-Set) languages, which allow to perform open-set verification tests. | ||
| L16-1469 We introduce TweetMT, a parallel corpus of tweets in four language pairs that combine five languages (Spanish from/to Basque, Catalan, ***** Galician ***** and Portuguese), all of which have an official status in the Iberian Peninsula. | ||
| L14-1576 The first one involved six target languages (Basque, Catalan, English, ***** Galician *****, Portuguese and Spanish) for which there was plenty of training data, whereas the second one involved four target languages (French, German, Greek and Italian) for which no training data was provided | ||
| algorithmic | 34 | |
| 2000.amta-papers.8 It offers ***** algorithmic ***** solutions and an implementation framework for local discourse processing in machine translation. | ||
| 2020.knlp-1.2 The Differentiable Neural Computer (DNC), a neural network model with an addressable external memory, can solve ***** algorithmic ***** and question answering tasks. | ||
| 2021.hackashop-1.12 I propose ***** algorithmic ***** and interface designs when personalizing the presentation of comments based on different objectives including relevance, diversity, and education/background information. | ||
| 2016.amta-researchers.3 We provide a formal ***** algorithmic ***** description of a method that is capable of using any SBI to generate all possible fuzzy-match repairs and perform an oracle evaluation on three different language pairs to ascertain the potential of the method to improve translation productivity. | ||
| N18-3011 We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate ***** algorithmic ***** manipulation and discovery | ||
| unigram | 34 | |
| L10-1579 In our experiments we integrate eight different measures of lexical semantic similarity into an evaluation metric based on standard measures of ***** unigram ***** precision, recall and F-measure. | ||
| N19-1281 The model uses only ***** unigram ***** character embeddings, encodes them using either stacked bi-LSTM or a self-attention network, and independently infers both segmentation and part of speech information. | ||
| W16-4702 We show that the information from a small number of observed instances can be combined with local and global word embeddings to remarkably improve the term extraction results on ***** unigram ***** terms. | ||
| 2004.amta-papers.16 Recent research has shown that a balanced harmonic mean (F1 measure) of ***** unigram ***** precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. | ||
| W18-5519 It consists of three components: 1) Wikipedia Page Retrieval: First we extract the entities in the claim, then we find potential Wikipedia URI candidates for each of the entities using a SPARQL query over DBpedia 2) Sentence selection: We investigate various techniques i.e. Smooth Inverse Frequency (SIF), Word Mover's Distance (WMD), Soft-Cosine Similarity, Cosine similarity with ***** unigram ***** Term Frequency Inverse Document Frequency (TF-IDF) to rank sentences by their similarity to the claim | ||
| Sketch | 34 | |
| 2020.lrec-1.729 This paper presents the development and evaluation of the French EcoLexicon Semantic ***** Sketch ***** Grammar (ESSG-fr), a French hyponymic sketch grammar for ***** Sketch ***** Engine based on knowledge patterns. | ||
| L16-1445 They were processed by state-of-the-art tools and made available for researchers in the corpus manager ***** Sketch ***** Engine. | ||
| L16-1061 The paper describes automatic definition finding implemented within the leading corpus query and management tool, *****Sketch***** Engine. | ||
| 2020.codi-1.5 *****Sketch***** comedy and crosstalk are two popular types of comedy. | ||
| 2020.wac-1.1 In this paper we discuss some of the current challenges in web corpus building that we faced in the recent years when expanding the corpora in *****Sketch***** Engine. | ||
| phonemic | 34 | |
| 2020.lrec-1.656 While ***** phonemic ***** representations are language specific, phonetic representations (stated in terms of (allo)phones) are much closer to a universal (language-independent) transcription. | ||
| 2020.sltu-1.23 The result obtained by cross-lingual recognition was compared with other baseline systems, and it has been found that the performance of the recognition system is based on ***** phonemic ***** units. | ||
| 2021.sigmorphon-1.1 In this work, as a first step towards implementing this framework, we focus on detecting ***** phonemic ***** sources of confusion. | ||
| 2021.naacl-main.149 We propose a Transformer-based sequence-to-sequence model for automatic speech recognition (ASR) capable of simultaneously transcribing and annotating audio with linguistic information such as ***** phonemic ***** transcripts or part-of-speech (POS) tags. | ||
| N19-1007 Together, these findings suggest that many reliable cues to ***** phonemic ***** structure are immediately available to infants from bottom-up perceptual characteristics alone, but that these cues must eventually be supplemented by top-down lexical and phonotactic information to achieve adult-like phone discrimination | ||
| synchronous | 34 | |
| L14-1179 The resulting database is ***** synchronous ***** between modalities (audio and 3D facial motion capture data). | ||
| P18-1038 We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating ***** synchronous ***** production rules to the syntacto-semantic composition process. | ||
| W19-3109 We prove that ***** synchronous ***** multiple context-free grammars are strictly more powerful than this combination of regular transductions and multiple context-free grammars. | ||
| 2001.mtsummit-road.9 Test data embodying geographic and sociolinguistic differences were obtained from a ***** synchronous ***** Chinese corpus of news media texts | ||
| 2006.amta-papers.5 As an approach to syntax-based statistical machine translation (SMT), Probabilistic Synchronous Dependency Insertion Grammars (PSDIG), introduced in (Ding and Palmer, 2005), are a version of *****synchronous***** grammars defined on dependency trees. | ||
| KIT | 34 | |
| L14-1277 In order to support this, we have annotated speech disfluencies in German lectures at ***** KIT *****. | ||
| 2011.iwslt-evaluation.9 This paper presents the ***** KIT ***** system participating in the English→French TALK Translation tasks in the framework of the IWSLT 2011 machine translation evaluation. | ||
| 2014.iwslt-evaluation.17 In this paper, we present the ***** KIT ***** systems participating in the TED translation tasks of the IWSLT 2014 machine translation evaluation. | ||
| 2016.iwslt-1.16 In this paper, we present the *****KIT***** systems of the IWSLT 2016 machine translation evaluation. | ||
| 2012.iwslt-evaluation.3 In this paper, we present the *****KIT***** systems participating in the English-French TED Translation tasks in the framework of the IWSLT 2012 machine translation evaluation. | ||
| Multiword | 34 | |
| 2021.semeval-1.10 For each ***** Multiword ***** Target, a set of individual word features is taken along with single word complexities in the feature space. | ||
| L14-1433 ***** Multiword ***** expressions (MWEs) are quite frequent in languages such as English, but their diversity, the scarcity of individual MWE types, and contextual ambiguity have presented obstacles to corpus-based studies and NLP systems addressing them as a class. | ||
| C16-1042 ***** Multiword ***** expressions (MWEs) are pervasive in natural languages and often have both idiomatic and compositional readings, which leads to high syntactic ambiguity | ||
| C18-1219 *****Multiword***** expressions, especially verbal ones (VMWEs), show idiosyncratic variability, which is challenging for NLP applications, hence the need for VMWE identification. | ||
| L12-1319 The Berkeley FrameNet Project (BFN, https://framenet.icsi.berkeley.edu/fndrupal/) created descriptions of 73 non-core grammatical constructions, annotation of 50 of these constructions and about 1500 example sentences in its one-year project Beyond the Core: A Pilot Project on Cataloging Grammatical Constructions and *****Multiword***** Expressions in English supported by the National Science Foundation. | ||
| Keyphrase | 34 | |
| P17-2054 ***** Keyphrase ***** boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to predefined types. | ||
| 2020.coling-main.56 ***** Keyphrase ***** extraction is the task of extracting a small set of phrases that best describe a document. | ||
| 2021.emnlp-main.215 ***** Keyphrase ***** extraction is a fundamental task in Natural Language Processing, which usually contains two main parts: candidate keyphrase extraction and keyphrase importance estimation. | ||
| C16-1277 *****Keyphrase***** annotation is the task of identifying textual units that represent the main content of a document. | ||
| N18-2100 *****Keyphrase***** extraction is a fundamental task in natural language processing that facilitates mapping of documents to a set of representative phrases. | ||
| multilingual corpora | 34 | |
| 2020.lrec-1.347 In Future Work, we plan to use these grammars to bootstrap the generation of other linguistic resources such as ***** multilingual corpora ***** that make use of data-driven approaches to natural language processing feasible. | ||
| 2020.acl-demos.14 We have trained Stanza on a total of 112 datasets, including the Universal Dependencies treebanks and other ***** multilingual corpora *****, and show that the same neural architecture generalizes well and achieves competitive performance on all languages tested. | ||
| W18-2711 In experiments with real incomplete ***** multilingual corpora ***** of TED Talks, the multi-source NMT with the NULL tokens achieved higher translation accuracies measured by BLEU than those by any one-to-one NMT systems. | ||
| 2020.lrec-1.12 We explore the possibility to exploit parallel ***** multilingual corpora ***** as a means of cheap supervision for the classification of three different readings of the English pronoun `it': entity, event or pleonastic, from their translation in several languages. | ||
| L12-1050 This paper questions the use of these ***** multilingual corpora ***** in translation studies and shows the methodological steps needed in order to obtain more reliably comparable sub-corpora that consist of original and directly translated text only | ||
| semantically annotated corpus | 34 | |
| L12-1076 The work presented in this paper aims at filling this gap by presenting a syntactically and ***** semantically annotated corpus *****. | ||
| 2020.lrec-1.724 French, as many languages, lacks ***** semantically annotated corpus ***** data. | ||
| L06-1081 To train the system, we used a ***** semantically annotated corpus ***** that was produced by projection across parallel corpora. | ||
| P17-1189 Previous studies on Chinese semantic role labeling (SRL) have concentrated on a single ***** semantically annotated corpus *****. | ||
| W19-3302 Meaning banking—creating a ***** semantically annotated corpus ***** for the purpose of semantic parsing or generation—is a challenging task. | ||
| bidirectional encoder | 34 | |
| 2020.smm4h-1.26 The system we propose for these tasks is based on ***** bidirectional encoder ***** representations from transformers (BERT) incorporating with knowledge graph and retrieving evidence from online information. | ||
| 2020.semeval-1.84 We use last-5 ***** bidirectional encoder ***** representation from transformer (BERT) and term frequency–inverse document frequency (TF-IDF) vectorizer for counterfactual detection. | ||
| R19-1066 We present a preliminary approach to the classification of labelled data using logistic regression, bidirectional long short-term memory recurrent neural networks (BiLSTM) and ***** bidirectional encoder ***** representations from transformers (BERT). | ||
| D19-1586 We here show that this shortcoming can be effectively addressed by using the ***** bidirectional encoder ***** representation from transformers (BERT) proposed by Devlin et al. | ||
| 2021.ranlp-1.77 In this paper, we attempt to apply the mix-up method to a document classification task using ***** bidirectional encoder ***** representations from transformers (BERT) (Devlin et al., 2018) | ||
| compositional distributional semantics | 34 | |
| W16-4904 Experiments demonstrate that the proposed technique often outperforms other ***** compositional distributional semantics ***** approaches as well as vector space methods such as latent semantic analysis. | ||
| 2021.semspace-1.6 We propose a framework to model an operational conversational negation by applying worldly context (prior knowledge) to logical negation in ***** compositional distributional semantics *****. | ||
| P17-1073 The dataset may be used for the evaluation of ***** compositional distributional semantics ***** models of Polish. | ||
| 2020.pam-1.12 We import vector representations of words and predicates, learnt from large scale ***** compositional distributional semantics *****, interpret them as fuzzy sets, and analyse their performance on a toy inference dataset. | ||
| W19-5106 This article describes a dependency-based strategy that uses ***** compositional distributional semantics ***** and cross-lingual word embeddings to translate multiword expressions (MWEs). | ||
| aspect extraction | 34 | |
| P19-2002 We propose extensions for modern models in three downstream tasks, i.e. text classification, named entity recognition and ***** aspect extraction *****, which shows improvement in noise robustness over existing solutions. | ||
| 2021.emnlp-main.20 This, for example, happens in the task of ***** aspect extraction *****, where the aspects of interest of reviews of, e.g., restaurants or electronic devices may be very different. | ||
| 2020.lrec-1.612 The task of ***** aspect extraction ***** is an important component of aspect-based sentiment analysis. | ||
| N19-1242 To show the generality of the approach, the proposed post-training is also applied to some other review-based tasks such as ***** aspect extraction ***** and aspect sentiment classification in aspect-based sentiment analysis. | ||
| W19-3605 It is based on ***** aspect extraction ***** with neural networks and combines the advantages of deep learning and topic modeling | ||
| denoising autoencoder | 34 | |
| C16-1152 We compare zero-shot learning, bilingual word embeddings, stacked ***** denoising autoencoder ***** representations and machine translation techniques for aspect-based CLSC. | ||
| D19-1262 With a baseline model using sequence-to-sequence architecture integrated by ***** denoising autoencoder *****, we confirm the validity of our task. | ||
| 2021.vardial-1.6 A transformer, initialized with cross-lingual language model weights, is fine-tuned exclusively on monolingual data of the target language by jointly learning on a paraphrasing and ***** denoising autoencoder ***** objective. | ||
| 2020.acl-main.703 We present BART, a ***** denoising autoencoder ***** for pretraining sequence-to-sequence models. | ||
| 2021.acl-long.555 Our model is trained as a ***** denoising autoencoder *****: we take temporally-ordered event sequences, shuffle them, delete some events, and then attempt to recover the original event sequence | ||
| sentence representations | 34 | |
| S19-2048 Our model extends the Recurrent Convolutional Neural Network (RCNN) by using external fine-tuned word representations and DeepMoji ***** sentence representations *****. | ||
| 2020.emnlp-main.225 We find that (i) sentence positional encoding can lead to a large improvement for identifying discourse elements; (ii) a structural relative positional encoding of sentences shows to be most effective; (iii) inter-sentence attention vectors are useful as a kind of ***** sentence representations ***** for identifying discourse elements. | ||
| 2021.emnlp-main.312 Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two ***** sentence representations *****. | ||
| 2021.ranlp-1.129 Document alignment techniques based on multilingual ***** sentence representations ***** have recently shown state of the art results. | ||
| 2021.repl4nlp-1.2 Our results also indicate that model distillation may hurt the ability of cross-lingual transfer of ***** sentence representations *****, while language dissimilarity at most has a modest effect. | ||
| biomedical literature | 34 | |
| 2021.naacl-main.423 Recent efforts of generating adversaries using rule-based synonyms and BERT-MLMs have been witnessed in general domain, but the ever-increasing ***** biomedical literature ***** poses unique challenges. | ||
| 2020.coling-main.59 We also leverage the new BREATHE dataset which is one of the largest available datasets of biomedical research literature, containing abstracts and full-text articles from ten different ***** biomedical literature ***** sources on which we pre-train our BioMedBERT model. | ||
| D17-1313 In this work, we introduce a focused reading approach to guide the machine reading of ***** biomedical literature ***** towards what literature should be read to answer a biomedical query as efficiently as possible. | ||
| 2020.sdp-1.6 We study whether novel ideas in ***** biomedical literature ***** appear first in preprints or traditional journals. | ||
| P19-2058 In this work, we focus on extraction information of adverse drug reactions from various sources of biomedical textbased information, including ***** biomedical literature ***** and social media. | ||
| error detection | 34 | |
| L12-1160 We present a complex, open source tool for detailed machine translation error analysis providing the user with automatic ***** error detection ***** and classification, several monolingual alignment algorithms as well as with training and test corpus browsing. | ||
| I17-4001 We expected this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese ***** error detection *****. | ||
| D19-1087 We build a supervised alignment model for translation ***** error detection ***** (AlignDet) based on a simple Alignment Triangle strategy to set the benchmark for the automatic ***** error detection ***** task. | ||
| P17-1194 The architecture was evaluated on a range of datasets, covering the tasks of ***** error detection ***** in learner texts, named entity recognition, chunking and POS-tagging. | ||
| 2021.bea-1.15 We approach the problem with the methods for grammatical ***** error detection ***** (GED), since we hypothesize that models for detecting grammatical mistakes can assess the correctness of potential alternative answers in a learning setting. | ||
| bilingual dictionaries | 34 | |
| L10-1161 MuLeXFoR entries contain, among other things, detailed descriptions of morphological constraints and productivity notes, which are sorely lacking in currently available tools such as ***** bilingual dictionaries *****. | ||
| L14-1109 In addition to the usual information on part-of-speech, gender, and number for nouns, offered by most dictionaries currently available, OpenLogos ***** bilingual dictionaries ***** have some distinctive features that make them unique: they contain cross-language morphological information (inflectional and derivational), semantico-syntactic knowledge, indication of the head word in multiword units, information about whether a source word corresponds to an homograph, information about verb auxiliaries, alternate words (i.e., predicate or process nouns), causatives, reflexivity, verb aspect, among others. | ||
| J17-2001 We analyze the effects of frequency and burstiness, and the sizes of the seed ***** bilingual dictionaries ***** and the monolingual training corpora. | ||
| D19-1363 Unlike previous PA extensions that require a k-way dictionary, this approach requires only pairwise ***** bilingual dictionaries ***** that are much easier to construct. | ||
| L12-1006 Pivot-based bilingual dictionary building is based on merging two ***** bilingual dictionaries ***** which share a common language (e.g. | ||
| neural network based | 34 | |
| 2020.acl-main.651 The framework utilises a ***** neural network based ***** architecture for classifying clarification questions. | ||
| 2020.codi-1.8 We propose a ***** neural network based ***** approach to learn the match between pairs of discourse tree structures. | ||
| 2021.emnlp-main.253 Therefore, in this paper, we propose a novel ***** neural network based ***** approach for multi-label document classification, in which two heterogeneous graphs are constructed and learned using heterogeneous graph transformers. | ||
| 2020.sdp-1.40 Our approach leverages state-of-the-art pre-trained deep ***** neural network based ***** models as zero-shot learners to achieve high scores on the task. | ||
| W16-4502 A 12-gram statistical language model was selected as a baseline to oppose three ***** neural network based ***** models of different characteristics. | ||
| trees | 34 | |
| P17-1105 The outputs are represented as abstract syntax ***** trees ***** (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. | ||
| 1998.amta-papers.25 All parse ***** trees ***** are converted to this format prior to semantic interpretation. | ||
| 2020.acl-main.591 In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse ***** trees ***** in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. | ||
| 2020.smm4h-1.23 We fine-tune ELECTRA transformers using our trained SVM filter for data augmentation, along with decision ***** trees ***** to detect medication mentions in tweets. | ||
| 2020.coling-main.227 Unsupervised dependency parsing aims to learn a dependency parser from sentences that have no annotation of their correct parse ***** trees *****. | ||
| search engine | 34 | |
| 2020.acl-demos.5 We present a large improvement over classic ***** search engine ***** baseline on several standard QA datasets and provide the community a collaborative data collection tool to curate the first natural language processing research QA dataset via a community effort. | ||
| L12-1475 The attrition of documents online, also called link rot or document half-life, has been studied many times for the purposes of optimising ***** search engine ***** web crawlers, producing robust and reliable archival systems, and ensuring the integrity of distributed information stores, however, the effect that attrition has upon corpora of varying construction remains largely unknown. | ||
| L12-1493 This paper presents a web-based multimedia ***** search engine ***** built within the Buceador (www.buceador.org) research project. | ||
| 2021.sigdial-1.39 We evaluate the approach with a state of the art ***** search engine ***** and a recently introduced dialogue model in an extensive user study with respect to the dialogue coherence. | ||
| 2020.argmining-1.1 Finally, we present a ***** search engine ***** for this dataset which is utilized extensively by members of the National Speech and Debate Association today. | ||
| bilingual dictionary | 34 | |
| 2003.mtsummit-papers.3 Based on these assumptions, new valency entries are constructed from words in a plain ***** bilingual dictionary *****, using entries with similar source-language meaning and the same target-language translations. | ||
| 2021.rocling-1.44 In our approach, the keyword list is converted into unigrams of all possible Mandarin translations, intended or not. The method involves converting words in the keyword list into all translations using a ***** bilingual dictionary *****, computing the unigram word counts of translations, and computing character counts from the word counts. | ||
| D18-1038 We specifically discuss two types of common parallel resources: bilingual corpus and ***** bilingual dictionary *****, and design different transfer learning strategies accordingly. | ||
| D19-5210 Phrase based statistical machine translation (PBSMT) system is built by using other resources: Name Entity Recognition (NER) corpus and ***** bilingual dictionary ***** which is created by Google Translate (GT). | ||
| L10-1155 The paper presents an approach for constructing a weighted ***** bilingual dictionary ***** of inflectional forms using as input data a traditional ***** bilingual dictionary *****, and not parallel corpora. | ||
| entity set expansion | 34 | |
| 2020.acl-main.725 A key challenge for ***** entity set expansion ***** is to avoid selecting ambiguous context features which will shift the class semantics and lead to accumulative errors in later iterations. | ||
| 2020.findings-emnlp.331 Bootstrapping for ***** entity set expansion ***** (ESE) has been studied for a long period, which expands new entities using only a few seed entities as supervision. | ||
| 2021.emnlp-main.762 Bootstrapping has become the mainstream method for ***** entity set expansion *****. | ||
| 2020.emnlp-main.666 Extensive experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both ***** entity set expansion ***** and synonym discovery tasks. | ||
| D19-1028 Bootstrapping for *****Entity Set Expansion***** (ESE) aims at iteratively acquiring new instances of a specific target category. | ||
| large text | 34 | |
| W18-0540 We developed solutions following three approaches: (i) a feature engineering method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a ***** large text ***** corpus to produce a contextualized word vector. | ||
| C16-1149 In this work, we propose an unsupervised method of learning word-emotion association from ***** large text ***** corpora, called Selective Co-occurrences (SECO), by leveraging the property of mutual exclusivity generally exhibited by emotions. | ||
| L08-1444 Since ***** large text ***** corpora nowadays are easily available and inflectional systems are in general well understood, it seems feasible to acquire lexical data from raw texts, guided by our knowledge of inflection. | ||
| L10-1363 We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a ***** large text ***** corpus, as well as a number of Patterns extracted automatically from the same corpus. | ||
| D18-1455 In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is available with a ***** large text ***** corpus. | ||
| clinical tempeval | 34 | |
| S17-2179 Task 12: *****Clinical TempEval***** challenge, specifically in the event and time expressions span and attribute identification subtasks (ES, EA, TS, TA). | ||
| S17-2176 This paper describes the system developed for the task of temporal information extraction from clinical narratives in the context of the 2017 *****Clinical TempEval***** challenge. | ||
| W18-5607 The complexity of temporal representation in language is evident as results of the 2016 *****Clinical TempEval***** challenge indicate: the current state-of-the-art systems perform well in solving mention-identification tasks of event and time expressions but poorly in temporal relation extraction, showing a gap of around 0.25 point below human performance. | ||
| S17-2180 *****Clinical TempEval***** 2017 (SemEval 2017 Task 12) addresses the task of cross-domain temporal extraction from clinical text. | ||
| S17-2093 *****Clinical TempEval***** 2017 aimed to answer the question: how well do systems trained on annotated timelines for one medical condition (colon cancer) perform in predicting timelines on another medical condition (brain cancer)? | ||
| dual learning | 34 | |
| D17-1191 In this paper, we design a novel convolutional neural network (CNN) with resi*****dual learning*****, and investigate its impacts on the task of distantly supervised noisy relation extraction. | ||
| 2020.coling-main.134 Second, to address the original information forgotten issue and vanishing/exploding gradient issue, it uses the resi*****dual learning***** method. | ||
| 2021.acl-long.276 To solve the data lacking problem, we introduce a new approach to augment training data for event causality identification, by iteratively generating new examples and classifying event causality in a ***** dual learning ***** framework. | ||
| P19-1007 In this work, we develop a semantic parsing framework with the ***** dual learning ***** algorithm, which enables a semantic parser to make full use of data (labeled and even unlabeled) through a dual-learning game. | ||
| N18-1124 The resi*****dual learning***** facilitates the flow of information from the distant past and is able to emphasize any of the previously translated words, hence it gains access to a wider context. | ||
| French | 34 | |
| L08-1119 In this paper we present the PASSAGE project which aims at building automatically a ***** French ***** Treebank of large size by combining the output of several parsers, using the EASY annotation scheme. | ||
| W16-3809 Verbenet is a ***** French ***** lexicon developed by translation of its English counterpart VerbNet (Kipper-Schuler, 2005) and treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). | ||
| W17-1723 We present a simple and efficient tagger capable of identifying highly ambiguous multiword expressions (MWEs) in ***** French ***** texts. | ||
| L14-1382 In this paper, we describe the development of ***** French ***** resources for the extraction and normalization of temporal expressions with HeidelTime, an open-source multilingual, cross-domain temporal tagger. | ||
| L06-1441 The EVALDA/EvaSy project is dedicated to the evaluation of text-to-speech synthesis systems for the ***** French ***** language. | ||
| genres | 33 | |
| L10-1059 A preliminary linear discriminant analysis of text ***** genres ***** using the data of POS frequencies and sentence length revealed it was possible to classify the text ***** genres ***** with a correct identification rate of 88% as far as the samples of books, newspapers, whitepapers, and internet bulletin boards are concerned. | ||
| 2021.eacl-main.101 We perform both manual validation and empirical evaluation on multiple evaluation datasets with different event domains and text ***** genres ***** to assess the quality of our acquired event pairs. | ||
| E17-1033 We create topic models (using Latent Dirichlet Allocation) to determine ***** genres ***** from a heterogeneous dataset and then train an expert for each of the ***** genres *****. | ||
| 2021.law-1.18 The dataset covers a broad range of 12 written and spoken ***** genres *****, most of which have not been included in Entity Linking efforts to date, leading to poor performance by a pretrained SOTA system in our evaluation. | ||
| 2021.wmt-1.20 Expect for in-domain fine-tuning, we also propose a fine-grained “one model one domain” approach to model characteristics of different news ***** genres ***** at fine-tuning and decoding stages | ||
| Contrary | 33 | |
| P17-1002 ***** Contrary ***** to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. | ||
| 2021.eacl-main.283 ***** Contrary ***** to our expectation, multiple state-of-the-art multi-hop QA models fail to answer a large portion of sub-questions, although the corresponding multi-hop questions are correctly answered. | ||
| E17-2089 ***** Contrary ***** to classical domain adaptation methods, which employ texts from both domains to detect pivot features, we do not use the target domain for training. | ||
| W19-8615 ***** Contrary ***** to previously proposed text simplification corpora, which contain only a small number of split examples, we present a dataset where each input sentence is broken down into a set of minimal propositions, i.e. a sequence of sound, self-contained utterances with each of them presenting a minimal semantic unit that cannot be further decomposed into meaningful propositions. | ||
| 2020.acl-demos.18 ***** Contrary ***** to what the project name might suggest, CLIReval does not actually require any annotated CLIR dataset | ||
| formulation | 33 | |
| 2021.emnlp-main.615 For this, we explore the benefits of multi-task learning and investigate which setup and task ***** formulation ***** is best suited for each sub-task. | ||
| 2020.acl-srw.32 In our ***** formulation *****, the SAS systems should extract as many scoring predictions as possible that are not critical scoring errors (CSEs). | ||
| 2021.naacl-main.393 While many models have recently been proposed for LFQA, we show in this paper that the task ***** formulation ***** raises fundamental challenges regarding evaluation and dataset creation that currently preclude meaningful modeling progress. | ||
| 2021.naacl-main.25 Under this new task ***** formulation *****, we show strong quantitative and qualitative results on the 20Newsgroups and AG News datasets. | ||
| 2021.acl-long.71 In contrast, our ***** formulation ***** is inspired by axioms satisfied by characteristic functions as well as solution concepts in cooperative game theory literature | ||
| tuples | 33 | |
| 2021.naacl-main.457 Existing multimodal neural machine translation methods (MNMT) require triplets of bilingual sentence - image for training and ***** tuples ***** of source sentence - image for inference. | ||
| W19-4002 We address the non-trivial problem of evaluating the extractions produced by systems against the reference ***** tuples *****, and share our evaluation script. | ||
| N19-1239 We describe NeurON, a system for extracting ***** tuples ***** from question-answer pairs. | ||
| E17-2011 Cross-lingual information extraction is the task of distilling facts from foreign language (e.g. Chinese text) into representations in another language that is preferred by the user (e.g. English ***** tuples *****). | ||
| D19-1029 In this work, we propose a new sequence labeling framework (as well as a new tag schema) to jointly extract the fact and condition ***** tuples ***** from statement sentences | ||
| vowel | 33 | |
| L06-1124 The second corpus is a task-specific resource designed in the PELT framework to investigate the ***** vowel ***** space of English produced by Poles. | ||
| W17-0705 Single-feature patterns are learned faster than two-feature patterns, and ***** vowel ***** or consonant-only patterns are learned faster than patterns involving ***** vowel *****s and consonants, mimicking the results of laboratory learning experiments. | ||
| 1963.earlymt-1.7 Rules for breaking ***** vowel ***** strings are obtained by a study of the CVC forms. | ||
| L14-1033 Across the five languages there are systematic differences in the degree to which duration, f0, intensity and spectral ***** vowel ***** definition change with changing prominence under different focus conditions. | ||
| W19-3627 We investigate English pronunciation patterns in Singaporean children in relation to their American and British counterparts by conducting archetypal analysis on selected ***** vowel ***** pairs. | ||
| meta | 33 | |
| 2021.dialdoc-1.5 This ***** meta ***** dialog system can answer questions from Wikipedia and at the same time act as a personal assistant. | ||
| S19-2150 In order to address this task, we propose a system based on the BERT model with ***** meta ***** information of questions. | ||
| 2021.metanlp-1.8 Recent studies have proposed unsupervised approaches to create ***** meta *****-training tasks from unlabeled data for free, e.g., the SMLMT method (Bansal et al., 2020a) constructs unsupervised multi-class classification tasks from the unlabeled text by randomly masking words in the sentence and let the ***** meta ***** learner choose which word to fill in the blank. | ||
| P19-1396 To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated ***** meta ***** information. | ||
| 2020.lrec-1.668 We verified the quality of the collected data and, by subjective evaluation, we also verified their usefulness in training neural conversational models for generating utterances reflecting the ***** meta ***** information, especially emotion | ||
| KD | 33 | |
| 2021.wnut-1.33 However, one neglected area of research is the impact of noisy (corrupted) labels on ***** KD *****. | ||
| 2020.coling-industry.4 However, it is not straightforward to apply ***** KD ***** to ranking problems. | ||
| 2021.acl-long.266 Accordingly, we propose reverse ***** KD ***** to rejuvenate more alignments for low-frequency target words. | ||
| 2021.acl-long.228 Instead of only learning from the teacher's soft label as in conventional ***** KD *****, researchers find that the rich information contained in the hidden layers of BERT is conducive to the student's performance | ||
| 2021.sustainlp-1.13 Moreover, we find that different datasets/tasks prefer different ***** KD ***** algorithms, and thus propose a simple AutoDistiller algorithm that can recommend a good KD pipeline for a new dataset. | ||
| infrequent | 33 | |
| P19-1158 To make the model robust against ***** infrequent ***** tokens, we sampled segmentation for each sentence stochastically during training, which resulted in improved performance of text classification. | ||
| D17-1146 Neural machine translation (NMT) has achieved notable success in recent times, however it is also widely recognized that this approach has limitations with handling ***** infrequent ***** words and word pairs. | ||
| L14-1265 On the other hand, the dynamic generation of stopword lists, by removing those ***** infrequent ***** terms appearing only once in the corpus, appears to be the optimal method to maintaining a high classification performance while reducing the data sparsity and shrinking the feature space. | ||
| 2020.aacl-main.10 Specifically, we propose to replace ***** infrequent ***** input and output words in CBOW model with their clusters. | ||
| W19-4323 Another benefit of our approach is that it is capable of generating a high-quality representation of ***** infrequent ***** words as, for example, found in very recent news articles with rapidly changing vocabularies | ||
| asynchronous | 33 | |
| N19-1134 Unlike synchronous conversations (e.g., meetings, phone), ***** asynchronous ***** domains lack large labeled datasets to train an effective SAR model. | ||
| C16-1019 Thus, we propose a simple method AsynGrad for ***** asynchronous ***** parallel learning with gradient error. | ||
| J18-4012 The CRF model can consider arbitrary graph structures to model conversational dependencies in an ***** asynchronous ***** conversation. | ||
| P18-1208 Intrinsically this language is multimodal (heterogeneous), sequential and ***** asynchronous *****; it consists of the language (words), visual (expressions) and acoustic (paralinguistic) modalities all in the form of ***** asynchronous ***** coordinated sequences | ||
| 2020.coling-main.436 We present a large-scale corpus of e-mail conversations with domain-agnostic and two-level dialogue act (DA) annotations towards the goal of a better understanding of ***** asynchronous ***** conversations. | ||
| Apertium | 33 | |
| 2011.freeopmt-1.8 This document describes a project aimed at building a new web interface to the ***** Apertium ***** machine translation platform, including pre-editing and post-editing environments. | ||
| 2020.globalex-1.16 We also describe additional evaluation experiments on ***** Apertium ***** data, a comparison with an earlier approach based on embedding projection, and an approach for constrained projection that outperforms the TIAD-2020 vanilla system by a large margin. | ||
| L12-1153 This article describes the development of a bidirectional shallow-transfer based machine translation system for Spanish and Aragonese, based on the ***** Apertium ***** platform, reusing the resources provided by other translators built for the platform. | ||
| 2016.amta-researchers.3 Using DGT-TM translation memories and the machine system ***** Apertium ***** as the single source to build repair operators in three different language pairs, we show that the best repaired fuzzy matches are consistently closer to reference translations than either machine-translated segments or unrepaired fuzzy matches. | ||
| 2011.freeopmt-1.11 This article describes the development of an Open Source shallow-transfer machine translation system from Czech to Polish in the ***** Apertium ***** platform | ||
| inferential | 33 | |
| W17-5105 These statements are then used to produce a matrix representing the ***** inferential ***** relationship between different aspects of the topic. | ||
| L06-1019 A preliminary frame-based format for representing their prototypical behavior is then proposed together with related ***** inferential ***** patterns that describe functional or paradigmatic relations between preposition senses. | ||
| 2006.jeptalnrecital-long.28 The ***** inferential ***** model allows a classification that can accommodate genres that are not entirely standardized, and is more capable of reading a Web page, which is mixed, rarely corresponding to an ideal type and often showing a mixture of genres or no genre at all. | ||
| 2021.acl-long.534 We present InferWiki, a Knowledge Graph Completion (KGC) dataset that improves upon existing benchmarks in ***** inferential ***** ability, assumptions, and patterns. | ||
| 2021.acl-long.552 If two sentences have the same meaning, it should follow that they are equivalent in their ***** inferential ***** properties, i.e., each sentence should textually entail the other. | ||
| metaphors | 33 | |
| 2021.acl-long.185 Style is formed by a complex combination of different stylistic factors, including formality markers, emotions, ***** metaphors *****, etc. | ||
| 2020.figlang-1.22 On the other hand, approaches that process ***** metaphors ***** on the relation-level ignore the context in which the metaphoric expression occurs. | ||
| W19-4444 The experiments show that the Age of Acquisition is the most distinctive feature for both ***** metaphors ***** and literal words. | ||
| Q13-1031 Our results show that it significantly outperforms other state-of-the-art methods in recognizing and explaining ***** metaphors *****. | ||
| W18-0908 In this era of web 2.0, automatic analysis of sarcasm and ***** metaphors ***** is important for their extensive usage. | ||
| aspects | 33 | |
| L16-1179 Aspect Based Sentiment Analysis (ABSA) is the task of mining and summarizing opinions from text about specific entities and their ***** aspects *****. | ||
| L16-1465 The fine-grained task of automatically detecting all sentiment expressions within a given document and the ***** aspects ***** to which they refer is known as aspect-based sentiment analysis. | ||
| 2021.ranlp-1.13 Sentiment analysis aims to detect the overall sentiment, i.e., the polarity of a sentence, paragraph, or text span, without considering the entities mentioned and their ***** aspects *****. | ||
| D19-1468 In this work, we consider weakly supervised approaches for training aspect classifiers that only require the user to provide a small set of seed words (i.e., weakly positive indicators) for the ***** aspects ***** of interest. | ||
| D19-1476 The novelties of the proposed architecture are manifested in the ***** aspects ***** of a newly defined objective function, the complementary information fusion method for structural and textual features, and the mutual gate mechanism for textual feature extraction | ||
| correction | 33 | |
| J17-4002 Both methods demonstrated state-of-the-art performance in several text ***** correction ***** competitions. | ||
| 2020.coling-industry.18 We also share results to show enhancement in classification accuracy after noise ***** correction *****. | ||
| L14-1555 We also propose a method to evaluate the impact of spelling errors ***** correction ***** on translation quality without expensive manual work of providing reference translations. | ||
| L10-1096 It is used in ***** correction ***** for generation of proposals when in the input text appear standard forms which we want to replace with dialectal forms. | ||
| 2021.iwslt-1.22 To reduce the punctuation errors generated by the ASR model, we employ our previous work SlotRefine to train a punctuation ***** correction ***** model | ||
| learner corpus | 33 | |
| L14-1488 The MERLIN corpus is a written ***** learner corpus ***** for Czech, German,and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR) with authentic learner data. | ||
| L06-1138 In this paper, we first present a new computer ***** learner corpus ***** in French. | ||
| W17-6306 This opinion paper proposes the use of parallel treebank as ***** learner corpus *****. | ||
| 2021.rocling-1.32 In our approach, annotated edits in the ***** learner corpus ***** are converted into edit rules for correcting common writing errors | ||
| W17-5051 We present a new longitudinal L1 ***** learner corpus ***** for German (handwritten texts collected in grade 2–4), which is transcribed and annotated with a target hypothesis that strictly only corrects orthographic errors, and is thereby tailored to research and tool development for orthographic issues in primary school. | ||
| selectional preferences | 33 | |
| L10-1434 Our second interest lies in the actual comparison of the models: How does a very simple distributional model compare to much more complex approaches, and which representation of ***** selectional preferences ***** is more appropriate, using (i) second-order properties, (ii) an implicit generalisation of nouns (by clusters), or (iii) an explicit generalisation of nouns by WordNet classes within clusters? | ||
| L14-1254 While the current version of the dictionary concentrates on syntax, it already contains some semantic features, including semantically defined arguments, such as locative, temporal or manner, as well as control and raising, and work on extending it with semantic roles and ***** selectional preferences ***** is in progress. | ||
| 2019.lilt-17.1 The analysis has two key components (i) an underspecified category for the nominal and (ii) combinatorial constraints on the noun and light verb to specify ***** selectional preferences *****. | ||
| L12-1164 The distinctive feature of these representations are their relation to a concept network, through which we can compute ***** selectional preferences ***** of open-class words relative to general concepts. | ||
| 2005.mtsummit-papers.1 Initial results are highly promising, obtaining more specific information about ***** selectional preferences *****. | ||
| entity mentions | 33 | |
| L10-1596 This paper describes the 2009 resource creation efforts, with particular focus on the selection and development of named ***** entity mentions ***** for the Entity Linking task evaluation. | ||
| N18-1002 The task of Fine-grained Entity Type Classification (FETC) consists of assigning types from a hierarchy to ***** entity mentions ***** in text. | ||
| 2020.findings-emnlp.409 The goal of Document-level Relation Extraction (DRE) is to recognize the relations between ***** entity mentions ***** that can span beyond sentence boundary. | ||
| W19-2804 Clustering unlinkable ***** entity mentions ***** across documents in multiple languages (cross-lingual NIL Clustering) is an important task as part of Entity Discovery and Linking (EDL). | ||
| L08-1580 The system first translates known ***** entity mentions ***** using a standard phrase-based statistical machine translation framework, which is then reused to perform name transliteration on unknown mentions. | ||
| language modelling | 33 | |
| D19-1288 Motivated by this question, we aim at constructing an informative prior for held-out languages on the task of character-level, open-vocabulary ***** language modelling *****. | ||
| 2020.lrec-1.333 We see improvements for this approach over word-level language models, again indicating that sub-word modelling is important for Mi'kmaq ***** language modelling *****. | ||
| 2020.blackboxnlp-1.11 We speculate that both problems could potentially be solved by adopting a different training task other than unidirectional ***** language modelling *****. | ||
| 2021.acl-long.331 Inspired by old and well-established ideas in machine learning, we explore a variety of non-linear “reservoir” layers interspersed with regular transformer layers, and show improvements in wall-clock compute time until convergence, as well as overall performance, on various machine translation and (masked) ***** language modelling ***** tasks. | ||
| 2021.semspace-1.8 Vector representations have become a central element in semantic ***** language modelling *****, leading to mathematical overlaps with many fields including quantum theory. | ||
| sentence boundary detection | 33 | |
| 2020.autosimtrans-1.1 In this paper, we propose a novel method for ***** sentence boundary detection ***** that takes it as a multi-class classification task under the end-to-end pre-training framework. | ||
| L16-1348 This paper describes a method to perform ***** sentence boundary detection ***** and alignment simultaneously, which significantly improves the alignment accuracy on languages like Chinese with uncertain sentence boundaries. | ||
| 2020.autosimtrans-1.6 We present a sentence length based method and a ***** sentence boundary detection ***** model based method for the streaming input segmentation. | ||
| C16-1028 The paper applies a deep recurrent neural network to the task of ***** sentence boundary detection ***** in Sanskrit, an important, yet underresourced ancient Indian language. | ||
| W16-4701 Coupled with text mining techniques including named entity recognition, ***** sentence boundary detection *****, string approximate matching, entitymetrics enables us to analyze knowledge diffusion, impact, and trend at various knowledge entity units, such as bio-entity, organization, and country. | ||
| downstream applications | 33 | |
| P19-1317 Consequently, this has tremendous implications such as rendering ***** downstream applications ***** inefficacious and/or potentially unreliable. | ||
| D18-1263 Beyond SDP, our linearization technique opens the door to integration of graph-based semantic representations as features in neural models for ***** downstream applications *****. | ||
| K19-1048 While the task is well-established, there is no universally used tagset: often, datasets are annotated for use in ***** downstream applications ***** and accordingly only cover a small set of entity types relevant to a particular task. | ||
| 2021.eacl-main.308 To investigate how representative the synthetic tasks are of downstream use cases, we conduct experiments on benchmarking well-known traditional and neural coherence models on synthetic sentence ordering tasks, and contrast this with their performance on three ***** downstream applications *****: coherence evaluation for MT and summarization, and next utterance prediction in retrieval-based dialog. | ||
| 2020.acl-demos.7 Therefore, these representations lack many explicit connections between content words, that would be useful for ***** downstream applications *****. | ||
| mining | 33 | |
| 2021.argmining-1.18 Our fine-tuned RoBERTa-Base model achieves a mean average precision score of 0.913, the best score for strict labels of all participating teams. | ||
| 2020.lrec-1.143 Our corpus can be used as a resource for analyzing persuasiveness and training an argument ***** mining ***** system to identify and extract argument structures. | ||
| Q15-1030 We suggest a method for determining the correct labels of the clustering outcomes, and then use the labels for voting, improving the accuracy even further. | ||
| 2021.blackboxnlp-1.43 Rather than build a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ranking performance for words and senses in different frequency bands. | ||
| L06-1161 The purpose of the paper is to give an overview of parameters applicable to Dutch, which are determined by examining a large set of data and two Dutch NLP systems. | ||
| years | 33 | |
| 2021.naacl-main.15 Over the ***** years *****, many different filtering approaches have been proposed. | ||
| 2020.sigdial-1.29 A total of 20 papers from the last two ***** years ***** are surveyed to analyze three types of evaluation protocols: automated, static, and interactive. | ||
| 2021.eval4nlp-1.20 The evaluation of the generated text is a challenging task and multiple theories and metrics have been proposed over the ***** years *****. | ||
| L16-1741 Previously, a seniors' speech corpus named S-JNAS was developed, but the average age of the participants was 67.6 ***** years *****, while the target age for nursing home care is around 75 ***** years ***** old, much higher than that of the S-JNAS samples. | ||
| 2020.ldl-1.5 In recent ***** years *****, there has been increasing interest in publishing lexicographic and terminological resources as linked data. | ||
| medical domain | 33 | |
| D19-6203 The experimental results suggest that dependency-based pooling is the best pooling strategy for RE in the biomedical domain, yielding the state-of-the-art performance on two benchmark datasets for this problem. | ||
| P19-2008 However, there are particular challenges in extending these open-domain techniques to extend into the biomedical domain. | ||
| W18-5618 Rapidly expanding volume of publications in the biomedical domain makes it increasingly difficult for a timely evaluation of the latest literature. | ||
| W17-2507 Even though large collections are available for certain domains and language pairs, these are still scarce in the biomedical domain. | ||
| W19-5042 We use a multi-source transfer learning approach to transfer the knowledge from MT-DNN and SciBERT to natural language understanding tasks in the ***** medical domain *****. | ||
| lexical complexity | 33 | |
| 2020.readi-1.3 However, research on the use of MWEs in ***** lexical complexity ***** assessment and simplification is still an under-explored area. | ||
| 2021.semeval-1.2 We propose an ensemble model for predicting the ***** lexical complexity ***** of words and multiword expressions (MWEs). | ||
| W16-4107 In this paper, we define ***** lexical complexity ***** for French and we present a pilot study on the effects of text simplification in dyslexic children. | ||
| 2021.semeval-1.88 We present our approach to predicting ***** lexical complexity ***** of words in specific contexts, as entered LCP Shared Task 1 at SemEval 2021. | ||
| 2019.icon-1.21 This paper proposes a metric to quantify ***** lexical complexity ***** in Malayalam. | ||
| conversations | 33 | |
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of pretrained language models for emotion recognition in ***** conversations *****, which is to consider not only previous utterances, but also conversation-related information such as speakers, speech acts and topics. | ||
| D18-1075 Different from conventional text generation tasks, the mapping between inputs and responses in ***** conversations ***** is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs. | ||
| D18-1547 Even though machine learning has become the major scene in dialogue research community, the real breakthrough has been blocked by the scale of data available. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written ***** conversations ***** spanning over multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. The contribution of this work apart from the open-sourced dataset is two-fold: firstly, a detailed description of the data collection procedure along with a summary of data structure and analysis is provided. | ||
| L16-1008 Our goal is to identify the sentiments of the users in the social network through their ***** conversations *****. | ||
| 2020.emnlp-main.512 Conversation disentanglement aims to separate intermingled messages into detached ***** conversations *****. | ||
| open domain | 33 | |
| L10-1249 The synthetic voices for Viennese varieties, implemented with the ***** open domain ***** unit selection speech synthesis engine Multisyn of Festival will also be released within Festival. | ||
| P19-1538 We present ***** open domain ***** dialogue generation with meta-words. | ||
| L06-1191 In this paper we investigated the use of Wikipedia, the ***** open domain ***** encyclopedia, for the Question Answering task. | ||
| W16-4404 There are some ***** open domain ***** question answering systems, such as IBM Waston, which take the unstructured text data as input, in some ways of humanlike thinking process and a mode of artificial intelligence. | ||
| L12-1010 The semantic annotation component achieves approximately 83% F-measure, which is very reasonable considering the wide range of entities and document types and the ***** open domain *****. | ||
| external knowledge | 33 | |
| 2020.lrec-1.94 A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract semantic representation for each input sentence and incorporates ***** external knowledge ***** through the neighborhood of the entities from a Knowledge Base (KB). | ||
| S18-1123 By defining only very few data-internal, word-level features and ***** external knowledge ***** sources in the form of word clusters and word embeddings, we train a fast and simple linear classifier | ||
| 2020.findings-emnlp.302 We introduce two effective models for duration prediction, which incorporate ***** external knowledge ***** by reading temporal-related news sentences (time-aware pre-training). | ||
| P19-1331 Our model outperforms previous state-of-the-art neural sentence simplification models (without ***** external knowledge *****) by large margins on three benchmark text simplification corpora in terms of SARI (+0.95 WikiLarge, +1.89 WikiSmall, +1.41 Newsela), and is judged by humans to produce overall better and simpler output sentences. | ||
| D19-1299 This allows the model to infer relevant facts which are not explicitly stated in the data table from an ***** external knowledge ***** source. | ||
| referring expression | 33 | |
| W17-3522 Using the furniture stimuli set developed for the TUNA and D-TUNA corpora, our corpus extends on these corpora by providing data collected in a simulated driving dual-task setting, and additionally provides exact duration annotations for the spoken ***** referring expression *****s. | ||
| K19-1040 As a result, past ***** referring expression *****s for objects can provide strong signals for grounding subsequent ***** referring expression *****s. | ||
| L12-1025 As an alternative, and perhaps less traditional approach, we also use surface information to build statistical language models of the ***** referring expression *****s that are most likely to occur in the corpus, and let the model probabilities guide attribute selection. | ||
| 2021.codi-main.5 The diversity of coreference chains is usually tackled by means of global features (length, types and number of ***** referring expression *****s, distance between them, etc.). | ||
| 2020.lrec-1.13 We are releasing this dataset to encourage research in the field of coreference resolution, ***** referring expression ***** generation and identification within realistic, deep dialogs involving multiple domains. | ||
| achieving | 32 | |
| 2020.lrec-1.176 The results show that such features are relevant to the task (***** achieving *****, alone, up to 92% classification accuracy) and may improve previous classification results. | ||
| D17-1103 First, using policy gradient and mixed-loss methods for reinforcement learning, we directly optimize sentence-level task-based metrics (as rewards), ***** achieving ***** significant improvements over the baseline, based on both automatic metrics and human evaluation on multiple datasets. | ||
| 2021.emnlp-main.218 As for the real-world application, our model has been applied to the in-house customs data, ***** achieving ***** reliable performance in the production setting. | ||
| 2020.coling-main.56 Furthermore, our best models outperform previous methods for the task, ***** achieving ***** new state-of-the-art results on two public benchmarks: Inspec and SemEval-2017. | ||
| K18-1047 Moreover, training on all strategies combined achieves further improvements, ***** achieving ***** a new state-of-the-art performance on the original task (also verified via human evaluation) | ||
| categorizing | 32 | |
| S19-2102 PERSPECTIVE performed better than BERT in detecting toxicity, but BERT was much better in ***** categorizing ***** the offensive type. | ||
| 2021.emnlp-main.254 In this survey, we review representative methods at the intersection of NLP and quantum physics in the past ten years, ***** categorizing ***** them according to the use of quantum theory, the linguistic targets that are modeled, and the downstream application. | ||
| D17-1279 This paper addresses the problem of extracting keyphrases from scientific articles and ***** categorizing ***** them as corresponding to a task, process, or material. | ||
| S19-2132 Task 6 of SemEval 2019 involves identifying and ***** categorizing ***** offensive language in social media. | ||
| D19-1505 A number of authoring tasks, such as ***** categorizing ***** and summarizing edits, detecting completed to-dos, and visually rearranging comments could benefit from such a contribution | ||
| furthermore | 32 | |
| L16-1130 We ***** furthermore ***** show how different variants of ROUGE result in very different correlations with the manual Pyramid scores. | ||
| N18-1100 Annotating these codes is labor intensive and error prone; ***** furthermore *****, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. | ||
| W17-1910 Our findings ***** furthermore ***** show that simple composition functions such as pointwise addition are able to recover sense specific information from a single-sense vector model remarkably well. | ||
| 2020.lrec-1.357 The parallel data in the UD project was chosen as a source because this would ***** furthermore ***** give us the first parallel treebank for Icelandic. | ||
| L10-1293 We ***** furthermore ***** measure the contribution of the features to the precision of the extraction: by using both morpho-syntactic and syntactic features, we achieve a higher precision in the identification of idiomatic MWEs, than by using only properties of one type | ||
| lemmatisation | 32 | |
| W19-3714 We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, ***** lemmatisation ***** and cross-lingual linking. | ||
| 2021.naacl-main.322 In this paper, we devote our attention to ***** lemmatisation ***** for low resource, morphologically rich scheduled Indian languages using neural methods. | ||
| 2021.emnlp-demo.7 We introduce COMBO – a fully neural NLP system for accurate part-of-speech tagging, morphological analysis, ***** lemmatisation *****, and (enhanced) dependency parsing. | ||
| L14-1114 We also describe modifications to the SMOR grammar that result in a more conventional ***** lemmatisation ***** of words. | ||
| L06-1049 The machine trained sets of ***** lemmatisation ***** rules are very easy to produce without having linguistic knowledge given that one has correct training data | ||
| adjective | 32 | |
| L08-1078 We discuss a software tool that suggests synset members using a measure of semantic relatedness with a given verb or ***** adjective *****; this extends previous work on nominal synsets in Polish WordNet. | ||
| W18-3807 The category `***** adjective *****' has been chosen on account of its lower frequency of occurrence in texts written in Spanish, and particularly in the Argentine Rioplatense variety, and with the aim of developing strategies to increase its use. | ||
| 2001.mtsummit-papers.28 Based on evaluation, our method is able to determine ***** adjective ***** dependency with an precision of about 94%. | ||
| L10-1293 We report about tools for the extraction of German multiword expressions (MWEs) from text corpora; we extract word pairs, but also longer MWEs of different patterns, e.g. verb-noun structures with an additional prepositional phrase or ***** adjective *****. | ||
| I17-3014 A grammar pattern consists of a head word (verb, noun, or ***** adjective *****) and its syntactic environment | ||
| gazetteers | 32 | |
| 2020.acl-main.722 However, designing such features for low-resource languages is challenging, because exhaustive entity ***** gazetteers ***** do not exist in these languages. | ||
| W17-4421 We propose a novel approach, which incorporates comprehensive word representations with multi-channel information and Conditional Random Fields (CRF) into a traditional Bidirectional Long Short-Term Memory (BiLSTM) neural network without using any additional hand-craft features such as ***** gazetteers *****. | ||
| S19-2228 The system detects toponyms using a bootstrapped machine learning (ML) approach which classifies names identified using ***** gazetteers ***** extracted from the GeoNames geographical database. | ||
| W18-3212 The model is augmented with stacked layers of enriched information such as pre-trained embeddings, Brown clusters and named entity ***** gazetteers *****. | ||
| W16-3925 In this effort, we show detailed experimentation results on the effectiveness of word embeddings, brown clusters, part-of-speech (POS) tags, shape features, ***** gazetteers *****, and local context for the tweet input vector representation to the LSTM model | ||
| Euclidean | 32 | |
| C16-1015 However, the ***** Euclidean ***** similarity used in Gaussian topics is not an optimal semantic measure for word embeddings. | ||
| N18-1103 By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed ***** Euclidean ***** space. | ||
| P19-1474 We demonstrate the superiority of Poincarë embeddings over distributional semantic representations, supporting the hypothesis that they can better capture hierarchical lexical-semantic relationships than embeddings in the ***** Euclidean ***** space. | ||
| W19-4319 The hyperbolic model shows improvements in some but not all cases over its ***** Euclidean ***** counterpart. | ||
| 2021.insights-1.8 However, given the wide influence of their work, our aim here is to present an updated and more accurate comparison between the ***** Euclidean ***** and hyperbolic embeddings | ||
| simplifications | 32 | |
| 2021.insights-1.19 These ***** simplifications ***** are the removal of (i) the Kullback-Liebler divergence from its objective and (ii) the fully unobserved latent variable from its probabilistic model. | ||
| Q16-1029 In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual ***** simplifications ***** with multiple references. | ||
| 2020.readi-1.7 The ***** simplifications ***** have been produced according to a well-documented set of guidelines. | ||
| 2001.jeptalnrecital-poster.12 Application of disambiguation to dictionary definitions (in contrast to usual texts) allows for some ***** simplifications ***** of the algorithm, e.g., we do not have to care of context window size | ||
| 2021.gem-1.14 Our system is a monolingual Seq2Seq Transformer architecture that uses control tokens prepended to the data, allowing the model to shape the generated ***** simplifications ***** according to user-desired attributes. | ||
| funniness | 32 | |
| 2020.semeval-1.98 This task includes two subtasks, the first of which is to estimate the ***** funniness ***** of headlines on a humor scale in the interval 0-3. | ||
| 2020.semeval-1.140 We found that our approach requires more text than we used to perform reliably, and that unexpectedness alone is not sufficient to gauge ***** funniness ***** for humorous content that targets a diverse target audience. | ||
| L10-1506 The features have been demonstrated to work well when classifying the ***** funniness ***** of single sentences up to entire blogs. | ||
| D19-1673 Specifically, our annotations of linguistic humor not only contain the degree of ***** funniness *****, like previous work, but they also contain key words that trigger humor as well as character relationship, scene, and humor categories. | ||
| 2021.emnlp-main.789 We use this dataset to train a model that provides a `***** funniness *****' score, on a five-point scale, given the audio and its corresponding text | ||
| fMRI | 32 | |
| 2020.lrec-1.25 In this paper we provide a pilot neuroimaging study of the possible neural correlates of speech disfluencies perception, using a combination of the corpus and functional magnetic-resonance imaging (***** fMRI *****) methods. | ||
| C18-1243 Neural activation models have been proposed in the literature that use a set of example words for which ***** fMRI ***** measurements are available in order to find a mapping between word semantics and localized neural activations. | ||
| W18-4904 In this paper, we present a ***** fMRI ***** study based on language comprehension to provide neuroimaging evidence for processing MWEs. | ||
| 2020.lincr-1.3 We review recent studies leveraging different types of cognitive processing signals, namely eye-tracking, M/EEG and ***** fMRI ***** data recorded during language understanding | ||
| 2020.cogalex-1.2 Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict the ***** fMRI ***** activations with elicited stimuli inputs. | ||
| monotonic | 32 | |
| P18-1171 We present a ***** monotonic ***** hard attention model for the transition framework to handle the strictly left-to-right alignment between each transition state and the current buffer input focus. | ||
| 2000.iwpt-1.11 A transformation-based approach to robust parsing is presented, which achieves a strictly ***** monotonic ***** improvement of its current best hypothesis by repeatedly applying local repair steps to a complex multi-level representation. | ||
| C18-1123 After reordering, the attention in the decoder becomes more peaked and ***** monotonic *****. | ||
| 2021.wmt-1.119 This is unlike human simultaneous interpreters who produce largely ***** monotonic ***** translations at the expense of the grammaticality of a sentence being translated. | ||
| I17-1044 Experimental results on ASR, G2P and machine translation between two languages with similar sentence structures, demonstrate that the proposed encoder-decoder model with local ***** monotonic ***** attention could achieve significant performance improvements and reduce the computational complexity in comparison with the one that used the standard global attention architecture | ||
| ELRA | 32 | |
| L14-1146 Very recently the GlobalPhone pronunciation dictionaries have been made available for research and commercial purposes by the European Language Resources Association (***** ELRA *****). | ||
| L12-1407 The second part shows how ***** ELRA ***** helps in the development and evaluation of HLT, in particular through its numerous participations to collaborative projects for the production of resources and platforms to facilitate their production and exploitation | ||
| L16-1718 To allow an easy understanding of the various licenses that exist for the use of Language Resources (***** ELRA *****'s, META-SHARE's, Creative Commons', etc. | ||
| L14-1155 This paper aims at analyzing the content of the LREC conferences contained in the ***** ELRA ***** Anthology over the past 15 years (1998-2013). | ||
| L10-1640 more annotations, at different levels and for different modalities ... easy access to these LRs and solved IPR issues, appropriate and adaptable licensing schemas ... large activity in HLT evaluation, both in terms of setting up the evaluation and in helping produce all necessary data, protocols, specifications as well as conducting the whole process ... producing the LRs researchers and developers need, LRs for a wide variety of activities and technologies ... for development, for training, for evaluation ... Disseminating all knowledge in the field, whether generated at ***** ELRA ***** or elsewhere ... keeping the community up to date with what goes on regularly (LREC conferences, LangTech, Newsletters, HLT Evaluation Portal, etc.). | ||
| Verbal | 32 | |
| L12-1171 ***** Verbal ***** intelligence scores of the test persons were compared to other features that may reflect dominant behaviour. | ||
| W18-1101 Classical theories like Script-based Semantic Theory of Humour and General ***** Verbal ***** Theory of Humour try and achieve this feat to an adequate extent | ||
| 2016.lilt-14.7 ***** Verbal ***** irony, or sarcasm, presents a significant technical and conceptual challenge when it comes to automatic detection. | ||
| 2021.cmcl-1.20 ***** Verbal ***** prediction has been shown to be critical during online comprehension of Subject-Object-Verb (SOV) languages. | ||
| W18-3711 ***** Verbal ***** communication and pronunciation as its part is a core skill that can be developed through guided learning. | ||
| sarcastic | 32 | |
| W19-1309 The sentence `Love waking up at 3 am' is ***** sarcastic ***** because of the number. | ||
| 2021.emnlp-demo.38 Furthermore, Chandler not only generates ***** sarcastic ***** responses, but also explanations for why each response is ***** sarcastic *****. | ||
| 2021.wassa-1.4 Inherent ambiguity in ***** sarcastic ***** expressions makes sarcasm detection very difficult. | ||
| D19-5544 Many online reviews are ***** sarcastic *****, humorous, or hateful | ||
| 2021.alta-1.21 In 2019, the Australasian Language Technology Association (ALTA) organised a shared task to detect the target of ***** sarcastic ***** comments posted on social media. | ||
| Memotion | 32 | |
| 2020.semeval-1.143 ***** Memotion ***** analysis is a very crucial and important subject in today's world that is dominated by social media. | ||
| 2020.semeval-1.148 Our method achieves the performance within the top 2 ranks in the final leaderboard of ***** Memotion ***** Analysis among 36 Teams. | ||
| 2020.semeval-1.154 In this paper, we describe our deep learning system used for SemEval 2020 Task 8: ***** Memotion ***** analysis. | ||
| 2020.semeval-1.111 This paper presents our work on the ***** Memotion ***** Analysis shared task of SemEval 2020, which involves the sentiment and humor analysis of memes. | ||
| 2020.semeval-1.115 This paper describes the system submitted by the PRHLT-UPV team for task 8 of SemEval-2020: ***** Memotion ***** Analysis. | ||
| aspectual | 32 | |
| 2016.lilt-13.3 It will be shown that this ***** aspectual ***** property is identified and classified with ease both by humans and by automatic systems. | ||
| W18-4912 The proposed framework augments the representation of finite predications to include a four-way temporal distinction (event time before, up to, at, or after speech time) and several ***** aspectual ***** distinctions (including static vs. dynamic, habitual vs. episodic, and telic vs. atelic). | ||
| L16-1193 We present an experimental study making use of a machine learning approach to identify the factors that affect the ***** aspectual ***** value that characterizes verbs under each of their readings. | ||
| 2020.coling-main.401 We find that a verb's local context is most indicative of its ***** aspectual ***** class, and we demonstrate that closed class words tend to be stronger discriminating contexts than content words. | ||
| L10-1122 This paper presents project Nomage, which aims at describing the ***** aspectual ***** properties of deverbal nouns in an empirical way. | ||
| Gender | 32 | |
| I17-1093 In this study, we present a supervised learning strategy to detect racist language on Twitter based on word embedding that incorporate demographic (Age, ***** Gender *****, and Location) information. | ||
| R17-1075 ***** Gender ***** identification in social networks is one of the most popular aspects of user profile learning. | ||
| 2020.wmt-1.39 ***** Gender ***** bias in machine translation can manifest when choosing gender inflections based on spurious gender correlations. | ||
| 2020.bionlp-1.1 ***** Gender ***** bias in biomedical research can have an adverse impact on the health of real people. | ||
| 2020.winlp-1.25 ***** Gender ***** bias negatively impacts many natural language processing applications, including machine translation (MT). | ||
| incremental | 32 | |
| 2021.acl-long.286 We also evaluate our models' ***** incremental ***** performance to establish the trade-off between ***** incremental ***** performance and final performance, using different prediction strategies. | ||
| I17-2027 Inspired by this cognitive ability, ***** incremental ***** algorithms for natural language processing tasks have been proposed and demonstrated promising performance. | ||
| W18-6419 Our NMT systems were trained with the transformer architecture using the provided parallel data enlarged with a large quantity of back-translated monolingual data that we generated with a new ***** incremental ***** training framework. | ||
| 2020.conll-1.49 Contemporary autoregressive language models (LMs) trained purely on corpus data have been shown to capture numerous features of human ***** incremental ***** processing. | ||
| S17-2166 Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance patterns based on labeled keyphrases, is proposed as an ***** incremental ***** feature set to enhance the conventional Named Entity Recognition feature sets. | ||
| descriptions | 32 | |
| 2020.emnlp-main.430 Subevents elaborate an event and widely exist in event ***** descriptions *****. | ||
| L10-1451 The monologues were ***** descriptions ***** of two short films; the dialogues were discussions about problems of German education. | ||
| W16-5403 We identify cases of idiosyncrasy of Mandarin Chinese that are difficult to fit into the current schema which has mainly been based on the ***** descriptions ***** of various Indo-European languages. | ||
| P19-1167 We find that there are significant differences between ***** descriptions ***** of male and female nouns and that these differences align with common gender stereotypes: Positive adjectives used to describe women are more often related to their bodies than adjectives used to describe men. | ||
| 2020.inlg-1.35 To generate a dataset for SG-NLG we re-purpose an existing dataset for another task: dialog state tracking, which includes a large and rich schema spanning multiple different attributes, including information about the domain, user intent, and slot ***** descriptions *****. | ||
| attributes | 32 | |
| E17-1059 We also introduce an attention mechanism to jointly generate reviews and align words with input ***** attributes *****. | ||
| 2020.emnlp-main.15 We find that most ***** attributes ***** are reliably encoded by only a few neurons, with fastText concentrating its linguistic structure more than BERT. | ||
| R19-1137 Thus, a large number of models has been developed for Knowledge Base Completion, the task of predicting new ***** attributes ***** of entities given partial descriptions of these entities. | ||
| L06-1345 In order to be able to classify errors in the texts, we have introduced new ***** attributes ***** to the TEI corr and sic tags. | ||
| P19-1388 Most current NLP systems have little knowledge about quantitative ***** attributes ***** of objects and events. | ||
| grammatical error detection | 32 | |
| 2021.bea-1.15 We approach the problem with the methods for ***** grammatical error detection ***** (GED), since we hypothesize that models for detecting grammatical mistakes can assess the correctness of potential alternative answers in a learning setting. | ||
| 2021.naacl-main.429 Our experiments on four GEC datasets show that VERNet achieves state-of-the-art ***** grammatical error detection ***** performance, achieves the best quality estimation results, and significantly improves GEC performance by reranking hypotheses. | ||
| I17-1005 Experimental results show that a bidirectional long-short term memory model initialized by our word embeddings achieved the state-of-the-art accuracy by a large margin in an English ***** grammatical error detection ***** task on the First Certificate in English dataset. | ||
| 2020.sltu-1.1 We also discuss applications of these models to ***** grammatical error detection ***** and language modeling. | ||
| 2020.coling-main.195 To further explore the data we train ***** grammatical error detection ***** models with various configurations including pre-trained and contextual word representations as input, additional features and auxiliary objectives, and extra training data from written error-annotated corpora. | ||
| dependency structures | 32 | |
| P17-2068 In this work, we construct a corpus that ensures consistency between ***** dependency structures ***** and MWEs, including named entities. | ||
| L06-1070 The dependency structure patterns are generated by using two operations: combining and interpolation, which utilize ***** dependency structures ***** in the searched corpus. | ||
| 2000.amta-papers.5 The approach relies on canonical predicate-argument structures (or ***** dependency structures *****), which provide a suitable pivot representation for the handling of structural divergences and the recovery of dropped arguments. | ||
| C16-1238 This architecture enables learning some important features from task-specific labeled data, forgoing the need for external knowledge such as explicit ***** dependency structures *****. | ||
| L08-1115 We also convert the hand-corrected and parser output phrase structure trees to dependency trees using a state-of-the-art functional tag labeller and constituent-to-dependency conversion tool, and then calculate label accuracy, unlabelled attachment and labelled attachment scores over the ***** dependency structures *****. | ||
| discourse treebank | 32 | |
| 2021.naacl-main.128 With extensive experiments on the standard RST ***** discourse treebank *****, we demonstrate that our parser outperforms existing methods by a good margin in both end-to-end parsing and parsing with gold segmentation. | ||
| 2020.emnlp-main.603 In this work, we present a novel scalable methodology to automatically generate ***** discourse treebank *****s using distant supervision from sentiment annotated datasets, creating and publishing MEGA-DT, a new large-scale discourse-annotated corpus. | ||
| D17-1225 In this paper we propose the first end-to-end discourse parser that jointly parses in both syntax and discourse levels, as well as the first syntacto-***** discourse treebank ***** by integrating the Penn Treebank and the RST Treebank. | ||
| P18-2070 Experiments on the RST ***** discourse treebank ***** show that our method outperforms traditional featured based methods, and the memory based discourse cohesion can improve the overall parsing performance significantly. | ||
| 2020.coling-main.337 We further demonstrate that pretraining our parser on the recently available large-scale “silver-standard” ***** discourse treebank ***** MEGA-DT provides even larger performance benefits, suggesting a novel and promising research direction in the field of discourse analysis. | ||
| translation tasks | 32 | |
| 2020.wmt-1.29 We present the results of our systems for the English–Inuktitut language pair for the WMT 2020 ***** translation tasks *****. | ||
| 2021.emnlp-main.263 Experimental results show that BiT pushes the SOTA neural machine translation performance across 15 ***** translation tasks ***** on 8 language pairs (data sizes range from 160K to 38M) significantly higher. | ||
| 2020.lrec-1.860 We focus on three real-world use cases (communication with IT support, describing administrative issues and asking encyclopedic questions) from which we gain insight into different strategies users take when faced with outbound ***** translation tasks *****. | ||
| 2011.iwslt-papers.8 The application of a non-restrictive approach together with an integrated dependency LM scoring is a novel contribution which yields significant improvements for two large-scale ***** translation tasks ***** for the language pairs Chinese–English and German–French. | ||
| P18-2104 Empirical evaluation of two low-resource ***** translation tasks *****, English to Vietnamese and Farsi, show +1 BLEU score improvements compared to strong baselines. | ||
| named entity disambiguation | 32 | |
| L14-1540 Our approach introduces Wikipedia as a raw text and uses the DBpedia data set for ***** named entity disambiguation *****. | ||
| P19-1023 Our model employs jointly learned word and entity embeddings to support ***** named entity disambiguation *****. | ||
| 2020.lrec-1.583 Moreover, specialized embeddings also exist for tasks like topic modeling or ***** named entity disambiguation *****. | ||
| 2021.crac-1.7 We also present a new entity linking annotation on the dataset using WikiData identifiers, a ***** named entity disambiguation ***** (NED) dataset, and a larger automatically created NED dataset enabling wikily supervised NED models. | ||
| 2021.acl-long.364 Unlike other approaches for ***** named entity disambiguation ***** (e.g., entity linking), streaming CDC allows for the disambiguation of entities that are unknown at inference time. | ||
| evaluation of machine | 32 | |
| 2020.wmt-1.137 Yet, little is known about best practices regarding human ***** evaluation of machine ***** translation at the document-level. | ||
| 2020.lrec-1.852 This paper describes our developing dataset of Japanese slot filling quizzes designed for ***** evaluation of machine ***** reading comprehension. | ||
| W19-8708 The automatic ***** evaluation of machine ***** translation (MT) has proven to be a very significant research topic. | ||
| N18-4015 Although it is difficult to train sentence representations using small-scale translation datasets with manual evaluation, sentence representations trained from large-scale data in other tasks can improve the automatic ***** evaluation of machine ***** translation. | ||
| P19-1269 Accurate, automatic ***** evaluation of machine ***** translation is critical for system tuning, and evaluating progress in the field. | ||
| written text | 32 | |
| L16-1513 The clinical subcorpus, consisting of ***** written text *****s produced by speakers with various types of language disorders, and the healthy speakers subcorpus, together with the levels of its annotation, offer an opportunity for different lines of research. | ||
| 2020.coling-main.208 Detectors that can distinguish text generated by TGM from human ***** written text ***** play a vital role in mitigating such misuse of TGMs. | ||
| L06-1261 This paper describes FreP, a new electronic tool that provides frequency counts of phonological units at the word-level and below from Portuguese ***** written text *****: namely, major classes of segments, syllables and syllable types, phonological clitics, clitic types and size, prosodic words and their shape, word stress location, and syllable type by position within the word and/or status relative to word stress. | ||
| W17-7701 The field of automated deception detection in ***** written text *****s is methodologically challenging. | ||
| L12-1615 The means of corpus presentation is a multimodal framework, since it joins together both the manuscript's image and the ***** written text *****: the letter's material representation in facsimile and the letter's digital transcription. | ||
| dialogue acts | 32 | |
| 2020.lrec-1.74 We highlight how thinking aloud affects interpretation of ***** dialogue acts ***** in our setting and how to best capture that information. | ||
| 2020.acl-main.638 To address these issues, we propose a neural co-generation model that generates ***** dialogue acts ***** and responses concurrently. | ||
| 2020.lrec-1.80 More specifically, we describe the method used to annotate ***** dialogue acts ***** in the corpus, including the evaluation of the annotations. | ||
| 2021.sigdial-1.31 We employ both traditional and transformer-based machine learning models for ***** dialogue acts ***** prediction and find them statistically indistinguishable in performance on our health coaching dataset. | ||
| L12-1044 We analyse the fundamental distinction between (a) the coding of surface features; (b) form-related semantic classification; and (c) semantic annotation in terms of ***** dialogue acts *****, supported by experimental studies of (a) and (b). | ||
| word meaning | 32 | |
| I17-1022 Besides its success and practical value, however, questions arise about the relationships between a true ***** word meaning ***** and its distributed representation. | ||
| 2021.emnlp-main.567 In this paper, we describe the creation of the largest resource of graded contextualized, diachronic ***** word meaning ***** annotation in four different languages, based on 100,000 human semantic proximity judgments. | ||
| C16-1175 In particular, we focus on word-embedding models which have been proposed to learn aspects of ***** word meaning ***** in a manner similar to humans. | ||
| C18-2003 We here introduce a substantially extended version of JeSemE, an interactive website for visually exploring computationally derived time-variant information on ***** word meaning *****s and lexical emotions assembled from five large diachronic text corpora. | ||
| 2021.acl-long.281 This paper presents a multilingual study of ***** word meaning ***** representations in context. | ||
| polarity classification | 32 | |
| W19-1304 Based on this finding, we analyze differences between embeddings used by these systems in regard to their capability of handling such cases and argue that intensifiers in context of emotion words need special treatment, as is established for sentiment ***** polarity classification *****, but not for more fine-grained emotion prediction. | ||
| E17-1095 To the best of our knowledge, this is the first work that explores the use of a convolutional neural network to ***** polarity classification ***** of Spanish tweets. | ||
| 2020.semeval-1.151 Our better performance was achieved in Task A, related to ***** polarity classification *****. | ||
| 2021.naacl-main.143 This work helps to fill this gap by proposing a methodology to characterize, quantify and measure the impact of hard instances in the task of ***** polarity classification ***** of movie reviews. | ||
| W18-6222 We manually annotate a freely available English sentiment polarity dataset with these boundaries and carry out a series of experiments which demonstrate that high quality sentiment expressions can boost the performance of ***** polarity classification *****. | ||
| support vector | 32 | |
| L14-1344 This tool applies fingerprinting to different acoustic features extracted from the audio signal in order to remove perceptual irrelevancies, and a ***** support vector ***** machine is trained for classifying these fingerprints in classes music and no-music. | ||
| K19-1062 The model consists of 1) a recurrent neural network (RNN) to learn scoring functions for pair-wise relations, and 2) a structured ***** support vector ***** machine (SSVM) to make joint predictions. | ||
| S17-2141 Since two submissions were allowed, two different machine learning methods were developed to solve this task, a ***** support vector ***** machine approach and a recurrent neural network approach. | ||
| 2021.sustainlp-1.1 The structure of our convex program is such that standard ***** support vector ***** machine software packages, which are numerically robust and efficient, can solve it. | ||
| 2008.amta-papers.4 We construct a discriminative, syntactic language model (LM) by using a latent ***** support vector ***** machine (SVM) to train an unlexicalized parser to judge sentences. | ||
| data selection | 32 | |
| 2021.emnlp-main.31 We explore the dynamical adjustments on three aspects: teacher model adoption, ***** data selection *****, and KD objective adaptation. | ||
| 2020.wmt-1.130 We investigate the usefulness of ***** data selection ***** in the unsupervised setting. | ||
| 2020.wmt-1.60 Specifically, we proposed a hybrid ***** data selection ***** method to select high-quality and in-domain sentences from out-of-domain data. | ||
| 2021.emnlp-main.268 We evaluate our cross-lingual ***** data selection ***** method on NMT across five diverse domains in three language pairs, as well as a real-world scenario of translation for COVID-19. | ||
| D19-1153 In addition, characteristic differences between the source and target languages raise a natural question of whether source ***** data selection ***** can improve the knowledge transfer. | ||
| neural text generation | 32 | |
| 2021.emnlp-main.504 The largest available dataset for enthymemes (Habernal et al., 2018) consists of 1.7k samples, which is not large enough to train a ***** neural text generation ***** model. | ||
| N18-1204 We introduce an approach to ***** neural text generation ***** that explicitly represents entities mentioned in the text. | ||
| D17-1227 In ***** neural text generation ***** such as neural machine translation, summarization, and image captioning, beam search is widely used to improve the output text quality. | ||
| 2021.acl-long.173 Data augmentation is an effective way to improve the performance of many ***** neural text generation ***** models. | ||
| 2020.spnlp-1.1 We propose a new paradigm for introducing a syntactic inductive bias into ***** neural text generation *****, where the dependency parse tree is used to drive the Transformer model to generate sentences iteratively. | ||
| process | 32 | |
| W17-1411 We investigate whether word embeddings offer any advantage over corpus- and pre***** process *****ing-free string kernels, and how these compare to bag-of-words baselines. | ||
| L14-1333 We discuss our specifications, pre-***** process *****ing and evaluation. | ||
| D19-1236 The review and selection ***** process ***** for scientific paper publication is essential for the quality of scholarly publications in a scientific field. | ||
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, un***** process *****ed, social media text. | ||
| L10-1479 The ***** process ***** engine for pattern recognition and information fusion tasks, the pepr framework, aims to empower the researcher to develop novel solutions in the field of pattern recognition and information fusion tasks in a timely manner, by supporting reuse and combination of well-tested and established components in an environment that eases the wiring of distinct algorithms and description of the control flow through graphical tooling. | ||
| community | 32 | |
| 2020.acl-main.560 Through this paper, we attempt to convince the ACL ***** community ***** to prioritise the resolution of the predicaments highlighted here, so that no language is left behind. | ||
| 2020.acl-demos.5 We present a large improvement over classic search engine baseline on several standard QA datasets and provide the ***** community ***** a collaborative data collection tool to curate the first natural language processing research QA dataset via a ***** community ***** effort. | ||
| L12-1076 The article also sums up previous analyses of this corpus and indicates possible uses of this corpus for the NLP ***** community *****. | ||
| W18-3814 The annotated corpus resulting from our work will be made available to the ***** community *****. | ||
| C16-1163 In this paper, we apply Long Short-Term Memory networks with an attention mechanism, which can select important parts of text for the task of similar question retrieval from ***** community ***** Question Answering (cQA) forums. | ||
| spoken dialog | 32 | |
| P17-1120 Recently emerged intelligent assistants on smartphones and home electronics (e.g., Siri and Alexa) can be seen as novel hybrids of domain-specific task-oriented ***** spoken dialog *****ue systems and open-domain non-task-oriented ones. | ||
| L10-1351 We describe an experimental Wizard-of-Oz setup for the integration of emotional strategies into ***** spoken dialog *****ue management. | ||
| L10-1398 In this paper, we propose an estimation method of user satisfaction for a ***** spoken dialog ***** system using an N-gram-based dialog history model. | ||
| L08-1493 Regulus is an Open Source platform that supports construction of rule-based medium-vocabulary ***** spoken dialog *****ue applications. | ||
| 2021.sigdial-1.45 The ability to take turns in a fluent way (i.e., without long response delays or frequent interruptions) is a fundamental aspect of any ***** spoken dialog ***** system. | ||
| narrative | 32 | |
| 2020.acl-main.765 A ***** narrative ***** plays a different role than the context (i.e., previous utterances), which is generally used in current dialogue systems. | ||
| 2021.naacl-main.342 In the pursuit of natural language understanding, there has been a long standing interest in tracking state changes throughout ***** narrative *****s. | ||
| 2020.lrec-1.415 This article describes the process of gathering and constructing a bilingual parallel corpus of Islamic Hadith, which is the set of ***** narrative *****s reporting different aspects of the prophet Muhammad's life. | ||
| 2020.nuse-1.7 We perform the ablation study and conclude that the inductive biases introduced by ARM are conducive to better performance on the ***** narrative ***** cloze test. | ||
| N18-2106 We present an initial approach for this problem, which finds correspondences between ***** narrative *****s in terms of plot events, and resemblances between characters and their social relationships. | ||
| rst discourse parsing | 32 | |
| 2020.codi-1.17 We present preliminary results on investigating the benefits of coreference resolution features for neural ***** RST discourse parsing ***** by considering different levels of coupling of the discourse parser with the coreference resolver. | ||
| D17-1133 Recent advances in ***** RST discourse parsing ***** have focused on two modeling paradigms: (a) high order parsers which jointly predict the tree structure of the discourse and the relations it encodes; or (b) linear-time parsers which are efficient but mostly based on local features. | ||
| J18-2001 Computational text-level discourse analysis mostly happens within Rhetorical Structure Theory (RST), whose structures have classically been presented as constituency trees, and relies on data from the RST Discourse Treebank (RST-DT); as a result, the ***** RST discourse parsing ***** community has largely borrowed from the syntactic constituency parsing community. | ||
| C16-1179 The main challenge for ***** RST discourse parsing ***** is the limited amounts of training data. | ||
| 2021.codi-main.15 While previous work significantly improves the performance of ***** RST discourse parsing *****, they are not readily applicable to practical use cases: (1) EDU segmentation is not integrated into most existing tree parsing frameworks, thus it is not straightforward to apply such models on newly-coming data. | ||
| literary | 32 | |
| 2020.lrec-1.105 The resource contains three types of data for the investigation and evaluation of quite distinct phenomena: TEI-compliant song lyrics as primary data, linguistically and ***** literary ***** motivated annotations, and extralinguistic metadata. | ||
| C16-2040 TopoText takes as input a ***** literary ***** piece of text such as a novel or a biography article and automatically extracts all place names in the text. | ||
| L14-1211 The article gives a short overview over the design of the corpus that has to serve quite different purposes from palaeographic over stemmatological to ***** literary ***** research. | ||
| L14-1591 Here, we present the original parallel ***** literary ***** corpus, then we address issues related to pos-tagging a large collection of Serbian text: from the conception of an appropriate tagset for Serbian, to the choice of an automatic pos-tagger adapted to the task, and then to some quantitative and qualitative results. | ||
| L16-1647 This paper presents a number of experiments to model changes in a historical Portuguese corpus composed of ***** literary ***** texts for the purpose of temporal text classification. | ||
| entailment recognition | 32 | |
| C16-1104 We combine i) health outcome detection, ii) keyphrase extraction, and iii) textual ***** entailment recognition ***** between sentences. | ||
| W18-5522 The sentences in these documents are then supplied to a textual ***** entailment recognition ***** module. | ||
| 2020.emnlp-main.125 Phrase alignment is the basis for modelling sentence pair interactions, such as paraphrase and textual ***** entailment recognition *****. | ||
| 2020.semeval-1.31 Lexical ***** entailment recognition ***** plays an important role in tasks like Question Answering and Machine Translation. | ||
| L10-1259 We focus on textual entailments mediated by syntax and propose a new methodology to evaluate textual ***** entailment recognition ***** systems on such data. | ||
| homographic pun | 32 | |
| S17-2079 We participated in 2 subtasks related to ***** homographic pun *****s and achieved comparable results for these tasks. | ||
| D18-1272 ***** Homographic pun *****s have a long history in human writing, widely used in written and spoken literature, which usually occur in a certain syntactic or stylistic structure. | ||
| S17-2011 Our system achieved f-score of calculating, 0.663, and 0.07 in ***** homographic pun *****s and 0.8439, 0.6631, and 0.0806 in heterographic puns in task 1, task 2, and task 3 respectively. | ||
| N19-1217 A pun is a form of wordplay for an intended humorous or rhetorical effect, where a word suggests two or more meanings by exploiting polysemy (***** homographic pun *****) or phonological similarity to another word (heterographic pun). | ||
| S17-2076 This paper describes our system participating in the SemEval-2017 Task 7, for the subtasks of ***** homographic pun ***** location and ***** homographic pun ***** interpretation. | ||
| rule-based | 32 | |
| W03-3019 We investigate an aspect of the relationship between parsing and corpus-based methods in NLP that has received relatively little attention: coverage augmentation in ***** rule-based ***** parsers. | ||
| L08-1457 In this paper we discuss a ***** rule-based ***** approach to chunking sentences in Croatian, implemented using local regular grammars within the NooJ development environment. | ||
| L12-1559 This paper describes a ***** rule-based ***** approach to segment Arabic texts into clauses. | ||
| L12-1640 We report on several experiments on combining a ***** rule-based ***** tagger and a trigram tagger for Spanish. | ||
| W18-0530 We present a novel ***** rule-based ***** system for automatic generation of factual questions from sentences, using semantic role labeling (SRL) as the main form of text analysis. | ||
| Japanese | 32 | |
| W16-4006 We are constructing annotated diachronic corpora of the ***** Japanese ***** language. | ||
| 2021.wat-1.12 For updating the translations of ***** Japanese ***** statutes based on their amendments, we need to consider the translation focality; that is, we should only modify expressions that are relevant to the amendment and retain the others to avoid misconstruing its contents. | ||
| 2001.mtsummit-papers.28 In ***** Japanese ***** constructions of the form [N1 no Adj N2], the adjective Adj modifies either N1 or N2. | ||
| L14-1103 Relations between frames and constructions must be made explicit in FrameNet-style linguistic resources such as Berkeley FrameNet (Fillmore & Baker, 2010; Fillmore, Lee-Goldman & Rhomieux, 2012), ***** Japanese ***** FrameNet (Ohara, 2013), and Swedish Constructicon (Lyngfelt et al., 2013). | ||
| L08-1077 In this paper we describe the construction of an illustrated ***** Japanese ***** Wordnet. | ||
| domain-specific | 32 | |
| 2021.mtsummit-asltrw.1 We address the problem of language model customization in applications where the ASR component needs to manage ***** domain-specific ***** terminology; although current state-of-the-art speech recognition technology provides excellent results for generic domains, the adaptation to specialized dictionaries or glossaries is still an open issue. | ||
| 2021.emnlp-main.96 Recent development in NLP shows a strong trend towards refining pre-trained models with a ***** domain-specific ***** dataset. | ||
| 2020.insights-1.16 Intent Detection systems in the real world are exposed to complexities of imbalanced datasets containing varying perception of intent, unintended correlations and ***** domain-specific ***** aberrations. | ||
| L08-1164 Semantic annotation of text requires the dynamic merging of linguistically structured information and a world model, usually represented as a ***** domain-specific ***** ontology. | ||
| L10-1146 Domain-specific entity recognition often relies on ***** domain-specific ***** knowledge to improve system performance. | ||
| Social media | 32 | |
| W19-1301 ***** Social media ***** sites like Facebook, Twitter, and other microblogging forums have emerged as a platform for people to express their opinions and views on different issues and events. | ||
| W19-3511 ***** Social media ***** platforms like Twitter and Instagram face a surge in cyberbullying phenomena against young users and need to develop scalable computational methods to limit the negative consequences of this kind of abuse. | ||
| 2020.figlang-1.11 ***** Social media ***** platforms and discussion forums such as Reddit, Twitter, etc. | ||
| 2019.jeptalnrecital-court.21 ***** Social media ***** networks have become a space where users are free to relate their opinions and sentiments, which may lead to a large spreading of hatred or abusive messages which have to be moderated. | ||
| R19-2002 ***** Social media ***** platforms have become prime forums for reporting news, with users sharing what they saw, heard or read on social media. | ||
| ordinal | 31 | |
| S18-1018 For the ***** ordinal ***** classification we additionally use our Brainy system with features using parse tree, POS tags, and morphological features. | ||
| S18-1057 We submitted the system's output for subtasks 1 (emotion intensity prediction), 2 (emotion ***** ordinal ***** classification), 3 (valence intensity regression) and 4 (valence ***** ordinal ***** classification), for English tweets. | ||
| 2021.acl-long.214 Ordinal Quantification (OQ) is a related task where the gold data is a distribution over ***** ordinal ***** classes, and the system is required to estimate this distribution. | ||
| 2021.emnlp-main.557 To support this investigation, we develop Wiki-Convert, a 900,000 sentence dataset annotated with numbers and units, to avoid conflating nominal and ***** ordinal ***** number occurrences. | ||
| D19-1182 We propose deep ***** ordinal ***** regression approaches for specificity prediction, under both supervised and semi-supervised settings, and provide empirical results demonstrating the effectiveness of the proposed techniques over several baseline approaches. | ||
| linearization | 31 | |
| N19-1238 We introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing ***** linearization ***** or hierarchical constraints. | ||
| 2000.amta-papers.7 This engine has been used successfully in creating an English ***** linearization ***** program that is currently employed as part of a Chinese-English machine translation system. | ||
| W17-3539 This demo paper presents the multilingual deep sentence generator developed by the TALN group at Universitat Pompeu Fabra, implemented as a series of rule-based graph-transducers for the syntacticization of the input graphs, the resolution of morphological agreements, and the ***** linearization ***** of the trees. | ||
| N19-1235 We show that a sequence-to-sequence model that maps a ***** linearization ***** of Dependency MRS, a graph-based representation of MRS, to text can achieve a BLEU score of 66.11 when trained on gold data. | ||
| 2020.acl-main.299 We propose a novel ***** linearization ***** of a constituent tree, together with a new locally normalized model | ||
| quantifying | 31 | |
| 2020.acl-main.262 In ***** quantifying ***** this uncertainty, our method, which we call Bernstein-bounded unfairness, helps prevent classifiers from being deemed biased or unbiased when there is insufficient evidence to make either claim. | ||
| D19-1531 However, most studies to date have focused on ***** quantifying ***** and mitigating such bias only in English. | ||
| 2020.emnlp-main.293 Previous work is mostly based on statistical methods that estimate word-level salience, which does not consider semantics and larger context when ***** quantifying ***** importance. | ||
| 2020.acl-main.265 Our analysis lays groundwork for future ***** quantifying ***** and mitigating bias in NRE. | ||
| 2020.emnlp-main.71 Extensive work in linguistic typology has sought to characterize word class flexibility across languages, but ***** quantifying ***** this phenomenon accurately and at scale has been fraught with difficulties | ||
| applicability | 31 | |
| C18-1034 Among the solutions to alleviate this problem is the automatic evaluation of sentence readability, task which has been receiving a lot of attention due to its large ***** applicability *****. | ||
| L12-1569 Also, some user-based scenarios are mentioned to demonstrate the corpus services and ***** applicability *****. | ||
| 2016.gwc-1.50 Using WordNet instead of a domain-specific ontology or classification system ensures ***** applicability ***** of the method outside of the folktale domain. | ||
| W18-3813 We present a case study for the language pair English — German using the FrameNet and SALSA corpora and find that inferences can be made about cross-lingual frame ***** applicability ***** using a vector space model. | ||
| P17-2030 We evaluate our model with respect to dependency accuracy and grammaticality improvements for ungrammatical sentences, demonstrating the robustness and ***** applicability ***** of our scheme | ||
| repositories | 31 | |
| S18-1132 Large ***** repositories ***** of scientific literature call for the development of robust methods to extract information from scholarly papers. | ||
| L14-1524 Maintaining isolated ***** repositories ***** with overlapping data is costly in terms of time and effort. | ||
| 2021.eacl-srw.7 Institutes are required to catalog their articles with proper subject headings so that the users can easily retrieve relevant articles from the institutional ***** repositories *****. | ||
| L12-1647 In its current version, META-SHARE features 13 resource ***** repositories *****, with over 1200 resource packages. | ||
| L08-1108 Configurability of such frameworks and expressiveness of feature structure-based annotation schemes account for the high density of some such annotation ***** repositories ***** | ||
| tensor | 31 | |
| P18-2002 In this paper, we introduce restricted recurrent neural ***** tensor ***** networks (r-RNTN) which reserve distinct hidden layer weights for frequent vocabulary words while sharing a single set of weights for infrequent words. | ||
| P19-1355 As a result these models are costly to train and develop, both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint required to fuel modern ***** tensor ***** processing hardware. | ||
| 2021.emnlp-main.625 The semantic filter module can be added to most geometric and ***** tensor ***** decomposition models with minimal additional memory. | ||
| P18-1209 However, these methods often suffer from exponential increase in dimensions and in computational complexity introduced by transformation of input into ***** tensor *****. | ||
| N18-1082 We then derive preposition embeddings via ***** tensor ***** decomposition on a large unlabeled corpus | ||
| calibration | 31 | |
| 2020.emnlp-main.667 Finally, to motivate the utility of ***** calibration ***** for KGE from a practitioner's perspective, we conduct a unique case study of human-AI collaboration, showing that calibrated predictions can improve human performance in a knowledge graph completion task. | ||
| 2020.emnlp-main.102 Our experiments demonstrate that the proposed method outperforms existing ***** calibration ***** methods for text classification in terms of expectation ***** calibration ***** error, misclassification detection, and OOD detection on six datasets. | ||
| 2021.emnlp-main.835 We categorize these examples as exhibiting a background shift or semantic shift, and find that the two major approaches to OOD detection, ***** calibration ***** and density estimation (language modeling for text), have distinct behavior on these types of OOD data. | ||
| 2021.bea-1.16 Classical approaches to question ***** calibration ***** are either subjective or require newly created questions to be deployed before being calibrated. | ||
| N18-1148 In this paper, we identify and differentiate between two relevant data generating scenarios (intrinsic vs. extrinsic labels), introduce a simple but novel method which emphasizes the importance of ***** calibration *****, and then analyze and experimentally validate the appropriateness of various methods for each of the two scenarios | ||
| aggregated | 31 | |
| 2020.findings-emnlp.375 Such conclusions about system and human performance are, however, based on estimates ***** aggregated ***** from scores collected over large test sets of translations and unfortunately leave some remaining questions unanswered. | ||
| 2021.naacl-main.59 At inference time, we use the labels of the retrieved spans to construct the final structure with the highest ***** aggregated ***** score. | ||
| L14-1351 We present the Weltmodell, a commonsense knowledge base that was automatically generated from ***** aggregated ***** dependency parse fragments gathered from over 3.5 million English language books. | ||
| L16-1328 In distributional semantics words are represented by ***** aggregated ***** context features. | ||
| D18-1148 We make our ***** aggregated ***** and anonymized community-level data, derived from 37 billion tweets – over 1 billion of which were mapped to counties, available for research | ||
| hallucination | 31 | |
| D18-1437 We analyze how captioning model architectures and learning objectives contribute to object ***** hallucination *****, explore when ***** hallucination ***** is likely due to image misclassification or language priors, and assess how well current sentence metrics capture object ***** hallucination *****. | ||
| 2021.naacl-main.475 We analyze the typical ***** hallucination ***** phenomenon by different types of neural summarization systems, in hope to provide insights for future work on the direction. | ||
| 2020.sigmorphon-1.1 Most teams demonstrate utility of data ***** hallucination ***** and augmentation, ensembles, and multilingual training for low-resource languages. | ||
| 2021.eacl-main.236 Our analysis shows that higher predictive uncertainty corresponds to a higher chance of ***** hallucination *****. | ||
| 2021.blackboxnlp-1.10 Large language models are known to suffer from the *****hallucination***** problem in that they are prone to output statements that are false or inconsistent, indicating a lack of knowledge. | ||
| lemmatized | 31 | |
| L14-1464 In our evaluation against a gold standard, we compare different pre-processing strategies (***** lemmatized ***** vs. inflected forms) and introduce language model scores of synonym candidates in the context of the input particle verb as well as distributional similarity as additional re-ranking criteria. | ||
| L08-1449 We propose precision-oriented semiautomatic extraction which can operate on tokenized, tagged and ***** lemmatized ***** texts. | ||
| W16-5414 Every token in the AVMWE list is ***** lemmatized ***** and tagged with POS information. | ||
| D19-6310 The Multilingual Surface Realization Shared Task 2019 focuses on generating sentences from ***** lemmatized ***** sets of universal dependency parses with rich features. | ||
| 2019.gwc-1.22 It is extracted from two tokenised and ***** lemmatized ***** scenarios pertaining to two imagined microworlds in which the robot is supposed to play an assistive role | ||
| unconstrained | 31 | |
| 2020.vardial-1.24 We submitted solutions to all subtasks but focused our development efforts on the CH subtask, where we achieved third place out of 16 submissions with a median distance of 15.93 km and had the best result of 14 ***** unconstrained ***** systems. | ||
| 2021.vardial-1.16 Following our successful participation at VarDial 2020, we again propose constrained and ***** unconstrained ***** systems based on the BERT architecture. | ||
| K19-1045 For parsing, unlike ***** unconstrained ***** approaches, our algorithm always generates valid output, incurring only a small drop in performance. | ||
| P18-4016 The demonstration will allow users to issue ***** unconstrained ***** spoken language commands to ScoutBot. | ||
| 2020.vardial-1.19 Our solutions are based on the BERT Transformer models, the constrained versions of our models reaching 1st place in two subtasks and 3rd place in one subtask, while our ***** unconstrained ***** models outperform all the constrained systems by a large margin | ||
| overfit | 31 | |
| 2021.cl-3.18 Additional analysis shows that several systems ***** overfit ***** on the structure of the ECB+ corpus. | ||
| 2021.eacl-main.220 Despite advances in modeling techniques, abstractive summarization models still suffer from several key challenges: (i) layout bias: they ***** overfit ***** to the style of training corpora; (ii) limited abstractiveness: they are optimized to copying n-grams from the source rather than generating novel abstractive summaries; (iii) lack of transparency: they are not interpretable. | ||
| 2020.acl-main.532 In this paper, we show that neural machine translation (NMT) systems trained on large back-translated data ***** overfit ***** some of the characteristics of machine-translated texts. | ||
| 2021.emnlp-main.579 However, non-parametric methods are prone to ***** overfit ***** the retrieved examples. | ||
| N18-1162 Second, the conditional VAE structure whose generation process is conditioned on a context, makes the range of training targets very sparse; that is, the RNN decoders can easily ***** overfit ***** to the training data ignoring the latent variables | ||
| Slot | 31 | |
| 2021.dravidianlangtech-1.11 We hand-curate a new test dataset in two low-resource Dravidian languages and show the significance and impact of our training dataset construction using a state-of-the-art mBERT model - achieving a ***** Slot ***** F1 of 81.51 (Kannada) and 78.82 (Tamil) on our test sets. | ||
| 2020.acl-main.567 We propose a Dialogue State Tracker with ***** Slot ***** Attention and ***** Slot ***** Information Sharing (SAS) to reduce redundant information's interference and improve long dialogue context tracking. | ||
| W19-5911 ***** Slot ***** filling is a core operation for utterance understanding in task-oriented dialogue systems | ||
| N18-3019 *****Slot***** tagging, the task of detecting entities in input user utterances, is a key component of natural language understanding systems for personal digital assistants. | ||
| 2020.emnlp-main.152 *****Slot***** filling and intent detection are two main tasks in spoken language understanding (SLU) system. | ||
| Counterfactual | 31 | |
| D19-1530 An alternative approach is ***** Counterfactual ***** Data Augmentation (CDA), in which a corpus is duplicated and augmented to remove bias, e.g. by swapping all inherently-gendered words in the copy. | ||
| D19-1509 *****Counterfactual***** reasoning requires predicting how alternative events, contrary to what actually happened, might have resulted in different outcomes. | ||
| P18-1169 *****Counterfactual***** learning from human bandit feedback describes a scenario where user feedback on the quality of outputs of a historic system is logged and used to improve a target system. | ||
| P17-2103 *****Counterfactual***** statements, describing events that did not occur and their consequents, have been studied in areas including problem-solving, affect management, and behavior regulation. | ||
| 2021.emnlp-main.568 *****Counterfactual***** statements describe events that did not or cannot take place. | ||
| thematic | 31 | |
| 2020.ccl-1.83 In our method, the keywords planning strategy is used to improve ***** thematic ***** consistency while the CVAE module allows enhancing wording diversity. | ||
| C18-1167 The task is to assign the literary classification to a full-length book belonging to a corpus of literature, where the works on average are well over 200,000 words long and genre is an abstract ***** thematic ***** concept. | ||
| 1998.amta-papers.4 Specifically we describe a two-step process for creating candidate ***** thematic ***** grids for Mandarin Chinese verbs, using the English verb heading the VP in the subdefinitions to separate senses, and roughly parsing the verb complement structure to match ***** thematic ***** structure templates. | ||
| L14-1164 This concept has been verified by the process of extending descriptions stored in ***** thematic ***** Digital Library of Polish and Poland-related Ephemeral Prints from the 16th, 17th and 18th Centuries with extended item-associated information provided by historians, philologists, librarians and computer scientists | ||
| D19-1180 According to screenwriting theory, turning points (e.g., change of plans, major setback, climax) are crucial narrative moments within a screenplay: they define the plot structure, determine its progression and segment the screenplay into *****thematic***** units (e.g., setup, complications, aftermath). | ||
| executable | 31 | |
| D19-6111 Semantic parsers are used to convert user's natural language commands to ***** executable ***** logical form in intelligent personal agents. | ||
| 2021.naacl-main.219 Due to the large search space of ***** executable ***** programs, conventional methods that use beam-search for approximation, such as self-training and top-k marginal likelihood training, do not perform as well. | ||
| 2020.acl-main.684 By leveraging the idea of inverse semantics from program synthesis to reason backwards from observed demonstrations, we ensure that all considered interpretations are consistent with ***** executable ***** actions in any context, thus simplifying the problem of search over logical forms. | ||
| 2021.iwcs-1.17 We adopt, evaluate, and improve upon a two-step natural language understanding (NLU) pipeline that incrementally tames the variation of unconstrained natural language input and maps to ***** executable ***** robot behaviors. | ||
| 2021.acl-short.121 In this paper, we propose sequence-to-general tree (S2G) that learns to generate interpretable and ***** executable ***** operation trees where the nodes can be formulas with an arbitrary number of arguments | ||
| inflectional morphology | 31 | |
| 2020.lrec-1.344 We present the first resource focusing on the verbal ***** inflectional morphology ***** of San Juan Quiahije Chatino, a tonal mesoamerican language spoken in Mexico. | ||
| 2021.eacl-main.264 However, much of such work focused almost exclusively on English — a language with rigid word order and a lack of ***** inflectional morphology *****. | ||
| 2020.acl-main.263 We perturb the ***** inflectional morphology ***** of words to craft plausible and semantically similar adversarial examples that expose these biases in popular NLP models, e.g., BERT and Transformer, and show that adversarially fine-tuning them for a single epoch significantly improves robustness without sacrificing performance on clean data. | ||
| P19-1491 We speculated that ***** inflectional morphology ***** may be the primary culprit for the discrepancy | ||
| 2004.amta-papers.6 We demonstrate that normalizing ***** inflectional morphology ***** improves the perplexity of models and reduces alignment errors. | ||
| complementary | 31 | |
| P19-1468 In this paper, we show that these two problems are actually ***** complementary *****. | ||
| L12-1052 Overall our study shows that eye tracking can give ***** complementary ***** information to error analysis, such as aiding in ranking error types for seriousness. | ||
| D18-1244 We also show through detailed analysis that this model has ***** complementary ***** strengths to sequence models, and combining them further improves the state of the art. | ||
| 2020.coling-main.97 In this paper, we attempt to leverage the potential ***** complementary ***** information among distinct sources and alleviate the occasional conflicts of them. | ||
| 2020.insights-1.11 We show that graph embeddings are modestly ***** complementary ***** with text embeddings, but the low performance of graph embedding features alone indicate that the model fails to capture topological features pertinent of the topic prediction task | ||
| semantically annotated | 31 | |
| L14-1317 The text is lexically and ***** semantically annotated ***** on the basis of a lexicon and a domain ontology, the former structuring the most relevant terms occurring in the text and the latter representing the domain entities of interest (e.g. people, places, etc.). | ||
| 2004.amta-papers.26 In this paper, we describe the creation of an interlingua and the development of a corpus of ***** semantically annotated ***** text, to be validated in six languages and evaluated in several ways. | ||
| 2020.lrec-1.783 However, to provide sufficient training data for the speech recognition system, many hours of air traffic communications have to be transcribed and ***** semantically annotated *****. | ||
| 2020.emnlp-main.663 We describe a method for developing broad-coverage semantic dependency parsers for languages for which no ***** semantically annotated ***** resource is available. | ||
| L12-1299 What would be a good method to provide a large collection of ***** semantically annotated ***** texts with formal, deep semantics rather than shallow? | ||
| overview | 31 | |
| 2021.acl-tutorials.5 This tutorial will ***** overview ***** the computational modeling of prosody, including recent advances and diverse actual and potential applications. | ||
| D17-1123 We also ***** overview ***** resources for evaluation and discuss challenges for future research. | ||
| 2021.emnlp-demo.18 The Press Freedom Monitor enables the monitoring experts to get a fast ***** overview ***** over recently reported incidents and it has shown an impressive performance in this regard. | ||
| 2015.jeptalnrecital-invite.1 Multilinguality is a key feature of today's Web, and it is this feature that we leverage and exploit in our research work at the Sapienza University of Rome's Linguistic Computing Laboratory, which I am going to ***** overview ***** and showcase in this talk. | ||
| 2020.lrec-1.324 This paper gives a preliminary ***** overview ***** of the current state of the project and details our workflow, in particular standardization of formats and conventions, the addition of segmental alignments with WebMAUS, and DoReCo's applicability for subsequent research programs | ||
| commonsense inference | 31 | |
| P18-1043 In addition, we demonstrate how ***** commonsense inference ***** on people's intents and reactions can help unveil the implicit gender inequality prevalent in modern movie scripts. | ||
| 2021.emnlp-main.303 LogicNLI is an NLI-style dataset that effectively disentangles the target FOL reasoning from ***** commonsense inference ***** and can be used to diagnose LMs from four perspectives: accuracy, robustness, generalization, and interpretability. | ||
| 2020.repl4nlp-1.8 Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model boosts the fine-tuning performance of RoBERTa, achieving competitive results on multiple reading comprehension datasets that require ***** commonsense inference *****. | ||
| 2021.emnlp-main.598 Pre-trained language models (PTLMs) have achieved impressive performance on ***** commonsense inference ***** benchmarks, but their ability to employ commonsense to make robust inferences, which is crucial for effective communications with humans, is debated. | ||
| 2021.sigdial-1.33 In this work, we introduce CIDER – a manually curated dataset that contains dyadic dialogue explanations in the form of implicit and explicit knowledge triplets inferred using contextual ***** commonsense inference ***** | ||
| claim | 31 | |
| 2021.acl-long.347 We construct a novel dataset of WhatsApp tipline and public group messages alongside fact-checked *****claim*****s that are first annotated for containing “*****claim*****-like statements” and then matched with potentially similar items and annotated for ***** claim ***** matching. | ||
| 2020.argmining-1.2 Our IAA study reports substantial agreement scores for argumentativeness detection (0.76 Fleiss' kappa) and moderate agreement for ***** claim ***** labelling (0.45 Fleiss' kappa). | ||
| N19-1054 Empirical results show that using this approach improves the state of art performance across four benchmark argumentation data sets by an average of 4 absolute F1 points in ***** claim ***** detection. | ||
| 2020.insights-1.11 We study the effectiveness of using neural-graph embedding features for ***** claim ***** topic prediction and their complementarity with text embeddings. | ||
| N18-1074 Thus we believe that FEVER is a challenging testbed that will help stimulate progress on ***** claim ***** verification against textual sources | ||
| spectral clustering | 31 | |
| L08-1288 However, ***** spectral clustering ***** needs to solve an eigenvalue problem of the matrix converted from the similarity matrix corresponding to the data set. | ||
| Q15-1002 Second, we reduce the computational cost by a new algorithm that embeds sample-based bisection, using ***** spectral clustering ***** or graph partitioning, in a hierarchical clustering process. | ||
| W18-1705 Motivated by the document-term co-clustering framework by Dhillon (2001), we propose a landmark-based scalable ***** spectral clustering ***** approach in which we first use the selected landmark set and the given data to form a bipartite graph and then run a diffusion process on it to obtain a family of diffusion coordinates for clustering. | ||
| P17-1087 For ***** spectral clustering ***** using such word embeddings, words are points in a vector space where synonyms are linked with positive weights, while antonyms are linked with negative weights | ||
| L16-1448 Our method is, for the most part, unsupervised: we use the *****spectral clustering***** algorithm described in Brew and Schulte im Walde (2002) to build a noise model from a short, manually verified seed list of verbs. | ||
| feature structures | 31 | |
| 1993.iwpt-1.21 It is briefly sketched how the parser can be enhanced with ***** feature structures *****. | ||
| 1993.iwpt-1.4 In this paper we consider the merging of the language of ***** feature structures ***** with a formal logical language, and how the semantic definition of the resulting language can be used in parsing. | ||
| W03-3010 These penalties are based on the signature and the shape of the ***** feature structures *****, and thus realise an elegant and general approach to relaxation. | ||
| W89-0241 The parsing algorithm operates in a systematic bottom-up (BU) fashion, thus taking earliest advantage of LFG's concentration of information in the lexicon and also making use of unrestricted ***** feature structures ***** to realize LFG's Top-Down (TD) predictive potential. | ||
| 1995.iwpt-1.9 These instructions manipulate ***** feature structures ***** by means of features, equality, and typing, and manipulate the program state by search and sequencing operations. | ||
| open information | 31 | |
| 2020.findings-emnlp.99 In this paper, we propose Multi^2OIE, which performs ***** open information ***** extraction (open IE) by combining BERT with multi-head attention. | ||
| L16-1732 Capturing common-sense and domain-specific knowledge can be achieved by taking advantage of recent advances in ***** open information ***** extraction (IE) techniques and, more importantly, of knowledge embeddings, which are multi-dimensional representations of concepts and relations. | ||
| D18-1129 SalIE is unsupervised and knowledge agnostic, based on ***** open information ***** extraction to detect facts in natural language text, PageRank to determine their relevance, and clustering to promote diversity. | ||
| 2020.acl-demos.8 It is supported by novel data-driven methods for distantly supervised named entity recognition and ***** open information ***** extraction. | ||
| D19-1067 We propose a novel supervised ***** open information ***** extraction (Open IE) framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance. | ||
| workshop | 31 | |
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual composition), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. | ||
| 2020.fnp-1.17 We participate in the FNS-Summarisation 2020 shared task to be held at FNP 2020 ***** workshop ***** at COLING 2020. | ||
| S19-2119 The OffensEval Shared Task was conducted in SemEval 2019 ***** workshop *****. | ||
| L14-1379 It was created for Tweet-Norm, a tweet normalization ***** workshop ***** and shared task, and is the result of a joint annotation effort from different research groups. | ||
| 2010.amta-workshop.2 The recent emergence of crowdsourced translation à la Facebook or Twitter has exposed a raw nerve in the translation industry. | ||
| text data | 31 | |
| 2020.aacl-main.29 In this paper, we propose a method which trains the generation model in a completely unsupervised way with unaligned raw ***** text data ***** and KB triples. | ||
| W18-3304 In this paper, we propose a new method of learning about the hidden representations between just speech and ***** text data ***** using convolutional attention networks. | ||
| D19-1299 Experimental results on twenty-one Wikipedia infobox-to-*****text data*****sets show our model, KBAtt, consistently improves a state-of-the-art model on most of the datasets. | ||
| 2020.semeval-1.226 For sub-task 1, we use contextual embeddings extracted from pre-trained transformer models to represent the ***** text data ***** at various granularities and propose a multi-granularity knowledge sharing approach. | ||
| D19-1029 It improves F1 score relatively by 4.2% on BioNLP2013 and by 6.2% on a new bio-*****text data*****set for tuple extraction. | ||
| topic model | 31 | |
| Q13-1008 Supervised learning methods and LDA based ***** topic model ***** have been successfully applied in the field of multi-document summarization. | ||
| W18-5902 The occurrence of stance-taking towards vaccination was measured in documents extracted by *****topic model*****ling from two different corpora, one discussion forum corpus and one tweet corpus. | ||
| E17-1033 Our results show that the *****topic model*****ing experts reach substantial improvements when compared to the general versions. | ||
| E17-1091 We outline a novel method using correspondence ***** topic model *****s and a lightweight manual process to reduce noise from mis-labeled data in the training set. | ||
| D19-1049 Prior approaches including bag-of-words model and probabilistic ***** topic model ***** are less effective to deal with the vocabulary mismatch and partial topic overlap between the submission and reviewer. | ||
| target domain | 31 | |
| 2020.coling-main.603 Motivated by the latest advances, in this survey we review neural unsupervised domain adaptation techniques which do not require labeled ***** target domain ***** data. | ||
| 2021.emnlp-main.442 Extensive experiments on four benchmarks show that PDALN can effectively adapt high-resource domains to low-resource *****target domain*****s, even if they are diverse in terms and writing styles. | ||
| D18-1190 Our model reaches an average accuracy of 53.4% on 7 domains in the OVERNIGHT dataset, substantially better than other zero-shot baselines, and performs as good as a parser trained on over 30% of the ***** target domain ***** examples. | ||
| W18-4924 Our results show that synthetic methods can be effective at significantly reducing parsing errors for a ***** target domain ***** without having to invest large resources on manual annotation; and the combination of manual and synthetic methods is our best domain-independent performer. | ||
| W18-0903 We undertake a corpus-based analysis of predicate-argument constructions and their metaphoric properties, and attempt to effectively represent syntactic constructions as features for metaphor processing, both in identifying source and ***** target domain *****s and in distinguishing metaphoric words from non-metaphoric. | ||
| knowledge extraction | 31 | |
| 2020.acl-demos.11 We present the first comprehensive, open source multimedia ***** knowledge extraction ***** system that takes a massive stream of unstructured, heterogeneous multimedia data from various sources and languages as input, and creates a coherent, structured knowledge base, indexing entities, relations, and events, following a rich, fine-grained ontology. | ||
| C18-2005 The increased demand for structured knowledge has created considerable interest in ***** knowledge extraction ***** from natural language sentences. | ||
| 2020.coling-main.84 For that purpose, we develop a high precision ***** knowledge extraction ***** pipeline tailored for the financial domain. | ||
| L06-1206 A new direction is needed, based on an automated approach to ***** knowledge extraction *****. | ||
| 2021.eacl-main.39 We show that by separating the two stages, i.e., ***** knowledge extraction ***** and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner. | ||
| terminology extraction | 31 | |
| L08-1406 It includes modules that operate on word lists or texts and allow to perform various linguistic annotation, classification and clustering tasks, including language detection, POS-tagging, base form reduction, named entity recognition, and ***** terminology extraction *****. | ||
| L12-1479 The Quaero program has organized a set of evaluations for ***** terminology extraction ***** systems in 2010 and 2011. | ||
| L10-1619 In the KYOTO project, we apply language-neutral ***** terminology extraction ***** from a parsed corpus for seven languages. | ||
| L10-1030 We provide a detailed description of the characteristics of this new collection as well results of an application of the corpus on term management tasks, including terminology validation and ***** terminology extraction *****. | ||
| L12-1436 We designed an application for German and English data that serves as a first evaluation of the methods for ***** terminology extraction ***** used in the project. | ||
| figurative language | 31 | |
| 2021.emnlp-main.592 When faced with dialog contexts consisting of ***** figurative language *****, some models show very large drops in performance compared to contexts without ***** figurative language *****. | ||
| 2021.insights-1.15 In natural language understanding, topics that touch upon ***** figurative language ***** and pragmatics are notably difficult. | ||
| 2020.emnlp-main.571 While hyperbole is one of the most prevalent rhetorical devices, it is arguably one of the least studied devices in the ***** figurative language ***** processing community. | ||
| W18-0902 The purpose of this study is to assess whether linguistic features can help explain differences in quality of ***** figurative language *****. | ||
| 2021.wanlp-1.42 The prominence of ***** figurative language ***** devices, such as sarcasm and irony, poses serious challenges for Arabic Sentiment Analysis (SA). | ||
| distantly supervised relation | 31 | |
| 2021.naacl-main.2 We propose a multi-task, probabilistic approach to facilitate ***** distantly supervised relation ***** extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. | ||
| D18-1247 In this paper, we aim to incorporate the hierarchical information of relations for ***** distantly supervised relation ***** extraction and propose a novel hierarchical attention scheme. | ||
| D18-1245 Attention mechanism is often used in deep neural networks for ***** distantly supervised relation ***** extraction (DS-RE) to distinguish valid from noisy instances. | ||
| P18-2015 This paper addresses the tasks of automatic seed selection for bootstrapping relation extraction, and noise reduction for ***** distantly supervised relation ***** extraction. | ||
| C18-1036 Meanwhile, the useful information expressed in knowledge graph is still underutilized in the state-of-the-art methods for ***** distantly supervised relation ***** extraction. | ||
| neural model | 31 | |
| Q18-1005 Specifically, we embed a differentiable non-projective parsing algorithm into a ***** neural model ***** and use attention mechanisms to incorporate the structural biases. | ||
| P17-2025 Recent work has proposed several generative ***** neural model *****s for constituency parsing that achieve state-of-the-art results. | ||
| W18-6112 Meanwhile, there is plenty of evidence to the effectiveness of character-based ***** neural model *****s in mitigating this OOV problem. | ||
| W19-4411 We examine this claim in ***** neural model *****s for content scoring. | ||
| D18-1263 Beyond SDP, our linearization technique opens the door to integration of graph-based semantic representations as features in ***** neural model *****s for downstream applications. | ||
| study | 31 | |
| 2010.amta-papers.6 In this paper, we present the insights gained from a detailed ***** study ***** of coupling a highly modular English-Hindi RBMT system with a standard phrase-based SMT system. | ||
| D19-5817 Our ***** study ***** suggests that while current metrics may be suitable for existing QA datasets, they limit the complexity of QA datasets that can be created. | ||
| E17-1062 This ***** study ***** introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the discourse context and variation. | ||
| P18-2091 This paper presents the first ***** study ***** aimed at capturing stylistic similarity between words in an unsupervised manner. | ||
| 2019.iwslt-1.26 We ***** study ***** here a related setting, multi-domain adaptation, where the number of domains is potentially large and adapting separately to each domain would waste training resources. | ||
| emotions | 31 | |
| D18-1379 Moreover, as ***** emotions ***** might be evoked by hidden topics, it is important to unveil and incorporate such topical information to understand how the ***** emotions ***** are evoked. | ||
| 2020.lrec-1.203 Even though the majority of emotion expressed implicitly, most previous attempts at ***** emotions ***** have focused on the examination of explicit ***** emotions *****. | ||
| C18-2003 This tool uniquely combines state-of-the-art distributional semantics with a nuanced model of human ***** emotions *****, two information streams we deem beneficial for a data-driven interpretation of texts in the humanities. | ||
| L08-1159 Yet, building such models requires appropriate definition of various levels for representing the ***** emotions ***** themselves but also some contextual information such as the events that elicit these ***** emotions *****. | ||
| 2021.wassa-1.10 This paper presents the results that were obtained from the WASSA 2021 shared task on predicting empathy and ***** emotions *****. | ||
| important | 31 | |
| P19-1182 This ***** important ***** problem has not been explored mostly due to lack of datasets and effective models. | ||
| 2021.triton-1.5 Due to the wide-spread development of Machine Translation (MT) systems –especially Neural Machine Translation (NMT) systems– MT evaluation, both automatic and human, has become more and more ***** important ***** as it helps us establish how MT systems perform. | ||
| D18-1379 Moreover, as emotions might be evoked by hidden topics, it is ***** important ***** to unveil and incorporate such topical information to understand how the emotions are evoked. | ||
| 2021.triton-1.10 Named Entities play an ***** important ***** role in different NLP tasks such as Information Extraction, Question Answering and Machine Translation. | ||
| 2020.lrec-1.788 The data confirm the ***** important ***** role played by low pitch accents in Urdu spontaneous speech, in line with previous studies on Urdu/Hindi scripted speech. | ||
| review | 31 | |
| 2020.coling-main.45 Experiments over an Amazon ***** review ***** dataset indicate superior performance of the proposed method. | ||
| 2021.eacl-main.229 The framework enables the use of all input ***** review *****s by first condensing them into multiple dense vectors which serve as input to an abstractive model. | ||
| D19-1236 The ***** review ***** and selection process for scientific paper publication is essential for the quality of scholarly publications in a scientific field. | ||
| W19-0509 We evaluate the proposed method on a set of over 15,000 hospital ***** review *****s. | ||
| 2021.acl-long.4 We specify 29 model functionalities motivated by a ***** review ***** of previous research and a series of interviews with civil society stakeholders. | ||
| pronoun prediction | 31 | |
| W17-4807 This paper describes the UU-Hardmeier system submitted to the DiscoMT 2017 shared task on cross-lingual ***** pronoun prediction *****. | ||
| 2017.jeptalnrecital-recital.1 In this paper, we present and critique three experiments for the integration of context into a MT system, each focusing on a different type of context and exploiting a different method: adaptation to speaker gender, cross-lingual ***** pronoun prediction ***** and the generation of tag questions from French into English. | ||
| W17-4806 In this paper we present our systems for the DiscoMT 2017 cross-lingual ***** pronoun prediction ***** shared task. | ||
| W17-4801 We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual ***** pronoun prediction *****. | ||
| W17-4808 In this paper we present our system in the DiscoMT 2017 Shared Task on Crosslingual ***** Pronoun Prediction *****. | ||
| textual entailment recognition | 31 | |
| C16-1104 We combine i) health outcome detection, ii) keyphrase extraction, and iii) ***** textual entailment recognition ***** between sentences. | ||
| W18-5522 The sentences in these documents are then supplied to a ***** textual entailment recognition ***** module. | ||
| 2020.emnlp-main.125 Phrase alignment is the basis for modelling sentence pair interactions, such as paraphrase and ***** textual entailment recognition *****. | ||
| L10-1259 We focus on textual entailments mediated by syntax and propose a new methodology to evaluate ***** textual entailment recognition ***** systems on such data. | ||
| P17-1071 Our experiments on 8 different datasets show very encouraging results in paraphrase detection, ***** textual entailment recognition ***** and ranking relevance. | ||
| taxonomy induction | 31 | |
| P19-1474 We introduce the use of Poincaré embeddings to improve existing state-of-the-art approaches to domain-specific ***** taxonomy induction ***** from text as a signal for both relocating wrong hyponym terms within a (pre-induced) taxonomy as well as for attaching disconnected terms in a taxonomy. | ||
| P18-1229 We present a novel end-to-end reinforcement learning approach to automatic ***** taxonomy induction ***** from a set of terms. | ||
| D19-3005 In this paper, we describe these ***** taxonomy induction ***** and expansion features of KGIS. | ||
| L08-1267 We show that the ***** taxonomy induction ***** process is highly reliable - evaluated against the German version of WordNet, GermaNet, the resource obtained shows an accuracy of 83.34%. | ||
| L16-1236 We designed a statistically-based ***** taxonomy induction ***** algorithm consisting of a combination of different strategies not involving explicit linguistic knowledge. | ||
| chinese poetry | 31 | |
| D18-1430 Most previous works on automatic ***** Chinese poetry ***** generation focused on improving the coherency among lines. | ||
| U19-1002 In this paper, we adapt Deep-speare, a joint neural network model for English sonnets, to ***** Chinese poetry *****. | ||
| K18-1024 As a precious part of the human cultural heritage, ***** Chinese poetry ***** has influenced people for generations. | ||
| C16-1100 ***** Chinese poetry ***** generation is a very challenging task in natural language processing. | ||
| D19-1637 Classical ***** Chinese poetry ***** is a jewel in the treasure house of Chinese culture. | ||
| cross-domain sentiment classification | 31 | |
| 2020.aacl-main.87 Experimental results on two ***** cross-domain sentiment classification ***** datasets show that the proposed method reports consistently good performance across domains, and at times outperforming more complex prior proposals. | ||
| P18-1089 Owing to these differences, ***** cross-domain sentiment classification ***** is still a challenging task. | ||
| S18-2030 We evaluate the proposed CP-decomposition-based feature expansion method on benchmark datasets for ***** cross-domain sentiment classification ***** and short-text classification. | ||
| 2020.acl-main.370 ***** Cross-domain sentiment classification ***** aims to address the lack of massive amounts of labeled data. | ||
| K17-1040 We experiment with the task of ***** cross-domain sentiment classification ***** on 16 domain pairs and show substantial improvements over strong baselines. | ||
| adversarial input | 31 | |
| 2020.findings-emnlp.103 Prior work on ***** adversarial inputs ***** typically studies model oversensitivity: semantically invariant text perturbations that cause a model's prediction to change. | ||
| 2021.emnlp-tutorials.5 In particular, we will review recent studies on analyzing the weakness of NLP systems when facing ***** adversarial inputs ***** and data with a distribution shift. | ||
| P19-1147 This work examines the robustness of self-attentive neural networks against ***** adversarial input ***** perturbations. | ||
| 2021.eacl-main.71 Analysis of these attacks on the state of the art transformers in NLP can help improve the robustness of these models against such ***** adversarial inputs *****. | ||
| N18-1024 We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging ***** adversarial input *****, further contributing to the development of an approach that strengthens the validity of neural essay scoring models. | ||
| open-domain | 31 | |
| 2021.naacl-main.124 Existing dialogue corpora and models are typically designed under two disjoint motives: while task-oriented systems focus on achieving functional goals (e.g., booking hotels), ***** open-domain ***** chatbots aim at making socially engaging conversations. | ||
| D19-5806 A key challenge of multi-hop question answering (QA) in the ***** open-domain ***** setting is to accurately retrieve the supporting passages from a large corpus. | ||
| 2021.naacl-main.43 Recent advances in ***** open-domain ***** QA have led to strong models based on dense retrieval, but only focused on retrieving textual passages. | ||
| P19-1659 In this paper, we study abstractive summarization for ***** open-domain ***** videos. | ||
| C16-1189 In this study, we applied a deep LSTM structure to classify dialogue acts (DAs) in ***** open-domain ***** conversations. | ||
| data-driven | 31 | |
| W19-0607 This paper investigates ***** data-driven ***** segmentation using Re-Pair or Byte Pair Encoding techniques. | ||
| W18-4503 We present a ***** data-driven ***** approach to detect periods of linguistic change and the lexical and grammatical features contributing to change. | ||
| 2020.acl-main.685 We present a method for combining multi-agent communication and traditional ***** data-driven ***** approaches to natural language learning, with an end goal of teaching agents to communicate with humans in natural language. | ||
| 2021.acl-short.6 Humor recognition has been widely studied as a text classification problem using ***** data-driven ***** approaches. | ||
| 2020.emnlp-main.603 The lack of large and diverse discourse treebanks hinders the application of ***** data-driven ***** approaches, such as deep-learning, to RST-style discourse parsing. | ||
| Code- | 31 | |
| 2021.emnlp-main.499 *****Code-*****switching is the communication phenomenon where the speakers switch between different languages during a conversation. | ||
| 2020.semeval-1.123 *****Code-*****switching is a phenomenon in which two or more languages are used in the same message. | ||
| D18-1347 *****Code-*****switching, the use of more than one language within a single utterance, is ubiquitous in much of the world, but remains a challenge for NLP largely due to the lack of representative data for training models. | ||
| 2021.dravidianlangtech-1.9 It is an important task for social media monitoring and has many applications, as a large chunk of social media data is *****Code-*****Mixed. | ||
| 2020.coling-main.163 *****Code-*****switching has long interested linguists, with computational work in particular focusing on speech and social media data (Sitaram et al., 2019). | ||
| CQA | 30 | |
| D17-1089 ***** CQA ***** retrieval enables usage of historical ***** CQA ***** archives to solve new questions posed by users. | ||
| P18-1162 Answer selection is an important subtask of community question answering (***** CQA *****). | ||
| W16-4405 Community question answering (***** CQA *****) systems such as Yahoo! | ||
| 2021.naacl-main.315 We propose a practical instant question answering (QA) system on product pages of e-commerce services, where for each user query, relevant community question answer (***** CQA *****) pairs are retrieved. | ||
| D19-1171 Finally, we show that weak supervision with question title and body information is also an effective method to train ***** CQA ***** answer selection models without direct answer supervision | ||
| misspellings | 30 | |
| 2020.sltu-1.41 We corrected ***** misspellings ***** and distinguished English loan words to be integrated in our dictionary from instances of code switching. | ||
| W18-4514 The presence of ***** misspellings ***** and other errors or non-standard word forms poses a considerable challenge for NLP systems. | ||
| L14-1098 The corpus contains many unique characteristics such as emoticons, common mobile ***** misspellings *****, and images associated with many of the questions. | ||
| L14-1075 We put emphasis on two statistical methods to lexicon extension and adjustment: in terms of a letter-based HMM and in terms of a detector of spelling variants and ***** misspellings *****. | ||
| N19-1326 In this paper we present a method to learn word embeddings that are resilient to ***** misspellings ***** | ||
| Whereas | 30 | |
| D19-5102 ***** Whereas ***** previous weakly supervised approaches required a knowledge-base of such events, or corresponding financial figures, our approach requires no such additional data, and can be employed to extract economic events related to companies which are not even mentioned in the training data. | ||
| 2021.mwe-1.3 ***** Whereas ***** native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics, there is still ample scope for improvement regarding computational approaches. | ||
| W16-4124 ***** Whereas ***** some ansatzes of this kind were proposed in previous research papers, here we introduce a stretched exponential extrapolation function that has a smaller error of fit. | ||
| 2020.lrec-1.353 ***** Whereas ***** part of the errors can be explained by phonetic variation, the recording mismatch poses a major problem. | ||
| J18-4002 ***** Whereas ***** there is ample support for searching resources using metadata-based search, or full-text search, or for aggregating resources into virtual collections, there is little support for users to help them process resources in one way or another | ||
| pointwise | 30 | |
| 2020.conll-1.18 We additionally report an inability of three measures of processing difficulty — entropy-based UID, surprisal-based UID, and ***** pointwise ***** mutual information — to correctly predict the correct typological distribution, using transitive constructions from 20 languages in the Universal Dependencies project (version 2.5). | ||
| D18-1454 We next introduce a novel system for selecting grounded multi-hop relational commonsense information from ConceptNet via a ***** pointwise ***** mutual information and term-frequency based scoring function. | ||
| W19-4115 Specifically, using positive ***** pointwise ***** mutual information, it first identifies keywords that frequently co-occur in responses given an utterance. | ||
| L14-1166 The evaluation experiment using a synonym identification task demonstrates that the kanji-based DSM achieves the best performance when a kanji-kanji matrix is weighted by positive ***** pointwise ***** mutual information and word vectors are composed by weighted multiplication. | ||
| D17-1093 In this work, we propose a new hybrid approach combining preference ranking applied to TKs and ***** pointwise ***** ranking applied to CNNs | ||
| Prolog | 30 | |
| 1995.iwpt-1.9 In this paper, we describe the fundamental data structures and compilation techniques that we have employed to develop a unification and constraint-resolution engine capable of performance rivaling that of directly compiled ***** Prolog ***** terms while greatly exceeding ***** Prolog ***** in flexibility, expressiveness and modularity. | ||
| 2021.acl-long.213 To address this challenge, we decompose statutory reasoning into four types of language-understanding challenge problems, through the introduction of concepts and structure found in ***** Prolog ***** programs. | ||
| 2003.mtsummit-semit.10 Both the Arabic parser and the Arabic morphological analyzer are implemented in ***** Prolog *****. | ||
| 1991.iwpt-1.8 At first there is a brief description of the grammar implemented in ***** Prolog ***** using XGs (extraposition grammars) introduced by Pereira (1981;1983) | ||
| 1993.iwpt-1.5 The unification-based approach to processing attribute-value logic grammars, similar to ***** Prolog ***** interpretation | ||
| topological | 30 | |
| 2020.insights-1.11 We show that graph embeddings are modestly complementary with text embeddings, but the low performance of graph embedding features alone indicate that the model fails to capture ***** topological ***** features pertinent of the topic prediction task. | ||
| W19-0605 In this paper we present new results on applying ***** topological ***** data analysis to discourse structures. | ||
| 2020.knlp-1.1 The CKG combines semantic information with document ***** topological ***** information for the application of similar document retrieval. | ||
| 2021.eacl-main.1 To delineate such neighbourhoods we experiment with several set-distance metrics, including the recently proposed Word Mover's distance, while the fixed-dimensional projection is achieved by employing a scalable and efficient manifold approximation method rooted in ***** topological ***** data analysis. | ||
| 2021.inlg-1.13 Compared to the previous approach of using ***** topological ***** sorting, our proposed technique gracefully handles the presence of cycles and is more expressive since it takes into account real-valued constraint/edge scores rather than just the presence/absence of edges | ||
| persuasiveness | 30 | |
| 2020.peoples-1.4 In our work, we dig deeper into the ***** persuasiveness ***** of both content and style, exploring the role of the intensity of an ideology (lean vs.extreme) and the reader's personality traits (agreeableness, conscientiousness, extraversion, neuroticism, and openness). | ||
| P18-1058 This corpus could trigger the development of novel computational models concerning argument ***** persuasiveness ***** that provide useful feedback to students on why their arguments are (un)persuasive in addition to how persuasive they are. | ||
| I17-1060 Key to our approach is the novel hypothesis that lightly-supervised ***** persuasiveness ***** scoring is possible by explicitly modeling the major errors that negatively impact ***** persuasiveness *****. | ||
| 2020.acl-main.632 While clearly relevant to the task, the personal characteristics of an argument's source and audience have not yet been fully exploited toward automated ***** persuasiveness ***** prediction. | ||
| 2020.nlpcss-1.12 Anonymity influences social interactions in online communities in these many ways, which can lead to influences on opinion change and the ***** persuasiveness ***** of a message | ||
| investigates | 30 | |
| 2020.sigmorphon-1.24 This work ***** investigates ***** the most basic units that underlie contextualized word embeddings, such as BERT — the so-called word pieces. | ||
| N19-1393 Our work ***** investigates ***** the use of high-level language descriptions in the form of typological features for multilingual dependency parsing. | ||
| 2019.iwslt-1.22 This work ***** investigates ***** a simple data augmentation technique, SpecAugment, for end-to-end speech translation. | ||
| 2018.iwslt-1.24 This work ***** investigates ***** how a low-resource translation task can be improved within a multilingual setting. | ||
| 2020.acl-main.684 Our approach ***** investigates ***** a new direction for semantic parsing that models explaining a demonstration in a context, rather than mapping explanations to demonstrations | ||
| Support Vector | 30 | |
| L06-1272 The aligner uses a ***** Support Vector ***** Machine classifier to discriminate between positive and negative examples of sentence pairs. | ||
| 2020.wosp-1.3 The tool is built on a ***** Support Vector ***** Machine (SVM) model trained on a set of 7,058 manually annotated citation context sentences, curated from 34,000 papers from the ACL Anthology. | ||
| 2020.lrec-1.795 Six features computed from Fundamental Frequency (F0) contours are considered and two classifier models based on ***** Support Vector ***** Machine (SVM) & Deep Neural Network (DNN) are implemented for automatic tone recognition task respectively. | ||
| L10-1102 Our setup based on a ***** Support Vector ***** Machine (SVM) with linear kernel reaches a comparably poor performance of 58% accuracy, which can be attributed to an average utterance length of only 1.6 seconds. | ||
| S19-2170 We developed a ***** Support Vector ***** Machine (SVM) model that uses TF-IDF of tokens, Language Inquiry and Word Count (LIWC) features, and structural features such as number of paragraphs and hyperlink count in an article | ||
| TEI | 30 | |
| W16-4024 For this purpose an initial ***** TEI ***** profile has been formalised and tested as a use case to enable the semantical encoding of the resource `textbook'. | ||
| L12-1487 We develop conversion tools between Wiktionary and ***** TEI *****, using ISO standards (LMF, MAF), to make such resources available to both the Digital Humanities community and the Language Resources community. | ||
| L16-1450 The tool used for the conversion process might turn useful for bridging the gap between traditional digital humanities and modern NLP applications since the ***** TEI ***** original format is not usually suitable for being processed with standard NLP tools. | ||
| 2020.wac-1.5 The standard ***** TEI ***** format and Schema.org encoded metadata is used for the output format, but we stress that placing the corpus in a digital repository system is recommended in order to be able to define semantic relations between the segments and to add rich annotation. | ||
| L14-1356 Interoperability of language resources and tools in the federation of CLARIN Centers is ensured by adherence to ***** TEI ***** and ISO standards for text encoding, by the use of persistent identifiers, and by the observance of common protocols | ||
| template | 30 | |
| L08-1508 Speech utterances in British English are recorded and processed approaching the issue of command and control and ***** template ***** driven dialog systems on the motorcycle. | ||
| 2020.acl-tutorials.6 We will explore the approaches targeted at unstructured text that largely rely on learning syntactic or semantic textual patterns, approaches targeted at semi-structured documents that learn to identify structural patterns in the ***** template *****, and approaches targeting web tables which rely heavily on entity linking and type information. | ||
| N18-1003 This is achieved by using entity and ***** template ***** seeds jointly (as opposed to just one as in previous work), by expanding entities and ***** template *****s in parallel and in a mutually constraining fashion in each iteration and by introducing higherquality similarity measures for ***** template *****s. | ||
| 2021.dash-1.1 Our system generates knowledge graphs from the articles mentioned in the ***** template *****, which we then process using Wikidata and machine learning algorithms. | ||
| W18-6520 We demonstrate the feasibility of ***** template ***** generation for the IoT domain using our self-learning architecture | ||
| toxicity | 30 | |
| 2021.wnut-1.35 Through a case study on a publicly available ***** toxicity ***** detection model, we demonstrate that our method identifies salient groups of cross-geographic errors, and, in a follow up, demonstrate that these groupings reflect human judgments of offensive and inoffensive language in those geographic contexts. | ||
| 2021.semeval-1.115 We are particularly interested in the correlation between ***** toxicity ***** and the emotions expressed in online posts. | ||
| 2021.semeval-1.125 This task asks competitors to extract spans that have ***** toxicity ***** from the given texts, and we have done several analyses to understand its structure before doing experiments. | ||
| W19-3501 In this paper, we explore various aspects of sentiment detection and their correlation to ***** toxicity *****, and use our results to implement a ***** toxicity ***** detection tool. | ||
| 2021.woah-1.5 We find that an uncertainty-based strategy consistently outperforms the widely used strategy based on ***** toxicity ***** scores, and moreover that the choice of review strategy drastically changes the overall system performance | ||
| WebNLG | 30 | |
| 2021.eacl-main.64 On both the E2E and ***** WebNLG ***** benchmarks, we show that this weakly supervised training paradigm is able to outperform fully supervised sequence-to-sequence models with less than 10% of the training set. | ||
| 2020.acl-main.641 On both E2E and ***** WebNLG ***** benchmarks, we show the proposed model consistently outperforms its neural attention counterparts. | ||
| 2020.acl-main.136 It enjoys further performance boost when employing a pre-trained BERT encoder, outperforming the strongest baseline by 17.5 and 30.2 absolute gain in F1-score on two public datasets NYT and ***** WebNLG *****, respectively. | ||
| 2021.acl-srw.2 This new approach has significantly improved the performance of all text generation metrics for the English ***** WebNLG ***** 2017 dataset. | ||
| W19-8659 In the recent ***** WebNLG ***** challenge (the first comprehensive task addressing the mapping of RDF triples to text) FORGe ranked first with respect to the overall quality in human evaluation | ||
| optimal | 30 | |
| 2020.emnlp-main.238 We design two message-passing mechanisms to transfer knowledge between annotated and non-annotated data, named prior ***** optimal ***** transport and bi-directional lexicon update respectively. | ||
| 2020.inlg-1.37 Additionally, we experiment with using self-training and reverse model reranking to better handle train/test data mismatches, and find that while these methods help reduce content errors, it remains essential to include discourse relations in the input to obtain ***** optimal ***** performance. | ||
| 2021.eacl-main.219 At test time they typically employ beam search to avoid locally ***** optimal ***** but globally sub***** optimal ***** predictions. | ||
| 2020.findings-emnlp.130 Our work demonstrates that ***** optimal ***** balancing policies can significantly improve classifier performance, while augmenting just part of the classes and under-sampling others. | ||
| C18-1019 Despite individual users' differences in vocabulary knowledge, current systems do not consider these variations; rather, they are trained to find one ***** optimal ***** substitution or ranked list of substitutions for all users | ||
| cyberbullying | 30 | |
| 2020.figlang-1.10 It is helpful in the field of sentimental analysis and ***** cyberbullying *****. | ||
| 2021.acl-long.168 We then propose a context-aware and model-agnostic debiasing strategy that leverages a reinforcement learning technique, without requiring any extra resources or annotations apart from a pre-defined set of sensitive triggers commonly used for identifying ***** cyberbullying ***** instances. | ||
| 2020.lrec-1.430 Our work for both the English and the Danish language captures the type and targets of offensive language, and present automatic methods for detecting different kinds of offensive language such as hate speech and ***** cyberbullying *****. | ||
| 2020.trac-1.23 Due to its harmful effects on people, especially youth, it is imperative to detect ***** cyberbullying ***** as early as possible before it causes irreparable damages to victims. | ||
| W19-3511 We evaluate the classification module on a data set built on Instagram messages, and we describe the ***** cyberbullying ***** monitoring user interface | ||
| taxonomic | 30 | |
| 2020.law-1.15 When fluency data are annotated using this scheme, a relation between fluency and age emerges; this is in contrast to a strict implementation of the traditional method of annotating verbal fluency data, which has no way of dealing with score-confounding phenomena because it force-groups all verbal fluency productions (regardless of speaker intention) into one of three ***** taxonomic ***** groups (i.e. valid answers, perseverations, and intrusions). | ||
| 2020.coling-main.110 This performance improvement is observed both in overall accuracy and the weighted spread by true ***** taxonomic ***** depth. | ||
| 2021.naacl-main.373 We present a method for constructing ***** taxonomic ***** trees (e.g., WordNet) using pretrained language models. | ||
| P19-1313 Moreover – and in contrast with other methods – the hierarchical nature of hyperbolic space allows us to learn highly efficient representations and to improve the ***** taxonomic ***** consistency of the inferred hierarchies. | ||
| L10-1324 In this paper, we demonstrate how it is possible to extract ***** taxonomic ***** information without any analysis of the specific text, by comparing the same lexical entry in a number of different dictionaries. | ||
| Indigenous | 30 | |
| 2020.coling-main.313 After generations of exploitation, ***** Indigenous ***** people often respond negatively to the idea that their languages are data ready for the taking. | ||
| C18-1222 In this article, we discuss which text, speech, and image technologies have been developed, and would be feasible to develop, for the approximately 60 ***** Indigenous ***** languages spoken in Canada. | ||
| C18-1006 ***** Indigenous ***** languages of the American continent are highly diverse | ||
| 2020.lrec-1.333 Mi'kmaq is an *****Indigenous***** language spoken primarily in Eastern Canada. | ||
| 2020.lrec-1.307 We introduce the first attempt at automatic speech recognition (ASR) in Inuktitut, as a representative for polysynthetic, low-resource languages, like many of the 900 *****Indigenous***** languages spoken in the Americas. | ||
| Concept | 30 | |
| D17-1320 *****Concept***** maps can be used to concisely represent important information and bring structure into large document collections. | ||
| W18-5047 *****Concept***** definition is important in language understanding (LU) adaptation since literal definition difference can easily lead to data sparsity even if different data sets are actually semantically correlated. | ||
| 2020.acl-main.717 *****Concept***** graphs are created as universal taxonomies for text understanding in the open-domain knowledge. | ||
| 2020.acl-main.748 *****Concept***** normalization, the task of linking textual mentions of concepts to concepts in an ontology, is challenging because ontologies are large. | ||
| D19-5414 *****Concept***** maps are visual summaries, structured as directed graphs: important concepts from a dataset are displayed as vertexes, and edges between vertexes show natural language descriptions of the relationships between the concepts on the map. | ||
| Mandarin | 30 | |
| 2021.emnlp-main.454 We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and Transformer-parameterized generative parsing models on two ***** Mandarin ***** Chinese datasets of different sizes. | ||
| 2021.rocling-1.31 This paper thus aimed to focus on hidden advertorial detection of online posts in Taiwan ***** Mandarin ***** Chinese. | ||
| L14-1454 The present tools have been tested in a number of studies on English, ***** Mandarin ***** and Polish, and are introduced here with reference to results from these studies | ||
| W16-5403 This article proposes a Universal Dependency Annotation Scheme for *****Mandarin***** Chinese, including POS tags and dependency analysis. | ||
| 2020.lrec-1.3 This article introduces Mandarinograd, a corpus of Winograd Schemas in *****Mandarin***** Chinese. | ||
| numerical | 30 | |
| 2021.acl-long.241 We show that while state-of-the-art transformer models perform very well for small databases, they exhibit limitations in processing noisy data, ***** numerical ***** operations, and queries that aggregate facts. | ||
| W18-5040 We introduce an unsupervised learning method on text coherence that could produce ***** numerical ***** representations that improve implicit discourse relation recognition in a semi-supervised manner. | ||
| 2021.acl-long.455 In recent years, math word problem solving has received considerable attention and achieved promising results, but previous methods rarely take ***** numerical ***** values into consideration. | ||
| 2021.emnlp-main.563 While the previous models for ***** numerical ***** MRC are able to interpolate the learned ***** numerical ***** reasoning capabilities, it is not clear whether they can perform just as well on numbers unseen in the training dataset. | ||
| 2021.eacl-main.267 We introduce a new information extraction task, metric-type identification from multi-level header ***** numerical ***** tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. | ||
| challenging | 30 | |
| 2021.emnlp-main.144 Experimental results show that our proposed method achieves state-of-the-art performance on three ***** challenging ***** intent detection datasets under 5-shot and 10-shot settings. | ||
| 2020.coling-main.478 Argumentation mining on essays is a new ***** challenging ***** task in natural language processing, which aims to identify the types and locations of argumentation components. | ||
| L16-1200 The tremendous amount of data exchanged on these platforms as well as the specific form of language adopted by social media users constitute a new ***** challenging ***** context for existing argument mining techniques. | ||
| N18-1175 On this basis, we present a new ***** challenging ***** task, the argument reasoning comprehension task. | ||
| 2020.acl-main.116 This paper presents a new ***** challenging ***** information extraction task in the domain of materials science | ||
| posterior regularization | 30 | |
| D19-1103 We propose new algorithms that adapt two techniques, Lagrangian relaxation and ***** posterior regularization *****, to conduct inference with corpus-statistics constraints. | ||
| 2020.acl-main.264 We further propose a bias mitigation approach based on ***** posterior regularization *****. | ||
| P17-1139 In this work, we propose to use ***** posterior regularization ***** to provide a general framework for integrating prior knowledge into neural machine translation. | ||
| N19-1290 This paper adopts ***** posterior regularization ***** (PR) to integrate some domain-specific rules in instance selection using REINFORCE | ||
| 2021.eacl-main.285 In this paper, we propose a ***** posterior regularization ***** framework for the variational approach to the weakly supervised sentiment analysis to better control the posterior distribution of the label assignment. | ||
| knowledge base population | 30 | |
| C18-2002 We introduce INCEpTION, a new annotation platform for tasks including interactive and semantic annotation (e.g., concept linking, fact linking, ***** knowledge base population *****, semantic frame annotation). | ||
| N19-1327 We evaluate our model in a one-shot learning task by showing a promising generalization capability in order to classify unseen relation types, which makes this approach suitable to perform automatic ***** knowledge base population ***** with minimal supervision. | ||
| P19-1382 Our approach outperforms baselines previously used for this problem, as well as a strong baseline from ***** knowledge base population *****. | ||
| P17-1038 However, hand-labeled training data is expensive to produce, in low coverage of event types, and limited in size, which makes supervised methods hard to extract large scale of events for ***** knowledge base population *****. | ||
| N19-1069 The automatic detection of satire vs. regular news is relevant for downstream applications (for instance, ***** knowledge base population *****) and to improve the understanding of linguistic characteristics of satire. | ||
| theory | 30 | |
| 2020.pam-1.2 We use this model to develop a socio-semantic ***** theory ***** of conventionalised reasoning patterns, known as topoi. | ||
| 2021.hackashop-1.19 In the 2021 Embeddia Hackathon, we implemented one novel, normative ***** theory *****-based evaluation metric, “activation”, and use it to compare two recommendation strategies of New York Times comments, one based on user likes and another on editor picks. | ||
| 2015.lilt-10.4 In this paper we offer a probabilistic formulation of a rich type ***** theory *****, Type Theory with Records (TTR), and we illustrate how this framework can be used to approach the problem of semantic learning. | ||
| 2021.eacl-main.179 First, inspired by the ***** theory ***** of impoliteness, we propose a novel task of detecting a subtler form of abuse, namely unpalatable questions. | ||
| W19-1104 We use monads from category ***** theory ***** in order to `upgrade' an ordinary intensional semantics to a possible hyperintensional counterpart. | ||
| sense induction | 30 | |
| D18-1025 This paper proposes a modularized ***** sense induction ***** and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space, where the cross-lingual signal in the English-Chinese parallel corpus is exploited to capture the collocation and distributed characteristics in the language pair. | ||
| E17-1009 On the example of word ***** sense induction ***** and disambiguation (WSID), we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy. | ||
| 2020.semeval-1.29 This paper presents an approach to lexical semantic change detection based on Bayesian word ***** sense induction ***** suitable for novel word sense identification. | ||
| 2020.coling-main.107 generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word ***** sense induction ***** and disambiguation, lexical relation extraction, data augmentation, etc. | ||
| D18-1174 In experimental evaluation disambiguated skip-gram improves state-of-the are results in several word ***** sense induction ***** benchmarks. | ||
| comments | 30 | |
| 2021.hackashop-1.19 In the 2021 Embeddia Hackathon, we implemented one novel, normative theory-based evaluation metric, “activation”, and use it to compare two recommendation strategies of New York Times ***** comments *****, one based on user likes and another on editor picks. | ||
| 2021.cinlp-1.8 We use propensity score stratification, a causal inference method for observational data, and estimate whether the amount of ***** comments ***** —as a measure of social support— increases or decreases the likelihood of posting again on SW. One hypothesis is that receiving more ***** comments ***** may decrease the likelihood of the user posting in SW in the future, either by reducing symptoms or because ***** comments ***** from untrained peers may be harmful. | ||
| 2020.coling-main.259 The automatic generation of music ***** comments ***** is of great significance for increasing the popularity of music and the music platform's activity. | ||
| 2021.hackashop-1.13 The research on the summarization of user ***** comments ***** is still in its infancy, and human-created summarization datasets are scarce, especially for less-resourced languages. | ||
| W19-8666 In this paper, we report on the results of the TL;DR challenge, discussing an extensive manual evaluation of the expected properties of a good summary based on analyzing the ***** comments ***** provided by human annotators. | ||
| automatic term extraction | 30 | |
| W16-4703 In the paper, we address the problem of recognition of non-domain phrases in terminology lists obtained with an ***** automatic term extraction ***** tool. | ||
| 2020.computerm-1.12 The TermEval 2020 shared task provided a platform for researchers to work on ***** automatic term extraction ***** (ATE) with the same dataset: the Annotated Corpora for Term Extraction Research (ACTER). | ||
| L12-1426 The starting point was the ***** automatic term extraction ***** from a corpus of web documents concerning the domain of interest (304,000 words); as regards corpus construction, we describe the main criteria of the web documents selection and its critical points, concerning the definition of user profile and of degrees of specialisation. | ||
| 2020.computerm-1.14 This paper describes RACAI's ***** automatic term extraction ***** system, which participated in the TermEval 2020 shared task on English monolingual term extraction. | ||
| L12-1532 In this paper we argue that the *****automatic term extraction***** procedure is an inherently multifactor process and the term extraction models need to be based on multiple features, including a specific type of terminological resource under development. | ||
| target word | 30 | |
| 2020.semeval-1.30 It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and orthogonal transformation;and measuring the cosines between the transformed vector for the ***** target word ***** from the earlier corpus and the vector for the ***** target word ***** in the later corpus. | ||
| D19-5604 At test time, given multiple documents, the Distribute step of our MSQG model predicts ***** target word ***** distributions for each document using the trained model. | ||
| 2020.emnlp-main.738 It firstly estimates the input data's supportiveness for each ***** target word ***** with an estimator and then applies a supportiveness adaptor and a rebalanced beam search to harness the over-generation problem in the training and generation phases respectively. | ||
| 2021.emnlp-main.267 To reduce the negative impact of noises, we propose a self-supervised method for both sentence- and word-level QE, which performs quality estimation by recovering the masked ***** target word *****s. | ||
| L12-1595 We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the source text according to the ***** target word ***** order suggested by an initial word alignment. | ||
| sentence level | 30 | |
| W18-6223 Deep Learning architectures such as LSTMs, CNNs, and RNNs show promise in ***** sentence level ***** classification problems. | ||
| L04-1250 We present a supervised method for training a ***** sentence level ***** confidence measure on translation output using a human-annotated corpus. | ||
| D19-1398 In this paper, we propose an easy first approach for relation extraction with information redundancies, embedded in the results produced by local ***** sentence level ***** extractors, during which conflict decisions are resolved with domain and uniqueness constraints. | ||
| P17-1144 Drafts are manually aligned at the ***** sentence level *****, and the writer's purpose for each revision is annotated with categories analogous to those used in argument mining and discourse analysis. | ||
| 2020.lrec-1.158 We release a corpus of high quality sentences and parse trees with these two types of labels on ***** sentence level *****. | ||
| technology | 30 | |
| L12-1309 The RIDIRE-CPI user-friendly interface is specifically intended for allowing collaborative work performance by users with low skills in web ***** technology ***** and text processing. | ||
| L16-1526 For this, we use the WebLicht language ***** technology ***** infrastructure. | ||
| L08-1579 Recently the LATL has undertaken the development of a multilingual translation system based on a symbolic parsing ***** technology ***** and on a transfer-based translation model. | ||
| 2020.coling-main.579 Large text corpora are increasingly important for a wide variety of Natural Language Processing (NLP) tasks, and automatic language identification (LangID) is a core ***** technology ***** needed to collect such datasets in a multilingual context. | ||
| 1999.mtsummit-1.85 Most importantly, the objectives the ***** technology *****'s use is expected to accomplish must be known, the objectives must be expressed as tasks that accomplish the objectives, and then successful outcomes defined for the tasks. | ||
| language documentation | 30 | |
| L06-1507 However for the much more restricted domain of ***** language documentation ***** such a category system might still prove reasonable if not indispensable. | ||
| 2020.lrec-1.353 On the long term, ASR4LD will not only be an integral part of the ongoing documentation project of Muyu, but will be further developed in order to facilitate also the ***** language documentation ***** process of other language groups. | ||
| 2020.emnlp-main.422 Creating a descriptive grammar of a language is an indispensable step for ***** language documentation ***** and preservation. | ||
| 2020.lrec-1.324 Natural speech data on many languages have been collected by ***** language documentation ***** projects aiming to preserve linguistic and cultural traditions in audiovisual records. | ||
| 2020.coling-main.471 Interlinear Glossed Text (IGT) is a widely used format for encoding linguistic information in *****language documentation***** projects and scholarly papers. | ||
| problems | 30 | |
| D19-1212 Multi-view learning algorithms are powerful representation learning tools, often exploited in the context of multimodal ***** problems *****. | ||
| D17-1084 It first retrieves a few relevant equation system templates and aligns numbers in math word ***** problems ***** to those templates for candidate equation generation. | ||
| 2021.acl-long.336 However, the existing methods fail to solve these two ***** problems ***** at the same time, which leads to unsatisfactory results. | ||
| L06-1294 While results are often encouraging, the paper also highlights evident ***** problems ***** and drawbacks with the method, and outlines suggestions for future work. | ||
| 2020.emnlp-main.203 These ***** problems ***** can be partially overcome by incorporating a segmentation into tokens in the model. | ||
| electronic health | 30 | |
| 2020.lrec-1.547 Multiple efforts have been done to protect the integrity of patients while making ***** electronic health ***** records usable for research by removing personally identifiable information in patient records. | ||
| W19-1915 Recently natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in ***** electronic health ***** records (EHRs). | ||
| 2020.lrec-1.714 However, retrieving eligible patients for a trial from the ***** electronic health ***** record (EHR) database remains a challenging task for clinicians since it requires not only medical knowledge about eligibility criteria, but also an adequate understanding of structured query language (SQL). | ||
| W19-5003 This paper proposes a dataset and method for automatically generating paraphrases for clinical questions relating to patient-specific information in ***** electronic health ***** records (EHRs). | ||
| 2021.naacl-main.318 Given the clinical notes written in ***** electronic health ***** records (EHRs), it is challenging to predict the diagnostic codes which is formulated as a multi-label classification task. | ||
| article | 30 | |
| P17-1028 We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news ***** article *****s and Information Extraction from biomedical abstracts. | ||
| 2020.lrec-1.641 The texts come from different sources: daily newspaper ***** article *****s, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-standard language segments typed into a web translator. | ||
| L16-1453 In the automatic alignments of parallel corpora, most of the p***** article *****s align to NULL. | ||
| D19-1664 We first produce a new dataset, BASIL, of 300 news ***** article *****s annotated with 1,727 bias spans and find evidence that informational bias appears in news ***** article *****s more frequently than lexical bias. | ||
| L14-1596 This ***** article ***** presents the methods, results, and precision of the syntactic annotation process of the Rhapsodie Treebank of spoken French. | ||
| propaganda | 30 | |
| 2021.wnut-1.15 We observe a high surge in information and ***** propaganda ***** flow during election campaigning. | ||
| D19-5023 In this paper, we provide additional analysis regarding our method of detecting spans of ***** propaganda ***** with synthetically generated representations. | ||
| 2020.emnlp-main.320 Instead of merely learning from input-output datapoints in training data, we introduce an approach to inject declarative knowledge of fine-grained ***** propaganda ***** techniques. | ||
| D19-5016 Digital content production technologies with logical fallacies and emotional language can be used as ***** propaganda ***** techniques to gain more readers or mislead the audience. | ||
| 2020.semeval-1.245 The article describes a fast solution to *****propaganda***** detection at SemEval-2020 Task 11, based on feature adjustment. | ||
| temporal relation | 30 | |
| 2020.emnlp-main.461 Extracting event ***** temporal relation *****s is a critical task for information extraction and plays an important role in natural language understanding. | ||
| S17-2093 11 teams participated in the tasks, with the best systems achieving F1 scores above 0.55 for time expressions, above 0.70 for event expressions, and above 0.40 for ***** temporal relation *****s. | ||
| L14-1404 Third, every text in the corpus has been annotated for 14 layers of syntax and semantics, including: referring expressions and co-reference; events, time expressions, and ***** temporal relation *****ships; semantic roles; and word senses. | ||
| D17-1190 We present a sequential model for ***** temporal relation ***** classification between intra-sentence events. | ||
| 2020.coling-main.453 Determining whether an event in a news article is a foreground or background event would be useful in many natural language processing tasks, for example, *****temporal relation***** extraction, summarization, or storyline generation. | ||
| automatic keyphrase extraction | 30 | |
| W16-3917 The SemEval-2010 benchmark dataset has brought renewed attention to the task of ***** automatic keyphrase extraction *****. | ||
| L16-1304 The output keyphrases of ***** automatic keyphrase extraction ***** methods for test documents are typically evaluated by comparing them to manually assigned reference keyphrases. | ||
| 2020.coling-main.184 In this paper, we introduce two advancements in the ***** automatic keyphrase extraction ***** (AKE) space - KeyGames and pke+. | ||
| C16-1077 We found that our approach has a slightly positive impact on the performance of *****automatic keyphrase extraction*****, in particular when considering the ranking of the results. | ||
| P19-1588 This paper studies *****automatic keyphrase extraction***** on social media. | ||
| contextualized word | 30 | |
| 2020.coling-main.338 To deal with this problem, we propose to improve the ***** contextualized word ***** representations via adversarial learning and fine-tuning BERT processes. | ||
| 2020.acl-demos.41 We fine-tune the ***** contextualized word ***** representations of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions. | ||
| 2020.cogalex-1.9 We also experiment with a non-***** contextualized word ***** embedding baseline, in this case word2Vec (Mikolov et al., 2013) and compare its performance with that of CWEs. | ||
| 2020.mwe-1.20 We encode tokens as feature vectors combining multilingual ***** contextualized word ***** embeddings provided by the XLM-RoBERTa language model with a more traditional linguistic feature set relying on context windows and dependency relations. | ||
| 2021.sdp-1.9 We also investigate how intermediate pretraining interacts with ***** contextualized word ***** embeddings trained on different domains. | ||
| automatic readability assessment | 30 | |
| W18-3703 Advances in ***** automatic readability assessment ***** can impact the way people consume information in a number of domains. | ||
| 2021.acl-long.235 Deep learning models for ***** automatic readability assessment ***** generally discard linguistic features traditionally used in machine learning models for the task. | ||
| 2020.lrec-1.404 In this paper, we present a corpus for use in *****automatic readability assessment***** and automatic text simplification for German, the first of its kind for this language. | ||
| W19-4437 *****Automatic readability assessment***** aims to ensure that readers read texts that they can comprehend. | ||
| 2021.eval4nlp-1.7 *****Automatic readability assessment***** (ARA) is the task of automatically assessing readability with little or no human supervision. | ||
| description generation | 30 | |
| I17-2032 Specifically, after extracting templates and learning writing knowledge from attribute-description parallel data, we use the learned knowledge to decide what to say and how to say for product ***** description generation *****. | ||
| W18-6513 To this end, we propose an architecture that uses the source code-docstring relationship to guide the ***** description generation *****. | ||
| 2020.coling-main.182 Digging the relationship of concepts from scratch is non-trivial, therefore, we retrieve prototypes from external knowledge to assist the understanding of the scenario for better ***** description generation *****. | ||
| C16-1005 Automatic video ***** description generation ***** has recently been getting attention after rapid advancement in image caption generation. | ||
| 2020.emnlp-main.377 In particular, we propose the first approach to image ***** description generation ***** where visual processing is modelled sequentially. | ||
| conversation disentanglement | 30 | |
| 2020.emnlp-main.512 In this work, we propose an end-to-end online framework for ***** conversation disentanglement ***** that avoids time-consuming domain-specific feature engineering. | ||
| 2021.emnlp-main.181 In this work, we explore training a ***** conversation disentanglement ***** model without referencing any human annotations. | ||
| 2021.starsem-1.14 In this paper, we apply DAG-LSTMs to the ***** conversation disentanglement ***** task. | ||
| N18-1164 In this paper, we propose to leverage representation learning for ***** conversation disentanglement *****. | ||
| P19-1374 Our manually-annotated data presents an opportunity to develop robust data-driven methods for ***** conversation disentanglement *****, which will help advance dialogue research. | ||
| elastic weight consolidation | 30 | |
| 2020.emnlp-main.394 We find that ***** elastic weight consolidation ***** provides best overall scores yielding only a 0.33% drop in performance across seven generic tasks while remaining competitive in bio-medical tasks. | ||
| 2021.eacl-main.82 In this paper, we show that ***** elastic weight consolidation ***** (EWC) allows fine-tuning of models to mitigate biases while being less susceptible to catastrophic forgetting. | ||
| W19-5340 Two techniques provide the fabric of the Cambridge University Engineering Department's (CUED) entry to the WMT19 evaluation campaign: ***** elastic weight consolidation ***** (EWC) and different forms of language modelling (LMs). | ||
| 2020.acl-main.690 During adaptation we show that *****Elastic Weight Consolidation***** allows a performance trade-off between general translation quality and bias reduction. | ||
| 2021.emnlp-main.666 We find that we can improve over the performance trade-off offered by *****Elastic Weight Consolidation***** with a relatively simple data mixing strategy. | ||
| contextual word embedding | 30 | |
| 2020.coling-main.109 The stellar success of *****contextual word embedding***** models such as BERT in NLP tasks has led many to question whether these models have learned linguistic information, but up till now, most research has focused on syntactic information. | ||
| 2020.wanlp-1.11 In this work, we propose two embedding strategies that modify the tokenization phase of traditional word embedding models (Word2Vec) and *****contextual word embedding***** models (BERT) to take into account Arabic's relatively complex morphology. | ||
| 2020.nlposs-1.5 Non-*****contextual word embedding***** models have been shown to inherit human-like stereotypical biases of gender, race and religion from the training corpora. | ||
| W19-2607 We examine three common neural architectures in NLP: 1) convolutional neural network, 2) multi-layer perceptron (both applied in a sliding window context) and 3) bidirectional LSTM and apply contextual and non-*****contextual word embedding***** layers to these models. | ||
| 2020.semeval-1.290 In this work, we introduce a cross-lingual inductive approach to identify the offensive language in tweets using the *****contextual word embedding***** XLM-RoBERTa (XLM-R). | ||
| paragraph vector | 30 | |
| E17-2073 The results show that DV-LSTM significantly outperforms TF-IDF vector and *****paragraph vector***** (PV-DM) in most cases, and their combinations may further improve the classification performance. | ||
| W17-2615 Recently Le & Mikolov described two log-linear models, called *****Paragraph Vector*****, that can be used to learn state-of-the-art distributed representations of documents. | ||
| S17-2024 The proposed method makes use of *****Paragraph Vector***** for assessing the semantic similarity between pairs of sentences. | ||
| D17-1069 We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional *****paragraph vector***** representations that are widely used for text representation. | ||
| L16-1662 MultiVec includes word2vec's features, *****paragraph vector***** (batch and online) and bivec for bilingual distributed representations. | ||
| event factuality | 30 | |
| N18-1067 We present two neural models for *****event factuality***** prediction, which yield significant performance gains over previous models on three *****event factuality***** datasets: FactBank, UW, and MEANTIME. | ||
| L16-1699 The “First CLIN Dutch Shared Task” at CLIN26 was based on the Dutch section, while the EVALITA 2016 FactA (*****Event Factuality***** Annotation) shared task, based on the Italian section, is currently being organized. | ||
| 2021.wnut-1.6 The goal of *****Event Factuality***** Prediction (EFP) is to determine the factual degree of an event mention, representing how likely the event mention has happened in text. | ||
| N19-1287 Document-level *****event factuality***** identification is an important subtask in *****event factuality***** and is crucial for discourse understanding in Natural Language Processing (NLP). | ||
| P19-1412 Inferring speaker commitment (aka *****event factuality*****) is crucial for information extraction and question answering. | ||
| formal | 30 | |
| K18-1049 Tree-structured neural network architectures for sentence encoding draw inspiration from the approach to semantic composition generally seen in *****formal***** linguistics, and have shown empirical improvements over comparable sequence models by doing so. | ||
| 2020.pam-1.3 The following paper presents a *****formal***** model for the description of dogwhistles. | ||
| 2020.pam-1.8 We present a *****formal***** semantics (a version of Type Theory with Records) which places classifiers of perceptual information at the core of semantics. | ||
| 2021.acl-long.292 Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing *****formal***** languages with hierarchical structure, such as Dyck-k, the language consisting of well-nested parentheses of k types. | ||
| Q18-1043 Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on *****formal***** semantics. | ||
| structural | 30 | |
| N19-1156 In the principles-and-parameters framework, the *****structural***** features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features. | ||
| 2020.lifelongnlp-1.1 However, for query selection, most of these studies mainly rely on uncertainty-based sampling, which generally does not exploit the *****structural***** information of the unlabeled data. | ||
| 2000.amta-papers.5 This paper describes an approach for handling *****structural***** divergences and recovering dropped arguments in an implemented Korean to English machine translation system. | ||
| 2020.coling-main.231 The *****structural***** information of Knowledge Bases (KBs) has proven effective to Question Answering (QA). | ||
| 2020.emnlp-main.375 Humans can learn *****structural***** properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. | ||
| divergence | 29 | |
| 2005.mtsummit-posters.4 We take Dorr's (1993, 1994) classification of translation ***** divergence ***** as the base to examine the different topics of translation ***** divergence ***** in Hindi and English. | ||
| 2020.coling-main.186 The ambiguous annotation criteria lead to ***** divergence ***** of Chinese Word Segmentation (CWS) datasets in various granularities. | ||
| 2002.amta-papers.4 In this paper, we introduce DUSTer, a method for systematically identifying common ***** divergence ***** types and transforming an English sentence structure to bear a closer resemblance to that of another language. | ||
| 2020.emnlp-main.327 These algorithms use KL-control to penalize ***** divergence ***** from a pre-trained prior language model, and use a new strategy to make the algorithm pessimistic, instead of optimistic, in the face of uncertainty. | ||
| W19-5921 Structured Fusion Networks are shown to have several valuable properties, including better domain generalizability, improved performance in reduced data scenarios and robustness to ***** divergence ***** during reinforcement learning | ||
| subtitling | 29 | |
| 2020.eamt-1.53 ELITR (European Live Translator) project aims to create a speech translation system for simultaneous ***** subtitling ***** of conferences and online meetings targeting up to 43 languages. | ||
| 2021.mtsummit-asltrw.4 In an attempt to automatise the process, we aim at exploring the feasibility of simultaneous speech translation (SimulST) for live ***** subtitling *****. | ||
| 2020.lrec-1.790 The newest generation of speech technology caused a huge increase of audio-visual data nowadays being enhanced with orthographic transcripts such as in automatic ***** subtitling ***** in online platforms. | ||
| 2020.iwslt-1.30 However, in some use cases such as ***** subtitling *****, verbatim transcription would reduce output readability given limited screen size and reading time. | ||
| L14-1392 Finally, we present and discuss feedback from the subtitlers who participated in the evaluation, a key aspect for any eventual adoption of machine translation technology in professional ***** subtitling ***** | ||
| proficiency | 29 | |
| 2020.coling-main.77 First, the latent trait of examinee ***** proficiency ***** is measured using the scored MCQs and then a model is trained on the experimental SAQ responses as input, aiming to predict ***** proficiency ***** as its target variable. | ||
| N18-4009 We focus on the use of English grammatical morphemes across four ***** proficiency ***** levels. | ||
| W19-4445 This study aims to build an automatic system for the detection of plagiarized spoken responses in the context of an assessment of English speaking ***** proficiency ***** for non-native speakers. | ||
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation ***** proficiency ***** of the evaluators, and the provision of inter-sentential context. | ||
| 2021.naacl-main.351 In this paper, we investigate which aspects contribute to the notion of lexical complexity in various groups of readers, focusing on native and non-native speakers of English, and how the notion of complexity changes depending on the ***** proficiency ***** level of a non-native reader | ||
| paradigmatic | 29 | |
| W16-5320 Those relations include both ***** paradigmatic ***** relations, i.e. vertical relations, such as synonymy, antonymy and meronymy and syntagmatic relations, i.e. horizontal relations, such as objective qualification (legitimate demand), subjective qualification (fruitful analysis), positive evaluation (good review) and support verbs (pay a visit, subject to an interrogation). | ||
| L08-1581 And although the details of that linguistic structure vary from language to language, language universals such as context-free syntactic structure and the ***** paradigmatic ***** structure of inflectional morphology, allow us to learn the specific details of a minority language. | ||
| L10-1206 In addition, the co-based space with a larger context size yields better performance for the syntagmatic relation, while the co-based space with a smaller context size tends to show better performance for the ***** paradigmatic ***** relation. | ||
| L14-1408 In computation linguistics a combination of syntagmatic and ***** paradigmatic ***** features is often exploited. | ||
| L06-1392 In addition, it is argued that a ***** paradigmatic ***** reliability study should relate measures of inter-annotator agreement to independent assessments, such as significance tests of the annotated variables with respect to other phenomena | ||
| Concretely | 29 | |
| C16-1242 ***** Concretely *****, we investigate the effect of dialogue acts, speakers, gender, and text register on SMT quality when translating fictional dialogues. | ||
| D18-1177 ***** Concretely *****, on the tasks of assessing pairwise word similarity and image/caption retrieval, our approach attains equally competitive or stronger results when compared to other state-of-the-art multimodal models. | ||
| 2019.gwc-1.23 ***** Concretely *****, in the case of synonymous phrases, we try to link adverbial expressions which are a part of phrases to the adverbial synset in Japanese wordnet. | ||
| 2020.acl-main.477 ***** Concretely *****, we present a new task and corpus for learning alignments between machine and human preferences. | ||
| P19-2022 ***** Concretely *****, we incorporate informativeness in a previously proposed model of nonce learning, using it for context selection and learning rate modulation | ||
| disfluent | 29 | |
| 2020.iwslt-1.21 We focus on the translation ***** disfluent ***** speech transcripts that include ASR errors and non-grammatical utterances. | ||
| D17-1287 In particular, our detector achieves performance above 0.70 F1 across a variety of combinations of lexically different corpora for training and testing, as well as dramatic improvements (up to 4,000%) in performance when trained on a small, ***** disfluent ***** data set. | ||
| 2021.sigdial-1.22 Analysis on sentence embeddings of ***** disfluent ***** and fluent sentence pairs reveals that the deeper the layer, the more similar their representation (exp2). | ||
| D19-5408 They are also better suited than individual words and phrases that can potentially lead to ***** disfluent *****, fragmented summaries. | ||
| 2021.eacl-main.150 In this paper, we investigate semantic parsing of ***** disfluent ***** speech with the ATIS dataset | ||
| distractor | 29 | |
| L10-1170 Such algorithms are computational models that automatically generate referring expressions by computing how a specific target can be identified to an addressee by distinguishing it from a set of ***** distractor ***** objects. | ||
| 2020.coling-main.189 In this paper, we propose a question and answer guided ***** distractor ***** generation (EDGE) framework to automate ***** distractor ***** generation. | ||
| 2020.bea-1.10 Simple features of the ***** distractor ***** and correct answer correlate with the annotations, though we find substantial benefit to additionally using large-scale pretrained models to measure the fit of the ***** distractor ***** in the context. | ||
| 2021.cmcl-1.6 Specifically, we show that surprisal of the verb or reflexive pronoun predicts facilitatory interference effects in ungrammatical sentences, where a ***** distractor ***** noun that matches in number with the verb or pronouns leads to faster reading times, despite the ***** distractor ***** not participating in the agreement relation. | ||
| 2020.emnlp-main.65 We further extend the framework by learning the ***** distractor ***** selection, which has been usually done manually or randomly | ||
| BERT embeddings | 29 | |
| P19-1506 The advantage for training with visual context when testing without is robust across different languages (English, German and Spanish) and different models (GRU, LSTM, Delta-RNN, as well as those that use ***** BERT embeddings *****). | ||
| 2021.wassa-1.7 Using multilingual ***** BERT embeddings *****, we show that emotions can be reliably inferred both within and across languages. | ||
| 2021.ranlp-1.69 We also show that the general information encoded in ***** BERT embeddings ***** can be used as a substitute feature set for low-resource languages like Filipino with limited semantic and syntactic NLP tools to explicitly extract feature values for the task. | ||
| 2020.cogalex-1.8 The system presented here works by employing ***** BERT embeddings ***** of the words and passing the same over tuned neural network to produce a learning model for the pair of words and their relationships. | ||
| 2020.starsem-1.12 We take this finding as evidence that ***** BERT embeddings ***** might be better representations of context than encodings of word meaning | ||
| retrieve | 29 | |
| 2021.naacl-industry.23 Conceptually, to answer a user question on a technical forum, a human expert has to first ***** retrieve ***** relevant documents, and then read them carefully to identify the answer snippet. | ||
| D19-1636 Our framework introduces incongruity into the literal input version through modules that: (a) filter factual content from the input opinion, (b) ***** retrieve ***** incongruous phrases related to the filtered facts and (c) synthesize sarcastic text from the incongruous filtered and incongruous phrases. | ||
| W17-2615 This model can be used to rapidly ***** retrieve ***** a short list of highly relevant documents from a large document collection. | ||
| 2020.findings-emnlp.305 We instead collect training data with active learning, using a BERT-based embedding model to efficiently ***** retrieve ***** uncertain points from a very large pool of unlabeled utterance pairs. | ||
| 2020.findings-emnlp.332 The global pandemic has made it more important than ever to quickly and accurately ***** retrieve ***** relevant scientific literature for effective consumption by researchers in a wide range of fields | ||
| Wordnets | 29 | |
| 2019.gwc-1.49 ***** Wordnets ***** can be built for any language in GeoNames, we give results for those wordnets in the Open Multilingual Wordnet. | ||
| 2020.rail-1.2 The underlying data structure is open for monolingual, bilingual or multilingual dictionaries and also supports the connection to complex external resources like ***** Wordnets *****. | ||
| 2018.gwc-1.10 ***** Wordnets ***** are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involves large research groups for a long period of time, which are typically not available for under-resourced languages. | ||
| 2020.lrec-1.378 We digitize the cognate data from an Indian language cognate dictionary and utilize linked Indian language ***** Wordnets ***** to generate cognate sets | ||
| 2016.gwc-1.44 This paper presents our first attempt at verifying integrity constraints of our openWordnet-PT against the ontology for *****Wordnets***** encoding. | ||
| incorporation | 29 | |
| D17-2010 While DLATK provides standard NLP pipeline steps such as tokenization or SVM-classification, its novel strengths lie in analyses useful for psychological, health, and social science: (1) ***** incorporation ***** of extra-linguistic structured information, (2) specified levels and units of analysis (e.g. document, user, community), (3) statistical metrics for continuous outcomes, and (4) robust, proven, and accurate pipelines for social-scientific prediction problems. | ||
| N19-1182 We address this shortcoming by allowing the proper ***** incorporation ***** of global information into the GCN family of models through the use of scaled node weights. | ||
| 2020.acl-main.400 Previous works propose various ***** incorporation ***** methods, but most of them do not consider the relative importance of multiple modalities. | ||
| 2021.emnlp-main.173 To better cooperate with this framework, we devise a variant of Transformer with decoupled decoder which facilitates the disentangled learning of response generation and knowledge ***** incorporation *****. | ||
| L08-1201 In this paper we present a novel approach to the incremental ***** incorporation ***** of semantic information in natural language processing which does not fall victim to the notorious problems of ambiguity and lack of robustness, namely through the formal interpretation of semantic annotation | ||
| distinguishing | 29 | |
| N18-4008 In Igbo language, diacritics - orthographic and tonal - play a huge role in ***** distinguishing ***** the meaning and pronunciation of words. | ||
| L10-1397 By comparing the output of different parameter settings in our model to a data set of human-produced referring expressions, we determine that an approach to subsequent reference based on conceptual pacts provides a better explanation of our data than previously proposed algorithmic approaches which compute a new ***** distinguishing ***** description for the intended referent every time it is mentioned. | ||
| R17-1046 Given the constantly growing proliferation of false claims online in recent years, there has been also a growing research interest in automatically ***** distinguishing ***** false rumors from factually true claims. | ||
| 2021.smm4h-1.30 Identifying personal mentions of COVID19 symptoms requires ***** distinguishing ***** personal mentions from other mentions such as symptoms reported by others and references to news articles or other sources. | ||
| 2020.coling-main.522 The fine-tuning experiments show an accuracy of 80.16% when predicting the presence of non-literal translations in a sentence and an accuracy of 85.20% when ***** distinguishing ***** literal and non-literal translations at phrase level | ||
| classify | 29 | |
| W19-2507 Using these features, we ***** classify ***** almost all surviving classical Greek literature as prose or verse with 97% accuracy and F1 score, and further ***** classify ***** a selection of the verse texts into the traditional genres of epic and drama. | ||
| W18-6121 To ***** classify ***** Tweets in this task, this paper proposes a method to input and concatenate character and word sequences in Japanese Tweets by using convolutional neural networks. | ||
| D19-3001 First, it generates domain-specific aspect and opinion lexicons based on an unlabeled dataset; second, it enables the user to view and edit those lexicons (weak supervision); and finally, it enables the user to select an unlabeled target dataset from the same domain, ***** classify ***** it, and generate an aspect-based sentiment report. | ||
| W16-3701 We broadly ***** classify ***** the compounds into four different classes namely, Avyayībhäva, Tatpuruṣa, Bahuvrīhi and Dvandva. | ||
| S17-2019 Our representations allow to use large datasets in language pairs with many instances to better ***** classify ***** instances in smaller language pairs avoiding the necessity of translating into a single language | ||
| Literary | 29 | |
| J76-4001 ACL: New Officers for 1977, Call for Papers, Minutes of 1976 Business Meeting, Secretary-Treasurer's Report, Financial Report; Humanities - 3rd International Conference (J. S. North); Linguistics and ***** Literary ***** Analysis - 5th International (D. E. Ager); Graphics and Interactive Techiniques - 4th Annual (James E. George); Undergraduate Curricula and Computing Conference (Gerald L. Engel); | ||
| 2020.lrec-1.98 The histories are documented in Classical (***** Literary *****) Chinese in a corpus of over 20 million characters, suitable for the computational analysis of historical lexicon and semantic change. | ||
| J74-1001 NSF Sponsorship for AJCL (A. Hood Roberts); Microfiche Viewing Equipment Guide (Ronald F. Borden); ACL Officers 1975 (Aravind K. Joshi); ACL Program, July 26-27, 1974; Association for ***** Literary ***** and Linguistic Computing (R. A. Wisbey); Computer at MIT Can Read (Jonathan Allen); Computer-Assisted Lexicography - Bibliography (Richard W. Bailey); Current Bibliography (Brian Harris; R. Laskowski) | ||
| 2021.lchange-1.4 In this study, we have normalized and lemmatized an Old ***** Literary ***** Finnish corpus using a lemmatization model trained on texts from Agricola | ||
| 2015.lilt-12.6 *****Literary***** works are becoming increasingly available in electronic formats, thus quickly transforming editorial processes and reading habits. | ||
| NL | 29 | |
| 2020.findings-emnlp.390 Data collection for natural language (***** NL *****) understanding tasks has increasingly included human explanations alongside data points, allowing past works to introduce models that both perform a task and generate ***** NL ***** explanations for their outputs. | ||
| 2021.eacl-main.202 As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (***** NL *****) explanations. | ||
| W18-5705 We propose a reinforcement-learning-driven translation model framework able to 1) learn the translation from ***** NL ***** expressions to queries in a supervised way, and, 2) to overcome the lack of large-scale dataset by framing the translation model as a word selection approach and injecting relevance feedback as a reward in the learning process. | ||
| C16-1055 It is a further development of an existing summariser that has an incremental, proposition-based content selection process but lacks a natural language (***** NL *****) generator for the final output. | ||
| L12-1158 Among the readings available for *****NL***** sentences, those where two or more sets of entities are independent of one another are particularly challenging from both a theoretical and an empirical point of view. | ||
| syntactic parser | 29 | |
| L06-1018 Our annotator is designed to process text before the operation of a ***** syntactic parser *****. | ||
| 2010.jeptalnrecital-long.30 In addition, while detecting verbs and their subjects is a hard task, our ***** syntactic parser ***** detects VS constructions better in matrix than in non-matrix clauses. | ||
| 2020.coling-main.266 Second, we extract the implicit syntactic representations from ***** syntactic parser ***** trained with heterogeneous treebanks. | ||
| D17-1122 Although the proposed model only use a sequential LSTM for sentence modeling without any external resource such as ***** syntactic parser ***** tree and additional lexicon features, experimental results show that our model achieves state-of-the-art performance on three datasets of two tasks. | ||
| C16-1227 Instead of relying on a ***** syntactic parser ***** which might be noisy and slow to build, we compute weights representing probabilities of syntactic relations based on the Huffman softmax tree in an efficient heuristic | ||
| essay | 29 | |
| 2021.acl-demo.29 Based on the comprehensive analysis, IFlyEA provides application services for ***** essay ***** scoring, review generation, recommendation, and explainable analytical visualization. | ||
| N18-3008 We give an overview of two operational automated scoring systems —one for ***** essay ***** scoring and one for speech scoring— and describe the filtering models they use. | ||
| D18-1090 In order to address this issue, we propose a reinforcement learning framework for ***** essay ***** scoring that incorporates quadratic weighted kappa as guidance to optimize the scoring system. | ||
| 2020.bea-1.8 In this paper, we show how a deep-learning based system can outperform feature-based machine learning systems, as well as a string kernel system in scoring ***** essay ***** traits. | ||
| 2020.aacl-srw.17 Our approach demonstrates that sentiment features are beneficial for some ***** essay ***** prompts, and the performance is competitive to other deep learning models on the Automated Student Assessment Prize (ASAP) benchmark | ||
| capturing | 29 | |
| 2021.ecnlp-1.11 The model combines a generative variational autoencoder, with an integrated class-correlation gating mechanism and a hierarchical structure ***** capturing ***** dependence among products, reviews and classes. | ||
| 2020.coling-main.143 The major advantage is that it is capable of not only ***** capturing ***** both the sequential and structural information of documents but also mixing them together to benefit for multi-hop reasoning and final decision-making. | ||
| K18-1051 In this paper, we argue that there is an inherent trade-off between ***** capturing ***** similarity and faithfully modelling features as directions. | ||
| 2020.clinicalnlp-1.28 In this work, we present an ensemble method that consolidates the predictions of three models, ***** capturing ***** various attributes of textual information for automatic labeling of sentences with section labels. | ||
| 2021.nlp4if-1.10 In this paper, we propose a novel framework that considers entities mentioned in news articles and external knowledge about them, ***** capturing ***** the bias with respect to those entities | ||
| grounded | 29 | |
| D19-1064 We further propose two new complementary objectives ensuring that (1) sentences associated with the same visual content are close in the ***** grounded ***** space and (2) similarities between related elements are preserved across modalities. | ||
| Q13-1016 This paper introduces Logical Semantics with Perception (LSP), a model for ***** grounded ***** language acquisition that learns to map natural language statements to their referents in a physical environment. | ||
| P19-1326 We propose an online method to construct a graph from ***** grounded ***** information and design an algorithm to map from the resulting graphical structure to the space of the pre-trained embeddings. | ||
| 2021.emnlp-main.166 We analyze the ***** grounded ***** SCAN (gSCAN) benchmark, which was recently proposed to study systematic generalization for ***** grounded ***** language understanding | ||
| 2021.acl-srw.8 The impressive performances of pre-trained visually ***** grounded ***** language models have motivated a growing body of research investigating what has been learned during the pre-training. | ||
| tutorial | 29 | |
| P19-4009 The attendees of the ***** tutorial ***** will be able to take away from this ***** tutorial *****, (1) the basic ideas around how modern NLP and NLG techniques could be applied to describe and summarize textual data in format that is non-linguistic in nature or has some structure, and (2) a few interesting open-ended questions, which could lead to significant research contributions in future. | ||
| I17-5006 A coming ***** tutorial ***** on “Deep Learning for Semantic Composition” will be given in ACL2017. | ||
| L08-1401 The results of this hands-on exercise, carried out as part of a conference ***** tutorial *****, have served to refine FEMTIs generic contextual quality model and to obtain feedback on the FEMTI guidelines in general. | ||
| N19-5006 Around 70% of the ***** tutorial ***** will review clinical problems, cutting-edge methodologies, and real-world clinical NLP tools while another 30% introduce use cases at Mayo Clinic and the University of Minnesota | ||
| P16-5001 With this ***** tutorial *****, our aim is to introduce researchers to the areas of NLP that have dealt with multimodal signals. | ||
| Clinical | 29 | |
| 2021.ccl-1.106 However directly applying advances in deep learning to ***** Clinical ***** Event Detection tasks often produces undesirable results. | ||
| 2021.teachingnlp-1.25 We see people using NLP methods in a range of academic disciplines from Asian Studies to ***** Clinical ***** Oncology | ||
| 2020.clinicalnlp-1.11 *****Clinical***** notes contain rich information, which is relatively unexploited in predictive modeling compared to structured data. | ||
| D19-6216 *****Clinical***** notes provide important documentation critical to medical care, as well as billing and legal needs. | ||
| L10-1166 *****Clinical***** texts contain a large amount of information. | ||
| unification | 29 | |
| L06-1265 The goal of this paper is (1) to illustrate a specific procedure for merging different monolingual lexicons, focussing on techniques for detecting and mapping equivalent lexical entries, and (2) to sketch a production model that enables one to obtain lexical resources via ***** unification ***** of existing data. | ||
| L08-1452 The formalism and the engine are more flexible than either the usual shallow parsing formalisms, which assume disambiguated input, or the usual ***** unification *****-based formalisms, which couple disambiguation (via ***** unification *****) with structure building. | ||
| 2017.jeptalnrecital-recital.12 Finding Missing Categories in Incomplete Utterances This paper introduces an efficient algorithm (O(n4 )) for finding a missing category in an incomplete utterance by using ***** unification ***** technique as when learning categorial grammars, and dynamic programming as in Cocke–Younger–Kasami algorithm. | ||
| 2020.lrec-1.354 The methods described aim at solving the tension between ***** unification ***** of data sets and vocabularies on the one hand and maximum openness for the integration of future resources and adaption of external information on the other hand | ||
| N18-1069 Our solution relies on a graph reformulation of partial variable ***** unification *****s and an algorithm that induces subgraph alignments between meaning representations. | ||
| frequency | 29 | |
| 2020.emnlp-main.737 MefMax assigns tokens uniquely to ***** frequency ***** classes, trying to group tokens with similar frequencies and equalize ***** frequency ***** mass between the classes. | ||
| 2021.eacl-main.13 In this work, we show that adversarial attacks against CNN, LSTM and Transformer-based classification models perform word substitutions that are identifiable through ***** frequency ***** differences between replaced words and their corresponding substitutions. | ||
| L16-1267 We introduce Cro36WSD, a freely-available medium-sized lexical sample for Croatian word sense disambiguation (WSD). Cro36WSD comprises 36 words: 12 adjectives, 12 nouns, and 12 verbs, balanced across both ***** frequency ***** bands and polysemy levels. | ||
| L16-1562 WAGS is composed of 6,715 sentence pairs containing 11,958 occurrences of OOV and rare words up to ***** frequency ***** 15 in the Europarl Training set (5,080 English words and 6,878 Italian words), representing almost 3% of the whole text. | ||
| L14-1450 There are 51 target nouns, 51 adjectives, and 51 verbs randomly selected from 3 ***** frequency ***** groups based on the lemma ***** frequency ***** list of the German WaCKy corpus | ||
| syntactically annotated | 29 | |
| L10-1072 We describe a ***** syntactically annotated ***** parallel corpus containing typologically partly different languages, namely English, Swedish and Turkish. | ||
| L16-1248 This paper presents the construction of an open-source dependency treebank of spoken Slovenian, the first ***** syntactically annotated ***** collection of spontaneous speech in Slovenian. | ||
| L08-1450 Data models and encoding formats for ***** syntactically annotated ***** text corpora need to deal with syntactic ambiguity; underspecified representations are particularly well suited for the representation of ambiguous data because they allow for high informational efficiency. | ||
| L14-1305 In this paper, we would like to exemplify how a ***** syntactically annotated ***** bilingual treebank can help us in exploring and revising a developed linguistic theory. | ||
| L10-1192 11,000 words of Quranic Arabic have been ***** syntactically annotated ***** as part of a gold standard treebank | ||
| adaptation | 29 | |
| 2019.iwslt-1.16 We extensively explore data selection in popular multilingual NMT settings, namely in (zero-shot) translation, and in ***** adaptation ***** from a multilingual pre-trained model, for both directions (LRL↔en). | ||
| L08-1600 The remaining failures are analyzed and an outlook on ways of improving the results by ***** adaptation ***** to specific resources is given. | ||
| 2014.amta-researchers.26 The classification of the data is then used to distinguish between the different dialects, split the data accordingly, and utilize the new splits for several ***** adaptation ***** techniques. | ||
| 2011.iwslt-evaluation.2 Additional improvement was achieved not only by ***** adaptation ***** of the language model but also by parallel usage of the baseline and speaker-dependent acoustic models. | ||
| L12-1266 In the present context, alignment is meant as ***** adaptation ***** on the syntactic, semantic and pragmatic levels of communication between the two interlocutors, including choice of similar lexical items and speaking style | ||
| shallow discourse parsing | 29 | |
| K19-1072 This paper describes a novel approach for the task of end-to-end argument labeling in ***** shallow discourse parsing *****. | ||
| 2021.codi-main.12 This paper demonstrates discopy, a novel framework that makes it easy to design components for end-to-end ***** shallow discourse parsing *****. | ||
| E17-4004 Sense classification of discourse relations is a sub-task of ***** shallow discourse parsing *****. | ||
| 2020.lrec-1.139 This paper describes a novel application of semi-supervision for ***** shallow discourse parsing *****. | ||
| 2020.lrec-1.133 The aim of this is to increase usability of the corpus for the task of ***** shallow discourse parsing *****. | ||
| aspect term extraction | 29 | |
| 2020.coling-main.73 The improvements justify the effectiveness of the constituency lattice for ***** aspect term extraction *****. | ||
| 2020.acl-main.340 Aspect-based sentiment analysis (ABSA) involves three subtasks, i.e., ***** aspect term extraction *****, opinion term extraction, and aspect-level sentiment classification. | ||
| W19-0413 ***** aspect term extraction ***** and aspect sentiment classification). | ||
| L16-1429 Evaluation results show the average F-measure of 41.07% for ***** aspect term extraction ***** and accuracy of 54.05% for sentiment classification. | ||
| P19-1056 This paper focuses on two related subtasks of aspect-based sentiment analysis, namely ***** aspect term extraction ***** and aspect sentiment classification, which we call aspect term-polarity co-extraction. | ||
| reference corpus | 29 | |
| 2020.acl-main.424 ASSET is a crowdsourced multi-***** reference corpus ***** where each simplification was produced by executing several rewriting transformations. | ||
| L08-1451 The two corpora have been sampled from the 600M-word Slovene ***** reference corpus ***** FidaPLUS. | ||
| W19-4013 This paper presents the identification of formulaic sequences in the ***** reference corpus ***** of spoken Slovenian and their annotation in terms of syntactic structure, pragmatic function and lexicographic relevance. | ||
| 2020.lrec-1.409 We describe a new version of the Gigafida ***** reference corpus ***** of Slovene. | ||
| L10-1286 A particular methodology was experimented by dividing the automatic acquisition of texts, and consequently, the creation of ***** reference corpus ***** in two phases. | ||
| toxic spans | 29 | |
| 2021.semeval-1.120 The SemEval 2021 task 5: Toxic Spans Detection is a task of identifying considered-***** toxic spans ***** in text, which provides a valuable, automatic tool for moderating online contents. | ||
| 2021.semeval-1.111 This motivates the organization of the SemEval-2021 Task 5: Toxic Spans Detection competition, which has provided participants with a dataset containing ***** toxic spans ***** annotation in English posts. | ||
| 2021.semeval-1.112 Experimental results showed that the introduced auxiliary information can improve the performance of ***** toxic spans ***** detection. | ||
| 2021.semeval-1.31 We found that feeding the model with an expanded training set using Reddit comments of polarized-toxicity and labeling with LIME on top of logistic regression classification could help RoBERTa more accurately learn to recognize ***** toxic spans *****. | ||
| 2021.semeval-1.116 In this paper, we describe our solutions to tackle ***** toxic spans ***** detection. | ||
| stance classification | 29 | |
| W19-6122 This paper introduces a stance-annotated Reddit dataset for the Danish language, and describes various implementations of ***** stance classification ***** models. | ||
| C16-1230 Rumour ***** stance classification *****, the task that determines if each tweet in a collection discussing a rumour is supporting, denying, questioning or simply commenting on the rumour, has been attracting substantial interest. | ||
| S17-2084 For the competition SemEval-2017 we investigated the possibility of performing ***** stance classification ***** (support, deny, query or comment) for messages in Twitter conversation threads related to rumours. | ||
| P19-1113 In this study, we propose a new multi-task learning approach for rumor detection and ***** stance classification ***** tasks. | ||
| W17-5107 We design a joint inference method for the task by modeling argument relation classification and ***** stance classification ***** jointly. | ||
| conversational speech | 29 | |
| L14-1372 Because of the dominance of non-standard Arabic in ***** conversational speech *****, a graphemic pronunciation model (PM) is utilized. | ||
| L08-1219 Corpora of multi-modal ***** conversational speech ***** are rare and frequently difficult to use due to privacy and copyright restrictions. | ||
| 2021.acl-long.138 In this paper, we present a neural model for joint dropped pronoun recovery (DPR) and conversational discourse parsing (CDP) in Chinese ***** conversational speech *****. | ||
| P19-1107 We evaluated the models on the Switchboard ***** conversational speech ***** corpus and show that our model outperforms standard end-to-end speech recognition models. | ||
| L12-1513 A further aim of this study is to investigate the usability of audiobooks as a language resource for expressive speech synthesis of utterances of ***** conversational speech *****. | ||
| answer selection | 29 | |
| Q16-1019 How to model a pair of sentences is a critical issue in many NLP tasks such as ***** answer selection ***** (AS), paraphrase identification (PI) and textual entailment (TE). | ||
| I17-4010 We describe the work of a team from the ADAPT Centre in Ireland in addressing automatic ***** answer selection ***** for the Multi-choice Question Answering in Examinations shared task. | ||
| 2020.scai-1.2 In this paper, we show that question rewriting (QR) of the conversational context allows to shed more light on this phenomenon and also use it to evaluate robustness of different ***** answer selection ***** approaches. | ||
| P17-1168 This enables the reader to build query-specific representations of tokens in the document for accurate ***** answer selection *****. | ||
| D19-1604 In this paper, we establish the effectiveness of using hard negatives, coupled with a siamese network and a suitable loss function, for the tasks of ***** answer selection ***** and answer triggering. | ||
| free text | 29 | |
| 2021.wanlp-1.15 Twitter allows users to declare their locations as ***** free text *****, and these user-declared locations are often noisy and hard to decipher automatically. | ||
| 2004.amta-papers.20 Named-entities in ***** free text ***** represent a challenge to text analysis in Machine Translation and Cross Language Information Retrieval. | ||
| N19-4008 In this paper, we introduce an approach that builds executable probabilistic models from raw, ***** free text *****. | ||
| L12-1605 Automatically segmenting and classifying clinical ***** free text ***** into sections is an important first step to automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within. | ||
| 2020.semeval-1.96 This paper describes the unixlong team's system for the SemEval 2020 task6: DeftEval: Extracting term-definition pairs in ***** free text *****. | ||
| discussion | 29 | |
| 1993.eamt-1.11 TransLexis takes up several ideas emerging from the reuse ***** discussion *****. | ||
| L14-1337 Interoperability of annotation schemes is one of the key words in the ***** discussion *****s about annotation of corpora. | ||
| L14-1015 We annotate ***** discussion *****s drawn from two different sets of corpora in order to ensure that our model of social roles and their signals hold up in general. | ||
| W18-5902 The occurrence of stance-taking towards vaccination was measured in documents extracted by topic modelling from two different corpora, one ***** discussion ***** forum corpus and one tweet corpus. | ||
| 2021.emnlp-main.150 However, current ***** discussion *****s primarily treat gender as binary, which can perpetuate harms such as the cyclical erasure of non-binary gender identities. | ||
| math word | 29 | |
| D17-1084 It first retrieves a few relevant equation system templates and aligns numbers in ***** math word ***** problems to those templates for candidate equation generation. | ||
| 2020.coling-main.38 Recently, to address the ***** math word ***** problem-solving task, researchers have applied the encoder-decoder architecture, which is mainly used in machine translation tasks. | ||
| 2021.acl-long.456 Previous ***** math word ***** problem solvers following the encoder-decoder paradigm fail to explicitly incorporate essential math symbolic constraints, leading to unexplainable and unreasonable predictions. | ||
| U19-1024 We have experimented our model on the tasks of semantic parsing and ***** math word ***** problem solving. | ||
| 2021.acl-short.121 With the recent advancements in deep learning, neural solvers have gained promising results in solving ***** math word ***** problems. | ||
| online hate speech | 29 | |
| W17-1101 Given the steadily growing body of social media content, the amount of ***** online hate speech ***** is also increasing. | ||
| 2021.cinlp-1.6 This survey summarises the relevant research that revolves around estimations of causal effects related to ***** online hate speech *****. | ||
| I17-1078 To address various limitations of supervised hate speech classification methods including corpus bias and huge cost of annotation, we propose a weakly supervised two-path bootstrapping approach for an ***** online hate speech ***** detection model leveraging large-scale unlabeled data. | ||
| 2020.coling-main.557 Academia and industry have developed machine learning and natural language processing models to detect ***** online hate speech ***** automatically. | ||
| W18-5102 Over the past years, interest in ***** online hate speech ***** detection and, particularly, the automation of this task has continuously grown, along with the societal impact of the phenomenon. | ||
| neural text simplification | 29 | |
| P19-1037 This work uses ***** neural text simplification ***** methods to automatically improve the understandability of clinical letters for patients. | ||
| P19-1198 The paper presents a first attempt towards unsupervised ***** neural text simplification ***** that relies only on unlabeled text corpora. | ||
| W19-2305 *****Neural text simplification***** has gained increasing attention in the NLP community thanks to recent advancements in deep sequence-to-sequence learning. | ||
| P17-2014 Unlike the previously proposed automated TS systems, our *****neural text simplification***** (NTS) systems are able to simultaneously perform lexical simplification and content reduction. | ||
| 2020.lrec-1.686 This work presents a replication study of Exploring *****Neural Text Simplification***** Models (Nisioi et al., 2017). | ||
| paradigm completion | 29 | |
| D17-1074 We overview the theoretical motivation for a paradigmatic treatment of derivational morphology, and introduce the task of derivational ***** paradigm completion ***** as a parallel to inflectional ***** paradigm completion *****. | ||
| 2020.acl-main.733 We propose a frugal ***** paradigm completion ***** approach that predicts all related forms in a morphological paradigm from as few manually provided forms as possible. | ||
| 2020.acl-main.598 We propose the task of unsupervised morphological ***** paradigm completion *****. | ||
| 2020.sigmorphon-1.9 In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS–CUBoulder) for SIGMORPHON 2020 Task 2 on unsupervised morphological ***** paradigm completion ***** (Kann et al., 2020). | ||
| 2020.sigmorphon-1.8 We describe the NYU-CUBoulder systems for the SIGMORPHON 2020 Task 0 on typologically diverse morphological inflection and Task 2 on unsupervised morphological ***** paradigm completion *****. | ||
| constituency | 29 | |
| P18-2075 On four ***** constituency ***** parsers in three languages, the method substantially outperforms static oracle likelihood training in almost all settings. | ||
| D17-1174 One of the most pressing issues in discontinuous ***** constituency ***** transition-based parsing is that the relevant information for parsing decisions could be located in any part of the stack or the buffer. | ||
| P17-2025 Recent work has proposed several generative neural models for ***** constituency ***** parsing that achieve state-of-the-art results. | ||
| 2020.emnlp-main.354 In this work, we study visually grounded grammar induction and learn a ***** constituency ***** parser from both unlabeled text and its visual groundings. | ||
| L14-1324 PARSEVAL, the default paradigm for evaluating ***** constituency ***** parsers, calculates parsing success (Precision/Recall) as a function of the number of matching labeled brackets across the test set. | ||
| minority class | 29 | |
| 2020.trac-1.12 BERT based classifiers were found to predict the *****minority classes***** better. | ||
| 2020.emnlp-main.638 Our results demonstrate that AL can boost BERT performance, especially in the most realistic scenario in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the *****minority class*****. | ||
| L12-1091 However, we achieve significant gains in accuracy with the trigram tagger, and significant gains in performance recognition of *****minority class***** instances with both taggers via Balanced Classification Rate. | ||
| 2021.emnlp-main.252 Difficult samples of the *****minority class***** in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class. | ||
| 2020.semeval-1.226 Our results reveal that the linguistic features are the strong indicators for covering *****minority classes***** in a highly imbalanced dataset. | ||
| user-generated | 29 | |
| L14-1532 The ***** user-generated ***** content represents an increasing share of the information available today. | ||
| 2021.ranlp-1.62 We use a deep bidirectional transformer to extract the Myers-Briggs personality type from ***** user-generated ***** data in a multi-label and multi-class classification setting. | ||
| W19-3707 Nowadays it is becoming more important than ever to find new ways of extracting useful information from the ever-growing amount of ***** user-generated ***** data available online. | ||
| 2020.acl-main.514 Affective tasks such as sentiment analysis, emotion classification, and sarcasm detection have been popular in recent years due to an abundance of ***** user-generated ***** data, accurate computational linguistic models, and a broad range of relevant applications in various domains. | ||
| W16-3906 Information extraction from ***** user-generated ***** text has gained much attention with the growth of the Web. Disaster analysis using information from social media provides valuable, real-time, geolocation information for helping people caught up in these disasters. | ||
| reuse | 28 | |
| W19-2514 The detection of allusive text ***** reuse ***** is particularly challenging due to the sparse evidence on which allusive references rely — commonly based on none or very few shared words. | ||
| L06-1421 In this paper, we present an open source toolkit for Malay incorporating a word and sentence tokeniser, a lemmatiser and a partial POS tagger, based on heavy ***** reuse ***** of pre-existing language resources. | ||
| W16-4118 The first method (M1) was originally developed for the task of text ***** reuse ***** detection, measuring sentence similarity by a modified version of a TF-IDF vector space model. | ||
| W17-3530 To reduce the human effort needed in the generation of the linguistic resources for a new domain, the general aspects that can be ***** reuse *****d across domains are separated from those more specific. | ||
| 2020.lrec-1.275 This corpus is designed to be as modular as possible in order to allow for maximum ***** reuse ***** in different tasks pertaining to Economics, Finance and Regulation | ||
| facilitating | 28 | |
| 2020.acl-main.358 Our framework focuses on the design of scoring functions and highlights two critical characteristics: 1) ***** facilitating ***** sufficient feature interactions; 2) preserving both symmetry and antisymmetry properties of relations. | ||
| L12-1040 The output can be provided in several formats: XML, RDF triples, logic forms or plain text, ***** facilitating ***** interoperability with other tools. | ||
| C18-2035 We aim at ***** facilitating ***** easy exploration of model structures for multiple languages with different characteristics. | ||
| 2020.findings-emnlp.291 Specifically, we show the effectiveness of our approach at ***** facilitating ***** bias analysis by finding topics that correspond to demographic inequalities in generated text and comparing the relative effectiveness of inducing biases for different demographics. | ||
| L14-1164 The article presents the idea of transforming a conventional digital library into knowledge source and research collaboration platform, ***** facilitating ***** content augmentation, interpretation and co-operation of geographically distributed researchers representing different academic fields | ||
| referent | 28 | |
| I17-2023 The gesture form on its own is often ambiguous, and the aspect of the ***** referent ***** that it highlights is constrained by what the language makes salient. | ||
| 2003.mtsummit-papers.35 This kind of ellipsis deserves attention in view of the fact that its ***** referent ***** is the agent of the sentence and that these constructions are observed in diverse languages. | ||
| 2020.winlp-1.9 An interesting challenge for situated dialogue systems is ***** referent *****ial visual dialog: by asking questions, the system has to identify the ***** referent ***** to which the user refers to. | ||
| L10-1397 This paper draws a distinction between discourse context ―other entities that have been mentioned in the dialogue― and visual context ―visually available objects near the intended ***** referent *****. | ||
| C18-1169 Extracting location names from informal and unstructured social media data requires the identification of ***** referent ***** boundaries and partitioning compound names | ||
| pronunciations | 28 | |
| L08-1477 A new procedure is described for generating ***** pronunciations ***** for a dictionary of place-names in a less-resourced language (Welsh, spoken in Wales, UK). | ||
| L04-1264 First, we detail the processes used to generate possible ***** pronunciations ***** for each sentence and to select to most likely one. | ||
| L10-1013 The system includes various methods for extracting potential terms of interest from raw text, for providing guesses on the ***** pronunciations ***** of terms, and for comparing two strings as possible transliterations using both phonetic and temporal measures. | ||
| L06-1430 Since manual phonetic annotations are available for the speech data, the evaluation was performed on the transcription level by measuring the phonetic distance of the automatically generated ***** pronunciations ***** variants and actual ***** pronunciations ***** of non-native speakers. | ||
| 2020.lrec-1.331 This generation of ***** pronunciations ***** for previously unknown words is key in training extensible automated speech recognition (ASR) systems, which are key beneficiaries of this dictionary | ||
| DL | 28 | |
| 2016.gwc-1.6 Both AGWN and ***** DL ***** are works in progress that need accuracy improvement and manual validation. | ||
| L08-1266 A ***** DL ***** mapping ontology is generated as result of the mapping process. | ||
| 2021.hcinlp-1.4 We propose to integrate a user interface with an underlying ***** DL ***** model, instead of tackling summarization as an isolated task from the end user. | ||
| D19-5016 Recently, several researchers have proposed deep learning (***** DL *****) models to address this issue. | ||
| 2020.wmt-1.74 Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but not document-level (DL) MT, which is difficult to 1) train with little amount of ***** DL ***** data; and 2) evaluate, as the main methods and data sets focus on SL evaluation. | ||
| EM | 28 | |
| Q13-1024 We also propose an approximate ***** EM ***** algorithm and a Gibbs sampling algorithm to estimate model parameters in an unsupervised manner. | ||
| 2020.iwpt-1.9 In particular, Dirichlet priors during ***** EM ***** training, ensemble models, and a new nonterminal scheme for hybrid grammars are evaluated. | ||
| 2020.findings-emnlp.91 The experimental results show that the ***** EM ***** scores obtained by two baselines are below 20%, while the hybrid model can achieve an ***** EM ***** over 40%. | ||
| D19-1599 In particular, on the OpenSQuAD dataset, our model gains 21.4% ***** EM ***** and 21.5% F1 over all non-BERT models, and 5.8% ***** EM ***** and 6.5% F1 over BERT-based models | ||
| L10-1434 This paper presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-order co-occurrence of predicates and complement properties; (ii) an EM-based clustering approach that models the strengths of predicate-noun relationships by latent semantic clusters (Rooth et al., 1999); and (iii) an extension of the latent semantic clusters by incorporating the MDL principle into the ***** EM ***** training, thus explicitly modelling the predicate-noun selectional preferences by WordNet classes (Schulte im Walde et al., 2008). | ||
| biaffine | 28 | |
| 2020.emnlp-main.317 Our model consists of an attention only stacked encoder and a light enough decoder for the greedy segmentation plus two highway connections for smoother training, in which the encoder is composed of a newly proposed Transformer variant, Gaussian-masked Directional (GD) Transformer, and a ***** biaffine ***** attention scorer. | ||
| 2021.eacl-main.55 In a thorough analysis, we investigate the factors that contribute to the success of our model: the ***** biaffine ***** model itself, our representation for the dependency structure of arguments, different encoders in the ***** biaffine ***** model, and syntactic information additionally fed to the model. | ||
| 2020.acl-main.134 We formulate the task as a Traveling Salesman Problem (TSP), and use a ***** biaffine ***** attention model to calculate the edge costs. | ||
| 2020.acl-main.577 The ***** biaffine ***** model scores pairs of start and end tokens in a sentence which we use to explore all spans, so that the model is able to predict named entities accurately. | ||
| W19-4204 In this paper we describe our system for morphological analysis and lemmatization in context, using a transformer-based sequence to sequence model and a ***** biaffine ***** attention based BiLSTM model | ||
| CWS | 28 | |
| 2020.iwdp-1.5 This paper proposes a three-step strategy to improve the performance for discourse ***** CWS *****. | ||
| I17-1017 Recently, many character-based neural models have been applied to ***** CWS *****. | ||
| 2020.acl-main.735 In this paper, we propose a neural model named TwASP for joint ***** CWS ***** and POS tagging following the character-based sequence labeling paradigm, where a two-way attention mechanism is used to incorporate both context feature and their corresponding syntactic knowledge for each input character. | ||
| 2021.acl-short.70 To this end, we attempt to combine the multi-modality (mainly the converted text and actual voice information) to perform ***** CWS *****. | ||
| 2020.emnlp-main.318 Existing methods have already achieved a competitive performance for ***** CWS ***** on large-scale annotated corpora | ||
| segmenting | 28 | |
| 2020.lrec-1.317 Our study allowed us to develop two new models for ***** segmenting ***** impaired speech transcriptions, along with an ideal combination of datasets and specific groups of narratives to be used as the training set. | ||
| Q14-1014 We introduce a method for automatically ***** segmenting ***** a corpus into chunks such that many uncertain labels are grouped into the same chunk, while human supervision can be omitted altogether for other segments. | ||
| 2020.acl-main.275 Empirical results on machine translation suggest that DPE is effective for ***** segmenting ***** output sentences and can be combined with BPE dropout for stochastic segmentation of source sentences. | ||
| L10-1026 We also discuss the importance of ***** segmenting ***** the text; experiments show up to 6F points improvement of the mention detection system performance when morphological segmentation is used instead of not ***** segmenting ***** the text. | ||
| W18-4303 This position paper explores the above proposition with respect to narrative theory and ongoing research on ***** segmenting ***** event chains into narrative units | ||
| SL | 28 | |
| 2021.eacl-main.2 To generate such specific questions, we propose Multi-Source Coordinated Question Generator (MSCQG), a novel framework that includes a supervised learning (***** SL *****) stage and a reinforcement learning (RL) stage. | ||
| 1999.mtsummit-1.47 The system is meant for a source language (***** SL *****) speaker who does not know the target language (TL). | ||
| W16-3711 This research will prove to be beneficial for developing efficient MT systems if the mentioned factors are incorporated considering the inherent structural constraints between ***** SL ***** and TL pairs. | ||
| 2020.signlang-1.34 The Hamburg Notation System (HamNoSys) was developed for movement annotation of any sign language (***** SL *****) and can be used to produce signing animations for a virtual avatar with the JASigning platform. | ||
| 2020.signlang-1.21 One of these conditions is that many of the prerequesites for the automatic syntactic parsing of corpora are not yet available for ***** SL ***** | ||
| stacking | 28 | |
| W18-5103 We demonstrated that leveraging more advanced technologies such as word embeddings, recurrent neural networks, attention mechanism, ***** stacking ***** of classifiers and semi-supervised training can improve the ROC AUC score of classification to 0.9862. | ||
| W17-2305 We report experiments that show that naive ensembling does not always outperform component Entity Linking systems, that ***** stacking ***** usually outperforms naive ensembling, and that auxiliary features added to the stacker further improve its performance on three distinct datasets. | ||
| 2021.eacl-main.189 Using a novel layer ablation technique and analyses of the model's internal representations, we show that multilingual BERT, a popular multilingual language model, can be viewed as the ***** stacking ***** of two sub-networks: a multilingual encoder followed by a task-specific language-agnostic predictor. | ||
| S19-2142 The third subsystem is a ***** stacking ***** ensemble of four different deep learning architectures. | ||
| N18-1090 We present a treebank of Hindi-English code-switching tweets under Universal Dependencies scheme and propose a neural ***** stacking ***** model for parsing that efficiently leverages the part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks | ||
| structuring | 28 | |
| 2021.nodalida-main.11 As such, it is suitable as a low-cost (in terms of implementation effort) baseline for document ***** structuring ***** prior to introduction of domain-specific knowledge. | ||
| 2007.jeptalnrecital-long.1 In this article we address the task of automatic text ***** structuring ***** into linear and non-overlapping thematic episodes. | ||
| 2020.findings-emnlp.281 Text ***** structuring ***** is a fundamental step in NLG, especially when generating multi-sentential text. | ||
| C16-2008 Our system is operational online, implementing core mechanisms for document ***** structuring ***** and controlled writing. | ||
| 2020.clinicalnlp-1.27 As the process of manual eligibility determination is time-consuming, automatic ***** structuring ***** of the eligibility criteria into various semantic categories or aspects is the need of the hour | ||
| concise | 28 | |
| W19-3635 Question Answering (QA) systems attempt to provide users ***** concise ***** answer(s) to natural language questions. | ||
| 2021.emnlp-main.490 However, these extractive explanations are not necessarily ***** concise ***** i.e. not minimally sufficient for answering a question. | ||
| L16-1625 On the other hand, machines need to understand the information that is published in online data streams and generate ***** concise ***** and meaningful overviews. | ||
| N19-3011 Twitter has become a major source for information about real-world events because of the use of hashtags and the small word limit of Twitter that ensures ***** concise ***** presentation of events | ||
| 2020.emnlp-main.295 Sentence-level extractive text summarization is substantially a node classification task of network mining, adhering to the informative components and ***** concise ***** representations. | ||
| variables | 28 | |
| W19-1102 The agents, objects, and other roles in the schemas are represented by typed ***** variables *****, and the event ***** variables ***** can be related through partial temporal ordering and causal relations. | ||
| 2001.mtsummit-ebmt.3 The strings and ***** variables *****, of which translations patterns are composed, are aligned in order to provide a more refined bilingual knowledge source, necessary for the recombination phase. | ||
| 2020.emnlp-main.115 Further, free-text medical notes may contain information not immediately available in structured ***** variables *****. | ||
| C18-1043 The sketch is obtained by using placeholders for specific entities in the SQL query, such as column names, table names, aliases and ***** variables *****, in a process similar to semantic parsing. | ||
| 2020.sigdial-1.25 Our experiences consist of comparing convergence on low level ***** variables ***** (Energy, Pitch, Speech Rate) measured on raw data sets, with human and automatically DA-labelled data sets | ||
| metaphorical | 28 | |
| P18-2024 Furthermore, most studies focus on English, and few in other languages, particularly Sino-Tibetan languages such as Chinese, for emotion analysis from ***** metaphorical ***** texts, although there are likely to be many differences in emotional expressions of ***** metaphorical ***** usages across different languages. | ||
| W18-1404 Prior methodologies for understanding spatial language have treated literal expressions such as “Mary pushed the car over the edge” differently from ***** metaphorical ***** extensions such as “Mary's job pushed her over the edge”. | ||
| L14-1359 An alternative approach to metaphor detection emphasizes the fact that many metaphors become conventionalized collocations, while still preserving their active ***** metaphorical ***** status. | ||
| P18-1113 Current word embedding based metaphor identification models cannot identify the exact ***** metaphorical ***** words within a sentence. | ||
| 2021.ccl-1.80 We also use the emotional information of words to learn the emotional consistency features of ***** metaphorical ***** words and their context | ||
| Mathematical | 28 | |
| W19-2610 ***** Mathematical ***** expressions (ME) are widely used in scholar documents. | ||
| 2020.lrec-1.266 ***** Mathematical ***** text is written using a combination of words and mathematical expressions. | ||
| 2000.iwpt-1.36 ***** Mathematical ***** equations in LaTeX are composed with tags that express formatting as opposed to structure. | ||
| 2021.eacl-main.282 ***** Mathematical ***** statements written in natural language are usually composed of two different modalities: mathematical elements and natural language. | ||
| 2021.emnlp-main.273 ***** Mathematical ***** reasoning aims to infer satisfiable solutions based on the given mathematics questions. | ||
| synthesized | 28 | |
| L12-1318 Our results suggest that a professional human voice can supersede both an amateur human voice and ***** synthesized ***** voices. | ||
| 2021.naacl-main.220 Our empirical results show that the ***** synthesized ***** data generated from our model can substantially help a semantic parser achieve better compositional and domain generalization. | ||
| N18-1057 Surprisingly, when trained on additional data ***** synthesized ***** using our best-performing noising scheme, our model approaches the same performance as when trained on additional non***** synthesized ***** data. | ||
| L14-1566 Presence of appropriate acoustic cues of affective features in the ***** synthesized ***** speech can be a prerequisite for the proper evaluation of the semantic content by the message recipient. | ||
| W18-0506 Then we report on the results of a shared task challenge aimed at studying the SLA task via this corpus, which attracted 15 teams and ***** synthesized ***** work from various fields including cognitive science, linguistics, and machine learning. | ||
| pun | 28 | |
| S17-2072 In SemEval Task 7 PunFields shows a considerably good result in ***** pun ***** classification, but requires improvement in searching for the target word and its definition. | ||
| S17-2077 We consider this system as a precursor for deeper exploration on efficient feature selection for ***** pun ***** detection. | ||
| 2020.emnlp-main.229 We further make an error analysis and discuss the challenges for the computational ***** pun ***** models. | ||
| S17-2076 We use the output of ***** pun ***** interpretation for ***** pun ***** location. | ||
| D19-1336 In this paper, we focus on the task of generating a *****pun***** sentence given a pair of word senses. | ||
| clauses | 28 | |
| 2020.iwdp-1.9 NT Clause Complex Framework defines a clause complex as a combination of NT ***** clauses ***** through component sharing and logic-semantic relationship. | ||
| R19-1047 Proposition extraction from sentences is an important task for information extraction systems. Evaluation of such systems usually conflates two aspects: splitting complex sentences into ***** clauses ***** and the extraction of propositions. | ||
| 2020.emnlp-main.252 Furthermore, we propose a prediction aggregation module with low computational overhead to fine-tune the prediction results based on the characteristics of the input ***** clauses *****. | ||
| L12-1443 We introduce a simple XML-based syntax for the annotation of factive verbs and ***** clauses *****, in order to capture this information. | ||
| 2020.emnlp-main.290 To tackle these shortcomings, we propose two joint frameworks for ECPE: 1) multi-label learning for the extraction of the cause ***** clauses ***** corresponding to the specified emotion clause (CMLL) and 2) multi-label learning for the extraction of the emotion ***** clauses ***** corresponding to the specified cause clause (EMLL). | ||
| beam | 28 | |
| 2021.naacl-main.429 Well-trained GEC models can generate several high-quality hypotheses through decoding, such as ***** beam ***** search, which provide valuable GEC evidence and can be used to evaluate GEC quality. | ||
| I17-2029 However, traditional seq2seq suffer from a severe weakness: during ***** beam ***** search decoding, they tend to rank universal replies at the top of the candidate list, resulting in the lack of diversity among candidate replies. | ||
| 2021.acl-long.512 By posing iterations in ***** beam ***** search as a series of subdeterminant maximization problems, we can turn the algorithm into a diverse subset selection process. | ||
| 2020.findings-emnlp.406 Structured prediction is often approached by training a locally normalized model with maximum likelihood and decoding approximately with ***** beam ***** search. | ||
| 2020.acl-main.325 Previous work either required re-training existing models with the lexical constraints or incorporating them during ***** beam ***** search decoding with significantly higher computational overheads. | ||
| systematic | 28 | |
| L08-1057 The scope of our guidelines is limited to the alignment between Chinese and Korean, but the instruction methods exemplified in this paper are also applicable in developing ***** systematic ***** and comprehensible alignment guidelines for other languages having such different linguistic phenomena. | ||
| 2021.inlg-1.22 Despite there being a number of style tasks with available data, there has been limited ***** systematic ***** discussion of how text style datasets relate to each other. | ||
| 2020.emnlp-main.236 Furthermore, we demonstrate how to grow the norm and direction of word vectors (vector converter); this is a new ***** systematic ***** approach derived from the sentence-vector estimation methods, which can significantly improve the performance of the proposed method. | ||
| 2020.findings-emnlp.117 Unfortunately, when a dataset has ***** systematic ***** gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture the abilities a dataset is intended to test. | ||
| 2021.nlp4convai-1.17 Experiments on personalized test set showed that our personalized QR system is able to correct ***** systematic ***** and user errors by utilizing phonetic and semantic inputs. | ||
| introduce | 28 | |
| 2020.ccl-1.93 However, word alignment methods ***** introduce ***** alignment errors inevitably. | ||
| C18-2003 We here ***** introduce ***** a substantially extended version of JeSemE, an interactive website for visually exploring computationally derived time-variant information on word meanings and lexical emotions assembled from five large diachronic text corpora. | ||
| 2020.figlang-1.4 However, few types of research ***** introduce ***** the rich linguistic information into the field of computational metaphor by leveraging powerful pre-training language models. | ||
| D19-5626 These solutions ***** introduce ***** a significant overhead of additional resources and computational costs. | ||
| 2020.acl-main.112 In order to break this bottleneck, we here ***** introduce ***** a methodology for creating almost arbitrarily large emotion lexicons for any target language. | ||
| monolingual embedding | 28 | |
| 2020.lrec-1.499 In this paper, we present the first viability study of established techniques to align ***** monolingual embedding ***** spaces for Turkish, Uzbek, Azeri, Kazakh and Kyrgyz, members of the Turkic family which is heavily affected by the low-resource constraint. | ||
| 2021.acl-srw.17 Unsupervised cross-lingual word embedding(CLWE) methods learn a linear transformation matrix that maps two ***** monolingual embedding ***** spaces that are separately trained with monolingual corpora. | ||
| 2020.emnlp-main.186 In this work we present a large-scale study focused on the correlations between ***** monolingual embedding ***** space similarity and task performance, covering thousands of language pairs and four different tasks: BLI, parsing, POS tagging and MT. | ||
| 2021.mrl-1.9 This phenomenon manifests in that i) despite of the smaller corpus sizes, using only the comparable parts of Wikipedia for training the ***** monolingual embedding ***** spaces to be mapped is often more efficient than relying on all the contents of Wikipedia, ii) the smaller, in return less diversified Spanish Wikipedia works almost always much better as a training corpus for bilingual mappings than the ubiquitously used English Wikipedia. | ||
| N19-1161 In this paper, we propose an approach that instead expresses the two ***** monolingual embedding ***** spaces as probability densities defined by a Gaussian mixture model, and matches the two densities using a method called normalizing flow | ||
| categorial grammars | 28 | |
| 1995.iwpt-1.14 We present an approach to non-constituent coordination within ***** categorial grammars *****, and reformulate it as a generic rule. | ||
| 2020.repl4nlp-1.23 In this paper, we make use of the primitives and operators that constitute the lexical categories of ***** categorial grammars *****. | ||
| 2017.jeptalnrecital-recital.12 Finding Missing Categories in Incomplete Utterances This paper introduces an efficient algorithm (O(n4 )) for finding a missing category in an incomplete utterance by using unification technique as when learning ***** categorial grammars *****, and dynamic programming as in Cocke–Younger–Kasami algorithm. | ||
| 2021.semspace-1.3 We start by showing that standard ***** categorial grammars ***** can be expressed as a biclosed category, where all rules emerge as currying/uncurrying the identity; we then proceed to model permutation-inducing rules by exploiting the symmetry of the compact closed category encoding the word meaning | ||
| N19-1160 Combinatory ***** categorial grammars ***** are linguistically motivated and useful for semantic parsing, but costly to acquire in a supervised way and difficult to acquire in an unsupervised way. | ||
| online reviews | 28 | |
| 2021.newsum-1.9 For many NLP applications of ***** online reviews *****, comparison of two opinion-bearing sentences is key. | ||
| 2020.ecnlp-1.11 In recent years, there has been an increase in online shopping resulting in an increased number of ***** online reviews *****. | ||
| 2020.coling-main.37 Users express their opinions towards entities (e.g., restaurants) via ***** online reviews ***** which can be in diverse forms such as text, ratings, and images. | ||
| W17-4420 This paper reports our participation in the W-NUT 2017 shared task on emerging and rare entity recognition from user generated noisy text such as tweets, ***** online reviews ***** and forum discussions. | ||
| R17-1102 In this paper, we apply a data collection method based on social network analysis to quickly identify high quality deceptive and truthful ***** online reviews *****1 from Amazon. | ||
| frame semantics | 28 | |
| L06-1195 In this paper we discuss the annotation framework (***** frame semantics *****) and its cross-lingual applicability, problems arising from exhaustive annotation, strategies for quality control, and possible applications. | ||
| W19-8704 We propose a metric for machine translation evaluation based on ***** frame semantics ***** which does not require the use of reference translations or human corrections, but is aimed at comparing original and translated output directly. | ||
| W16-4412 In this paper, we propose a ***** frame semantics *****-based semantic parsing approach as KB-independent question pre-processing. | ||
| W19-4514 We also propose a verb-centric ***** frame semantics ***** with an effective set of semantic roles in order to support the analysis. | ||
| L12-1582 We also illustrate some application areas of the Latvian resource grammar, and briefly discuss the limitations of the RGL and potential long-term improvements using ***** frame semantics *****. | ||
| source sentence | 28 | |
| 2020.acl-main.22 Our work, inspired by pre-ordering literature in machine translation, uses syntactic transformations to softly “reorder” the ***** source sentence ***** and guide our neural paraphrasing model. | ||
| P17-1012 The prevalent approach to neural machine translation relies on bi-directional LSTMs to encode the ***** source sentence *****. | ||
| D19-1074 CapsNMT uses an aggregation mechanism to map the ***** source sentence ***** into a matrix with pre-determined size, and then applies a deep LSTM network to decode the target sequence from the source representation. | ||
| P17-1064 Even though a linguistics-free sequence to sequence model in neural machine translation (NMT) has certain capability of implicitly learning syntactic information of ***** source sentence *****s, this paper shows that source syntax can be explicitly incorporated into NMT effectively to provide further improvements. | ||
| 2020.wat-1.19 Unlike sentence-level MT, which translates the sentences independently, document-level MT aims to utilize contextual information while translating a given ***** source sentence *****. | ||
| test set | 28 | |
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the ***** test set ***** was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context. | ||
| 2020.semeval-1.121 However, on the ***** test set ***** it obtained an F1 score of 0.342. | ||
| 2020.wanlp-1.15 Our best system achieves 80.6% word accuracy and 58.7% BLEU on a blind ***** test set *****. | ||
| D19-1258 The system is evaluated on both fact verification and open-domain multihop QA, achieving state-of-the-art results on the leaderboard ***** test set *****s of both FEVER and HOTPOTQA. | ||
| 2020.emnlp-main.381 Besides, we present the first results of comparing multilingual models in the translated diagnostic ***** test set ***** and offer the first steps to further expanding or assessing State-of-the-art models independently of language. | ||
| background knowledge | 28 | |
| D19-1590 Furthermore, we investigate the effect of supplying ***** background knowledge ***** to our classifiers. | ||
| 2021.nlp4convai-1.23 Humans make appropriate responses not only based on previous dialogue utterances but also on implicit ***** background knowledge ***** such as common sense. | ||
| N18-1197 Can this linguistic ***** background knowledge ***** improve the generality and efficiency of learned classifiers and control policies? | ||
| 2012.amta-tutorials.1 We focus on ***** background knowledge ***** that will help you both get more out of the rest of AMTA2010 and to make better decisions about how to invest in machine translation. | ||
| 2012.amta-tutorials.5 We will start with ***** background knowledge ***** of statistical machine translation and then walk you through the process of installing and running an SMT system. | ||
| argument structure | 28 | |
| 2020.lrec-1.143 Our corpus can be used as a resource for analyzing persuasiveness and training an argument mining system to identify and extract ***** argument structure *****s. | ||
| 2020.emnlp-main.375 First, we assess few-shot learning capabilities by developing controlled experiments that probe models' syntactic nominal number and verbal ***** argument structure ***** generalizations for tokens seen as few as two times during training. | ||
| 2021.sigdial-1.39 The results vary between the investigated topics (and hence depend on the quality of the underlying data) but are in some instances surprisingly close to the results achieved with a manually annotated ***** argument structure *****. | ||
| L10-1469 Many natural language processing tasks, including information extraction, question answering and recognizing textual entailment, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-***** argument structure ***** analysis. | ||
| 2020.coling-main.114 The meaning of natural language text is supported by cohesion among various kinds of entities, including coreference relations, predicate-***** argument structure *****s, and bridging anaphora relations. | ||
| source text | 28 | |
| C16-1060 This allows us to quantify the considerable variation in accuracy depending on the specific ***** source text *****(s) used, even with different translations into the same language. | ||
| 2021.emnlp-main.421 This standard approach falls short, however, when a user's intent or context of work is not easily recoverable based solely on that ***** source text *****– a scenario that we argue is more of the rule than the exception. | ||
| 2020.findings-emnlp.159 In neural text editing, prevalent sequence-to-sequence based approaches directly map the unedited text either to the edited text or the editing operations, in which the performance is degraded by the limited ***** source text ***** encoding and long, varying decoding steps. | ||
| L12-1595 We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the ***** source text ***** according to the target word order suggested by an initial word alignment. | ||
| 2020.conll-1.19 Further analysis of misleading translations revealed that the most frequent error types are ambiguity, mistranslation, noun phrase error, word-by-word translation, untranslated word, subject-verb agreement, and spelling error in the ***** source text *****. | ||
| facts | 28 | |
| 2020.ai4hi-1.3 Our research question is motivated by two ***** facts *****. | ||
| L14-1091 Here, we extend the distant supervision approach to template-based event extraction, focusing on the extraction of passenger counts, aircraft types, and other ***** facts ***** concerning airplane crash events. | ||
| 2021.naacl-main.68 Extensive experiments on two benchmark datasets show that BEUrRE consistently outperforms baselines on confidence prediction and fact ranking due to its probabilistic calibration and ability to capture high-order dependencies among ***** facts *****. | ||
| 2021.emnlp-main.216 Low-resource Relation Extraction (LRE) aims to extract relation ***** facts ***** from limited labeled corpora when human annotation is scarce. | ||
| D19-1299 This allows the model to infer relevant ***** facts ***** which are not explicitly stated in the data table from an external knowledge source. | ||
| speech processing | 28 | |
| L14-1277 Speech disfluency is one of the most challenging tasks to deal with in automatic ***** speech processing *****. | ||
| L16-1313 Vocal User Interfaces in domestic environments recently gained interest in the ***** speech processing ***** community. | ||
| 2020.lrec-1.513 Thus, the research community can use the corpora to further improve ***** speech processing ***** systems. | ||
| W17-1317 The success of machine learning for automatic ***** speech processing ***** has raised the need for large scale datasets. | ||
| D19-1566 In recent times, multi-modal analysis has been an emerging and highly sought-after field at the intersection of natural language processing, computer vision, and ***** speech processing *****. | ||
| state | 28 | |
| D19-1566 Experimental results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various existing ***** state *****-of-the-art systems. | ||
| 2020.emnlp-main.459 The experimental results on five datasets sampled from Freebase, NELL and Wikidata show that our method outperforms ***** state *****-of-the-art baselines. | ||
| 2020.repl4nlp-1.24 We highlight that on several tasks while such perturbations are natural, ***** state ***** of the art trained models are surprisingly brittle. | ||
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting ***** state *****-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing behaviors easily missed by human experts. | ||
| 2020.findings-emnlp.142 Existing DST models either ignore temporal feature dependencies across dialogue turns or fail to explicitly model temporal *****state***** dependencies in a dialogue. | ||
| feature engineering | 28 | |
| W18-0540 We developed solutions following three approaches: (i) a ***** feature engineering ***** method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large text corpus to produce a contextualized word vector. | ||
| K17-1032 It is also a constraint for classifiers which employ manual ***** feature engineering *****. | ||
| C18-1237 The system is simple and does not require any manual ***** feature engineering *****. | ||
| C18-1123 With the modeling power of RNN, we achieve superior reordering accuracy without any ***** feature engineering *****. | ||
| 2020.emnlp-main.512 In this work, we propose an end-to-end online framework for conversation disentanglement that avoids time-consuming domain-specific ***** feature engineering *****. | ||
| multilingual named entity | 28 | |
| W17-1413 In the paper we present an adaptation of Liner2 framework to solve the BSNLP 2017 shared task on ***** multilingual named entity ***** recognition. | ||
| W19-3712 Our paper addresses the problem of ***** multilingual named entity ***** recognition on the material of 4 languages: Russian, Bulgarian, Czech and Polish. | ||
| W19-3713 In this paper we tackle ***** multilingual named entity ***** recognition task. | ||
| W19-3711 This paper presents our participation at the shared task on ***** multilingual named entity ***** recognition at BSNLP2019. | ||
| W19-3710 This paper describes the Cognitive Computation (CogComp) Group's submissions to the ***** multilingual named entity ***** recognition shared task at the Balto-Slavic Natural Language Processing (BSNLP) Workshop. | ||
| emotion intensity | 28 | |
| S18-1001 The individual tasks are: 1. ***** emotion intensity ***** regression, 2. ***** emotion intensity ***** ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification. | ||
| D19-1408 Existing methods based on supervised learning require a large amount of well-labelled training data, which is difficult to obtain due to inconsistent perception of fine-grained ***** emotion intensity *****. | ||
| S18-1028 Our system reaches average Pearson correlation score of 0.722 (ranked 12/48) in ***** emotion intensity ***** regression task, and 0.810 in valence regression task (ranked 15/38). | ||
| 2021.emnlp-main.781 Overall, we find that linguistic models carry substantial potential for inducing fine-grained ***** emotion intensity ***** scores, showing a far higher correlation with human ground truth ratings than state-of-the-art emotion lexicons based on labeled data. | ||
| W17-5225 In this work, we have used LIWC psycholinguistic categories to train regression models and predict ***** emotion intensity ***** in tweets for the EmoInt-2017 task. | ||
| text style | 28 | |
| 2021.emnlp-main.730 In this paper, we explore Non-AutoRegressive (NAR) decoding for unsupervised ***** text style ***** transfer. | ||
| R19-1098 We propose a simple unsupervised method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different ***** text style *****s, such as news articles and scientific papers. | ||
| 2021.ranlp-1.64 Through extensive experiments on two popular ***** text style ***** transfer tasks, we show that our proposed method significantly outperforms twelve state-of-the-art methods. | ||
| D19-1499 In this paper, we first propose a semi-supervised ***** text style ***** transfer model that combines the small-scale parallel data with the large-scale nonparallel data. | ||
| 2021.emnlp-main.729 In this paper, we propose a collaborative learning framework for unsupervised ***** text style ***** transfer using a pair of bidirectional decoders, one decoding from left to right while the other decoding from right to left. | ||
| similar language | 28 | |
| 2020.semeval-1.270 For Danish, we explore the possibility of fine-tuning a model pre-trained on a ***** similar language *****, Swedish, and additionally also cross-lingual training together with English. | ||
| D19-1076 An important motivation is to support lower resourced languages, however, most efforts focus on demonstrating the effectiveness of the techniques using embeddings derived from ***** similar language *****s to English with large parallel content. | ||
| 2021.nlp4convai-1.26 This increase in usage of code-mixed language has prompted dialog systems in a ***** similar language *****. | ||
| 2020.loresmt-1.4 We further noted that, although translation between ***** similar language *****s is no cakewalk, linguistically distinct languages require more data to give better results. | ||
| 2021.ranlp-srw.2 This increase in usage of code-mixed language has prompted dialog systems in a ***** similar language *****. | ||
| large language | 28 | |
| W18-6315 While Phrase-Based MT can seamlessly integrate very ***** large language ***** models trained on billions of sentences, the best option for Neural MT developers seems to be the generation of artificial parallel data through back-translation - a technique that fails to fully take advantage of existing datasets. | ||
| 2021.naacl-main.10 We analyze if ***** large language ***** models are able to predict patterns of human reading behavior. | ||
| 2021.naacl-main.240 Dialogue systems pretrained with ***** large language ***** models generate locally coherent responses, but lack fine-grained control over responses necessary to achieve specific goals. | ||
| 2021.conll-1.45 Relation classification (sometimes called `extraction') requires trustworthy datasets for fine-tuning ***** large language ***** models, as well as for evaluation. | ||
| 2020.findings-emnlp.340 Evaluation on both tasks shows that modeling preconditions is challenging even for today's ***** large language ***** models (LM). | ||
| parallel sentence extraction | 28 | |
| W17-2512 This paper presents the BUCC 2017 shared task on ***** parallel sentence extraction ***** from comparable corpora. | ||
| L14-1209 Experiments show that our system performs significantly better than the previous studies for both accuracy in ***** parallel sentence extraction ***** and SMT performance. | ||
| C18-1116 Our model also obtained a state-of-the-art result on the German-English dataset of BUCC 2017 shared task on ***** parallel sentence extraction ***** from comparable corpora. | ||
| W17-2508 This article presents the STACCw system for the BUCC 2017 shared task on ***** parallel sentence extraction ***** from comparable corpora. | ||
| 2013.iwslt-evaluation.25 A number of techniques were proposed to deal with these translation tasks, including ***** parallel sentence extraction *****, pre-processing, translation model (TM) optimization, language model (LM) interpolation, turning, and post-processing. | ||
| mental | 28 | |
| P19-1624 Experi***** mental ***** results on the WMT14 English-German and English-French benchmarks show that our model consistently improves performance over the strong Transformer model, demonstrating the necessity and effectiveness of exploiting sentential context for NMT. | ||
| 2021.naacl-main.258 We provide supportive evidence by experi***** mental *****ly confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. | ||
| D19-1566 Experi***** mental ***** results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various existing state-of-the-art systems. | ||
| L12-1393 Our experi***** mental ***** evaluation showed that this approach is promising for applying SMT, even when a source-side parallel corpus is lacking. | ||
| 2020.emnlp-main.459 The experi***** mental ***** results on five datasets sampled from Freebase, NELL and Wikidata show that our method outperforms state-of-the-art baselines. | ||
| implicit discourse relation classification | 28 | |
| E17-2024 The task of *****implicit discourse relation classification***** has received increased attention in recent years, including two CoNNL shared tasks on the topic. | ||
| 2020.lrec-1.145 However, there have been few reports on its application to *****implicit discourse relation classification*****, and it is not clear how BERT is best adapted to the task. | ||
| 2021.unimplicit-1.1 In *****implicit discourse relation classification*****, we want to predict the relation between adjacent sentences in the absence of any overt discourse connectives. | ||
| W19-2703 *****Implicit discourse relation classification***** is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues. | ||
| W19-0416 *****Implicit discourse relation classification***** is one of the most difficult steps in discourse parsing. | ||
| unsupervised dependency | 28 | |
| 2020.emnlp-demos.14 Users can exploit the richness and diversity of these reference type systems for fine-grained supervised typing, in addition, they can choose among and combine four other typing modules: pre-trained real-world models, *****unsupervised dependency*****-based typing, knowledge base lookups, and constraint-based candidate consolidation. | ||
| P19-1526 Most of the *****unsupervised dependency***** parsers are based on probabilistic generative models that learn the joint distribution of the given sentence and its parse. | ||
| 2020.coling-main.227 *****Unsupervised dependency***** parsing aims to learn a dependency parser from sentences that have no annotation of their correct parse trees. | ||
| 2020.coling-main.347 Most of the *****unsupervised dependency***** parsers are based on first-order probabilistic generative models that only consider local parent-child information. | ||
| D17-1171 *****Unsupervised dependency***** parsing, which tries to discover linguistic dependency structures from unannotated data, is a very challenging task. | ||
| Word Sense Disambiguation (WSD | 28 | |
| 2021.emnlp-main.610 *****Word Sense Disambiguation (WSD*****) aims to automatically identify the exact meaning of one word according to its context. | ||
| R19-1135 This paper presents a novel algorithm for *****Word Sense Disambiguation (WSD*****) based on Quantum Probability Theory. | ||
| D18-1170 The goal of *****Word Sense Disambiguation (WSD*****) is to identify the correct meaning of a word in the particular context. | ||
| 2020.emnlp-main.504 Contextual embeddings are proved to be overwhelmingly effective to the task of *****Word Sense Disambiguation (WSD*****) compared with other sense representation techniques. | ||
| 2016.gwc-1.8 Supervised methods for *****Word Sense Disambiguation (WSD*****) benefit from high-quality sense-annotated resources, which are lacking for many languages less common than English. | ||
| Princeton | 28 | |
| 2006.bcs-1.2 Arabic WordNet is a lexical resource for Modern Standard Arabic based on the widely used *****Princeton***** WordNet for English (Fellbaum, 1998). | ||
| 2019.gwc-1.14 In this article, we tackle the issue of the limited quantity of manually sense annotated corpora for the task of word sense disambiguation, by exploiting the semantic relationships between senses such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of *****Princeton***** WordNet, and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. | ||
| 2020.lrec-1.368 We present the parallel creation of a WordNet resource for Swedish and Bulgarian which is tightly aligned with the *****Princeton***** WordNet. | ||
| 2021.gwc-1.13 We present here the results of a morphosemantic analysis of the verb-noun pairs in the *****Princeton***** WordNet as reflected in the standoff file containing pairs annotated with a set of 14 semantic relations. | ||
| 2018.gwc-1.7 Such a rich language resource like *****Princeton***** WordNet, containing linguistic information of different types (semantic, lexical, syntactic, derivational, dialectal, etc. | ||
| web-based | 28 | |
| L16-1095 This paper presents CATaLog online, a new *****web-based***** MT and TM post-editing tool. | ||
| L10-1484 FrameSQL is a *****web-based***** application which the author (Sato, 2003; Sato 2008) created originally for searching the Berkeley FrameNet lexical database. | ||
| 2021.emnlp-demo.19 We present UMR-Writer, a *****web-based***** application for annotating Uniform Meaning Representations (UMR), a graph-based, cross-linguistically applicable semantic representation developed recently to support the development of interpretable natural language applications that require deep semantic analysis of texts. | ||
| L12-1151 We present a *****web-based***** tool for retrieving and annotating audio fragments of e.g. | ||
| C16-2056 We present PolyglotIE, a *****web-based***** tool for developing extractors that perform Information Extraction (IE) over multilingual data. | ||
| bot | 27 | |
| S19-1009 Chat***** bot *****s (i.e., ***** bot *****s) are becoming widely used in multiple domains, along with supporting ***** bot ***** programming platforms. | ||
| 2020.latechclfl-1.16 We propose to demonstrate different outputs of our implementation (a Web site, a Twitter ***** bot ***** and a specifically developed device, called `La Boîte à poésie') based on a corpus of 19th century French poetry. | ||
| 2021.nlpmc-1.3 We compare different ensemble configurations and we show that the combination of the three ***** bot *****s (i) provides a better basis for collecting information than just the information seeking ***** bot ***** and (ii) collects information in a more user-friendly, more efficient manner than an ensemble model combining the information seeking and the social ***** bot *****. | ||
| R17-1035 Chat ***** bot ***** finds answers which are not only relevant by topic but also match the question by style, argumentation patterns, communication means, experience level and other attributes. | ||
| W19-4317 To answer this question, we focus on ***** bot ***** detection in Twitter as our evaluation task and test the performance of fine-tuning approaches based on language models against popular neural architectures such as LSTM and CNN combined with pre-trained and contextualized embeddings | ||
| MAP | 27 | |
| L12-1578 The competition website featured a leader board that displayed the top score for each participant, ranked according to the principal contest metric - mean average precision (***** MAP *****). | ||
| 2021.emnlp-main.185 Evaluation results show that our approach significantly outperforms the baselines across all three datasets in terms of ***** MAP ***** and Spearman's correlation measures, demonstrating its effectiveness. | ||
| 2010.iwslt-evaluation.16 Specifically, we focus on 1) cross-domain translation using ***** MAP ***** adaptation, 2) Turkish morphological processing and translation, 3) improved Arabic morphology for MT preprocessing, and 4) system combination methods for machine translation. | ||
| P19-1421 Our experiments on the four datasets from Coursera and XuetangX show that the proposed method achieves significant improvements(+0.19 by ***** MAP *****) over existing methods. | ||
| D19-5310 Our system secured 2nd rank in the task with a mean average precision (***** MAP *****) of 41.3% on the test set | ||
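Several of the contexts above report mean average precision (MAP) as the ranking metric. As an illustrative aside, here is a minimal sketch of how MAP is computed over ranked retrieval results; the function names and toy data are ours, not taken from any of the cited systems:

```python
def average_precision(ranked, relevant):
    """AP for one query: mean of precision@k at each rank k where a
    relevant item appears, divided by the total number of relevant items."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP: average of per-query AP. `queries` is a list of
    (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)
```

For example, a ranking `["d1", "d2", "d3", "d4"]` with relevant set `{"d1", "d3"}` yields AP = (1/1 + 2/3) / 2 ≈ 0.833.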
| suffix | 27 | |
| P19-1528 An important special case of this problem is computing the probability of a string appearing as a prefix, ***** suffix *****, or infix. | ||
| 2010.amta-papers.2 We use ***** suffix ***** arrays to detect exact n-gram matches, A* search heuristics to discard matches and A* parsing to validate candidate segments. | ||
| C16-3006 The focus of this tutorial will be efficient text processing utilising space efficient representations of ***** suffix ***** arrays, ***** suffix ***** trees and searchable integer compression schemes with specific applications of succinct data structures to common NLP tasks such as n-gram language modelling. | ||
| S19-2232 We implemented a deep-affix based LSTM-CRF NER model for task 1, which utilizes only character, word, prefix and ***** suffix ***** information for the identification of geolocation entities. | ||
| Q16-1034 We propose a language model based on compressed ***** suffix ***** trees, a representation that is highly compact and can be easily held in memory, while supporting queries needed in computing language model probabilities on-the-fly | ||
| thesauri | 27 | |
| L16-1588 In this paper, we address the problem of building and evaluating such ***** thesauri ***** with the help of Information Retrieval (IR) concepts. | ||
| L06-1378 In this paper we discuss the problem of sense disambiguation using lexical resources like ontologies or ***** thesauri ***** with a focus on the application of sense detection and merging methods in information retrieval systems. | ||
| 2016.gwc-1.25 Collaboratively created lexical resources is a trending approach to creating high quality ***** thesauri ***** in a short time span at a remarkably low price. | ||
| L16-1387 We conduct two experiments involving different ***** thesauri ***** in different languages. | ||
| L06-1408 In this paper, we propose a corpus-based approach to the construction of a Pan-Chinese lexical resource, starting out with the aim to enrich existing Chinese ***** thesauri ***** in the Pan-Chinese context | ||
| encyclopedic | 27 | |
| S19-1010 Our study yields some unexpected findings, e.g., that biases can be emphasized or downplayed by different embedding models or that user-generated content may be less biased than ***** encyclopedic ***** text. | ||
| C16-2024 We demonstrate the applicability of our system in English and German language for ***** encyclopedic ***** or medical text. | ||
| 2020.wanlp-1.17 We measure the presence of biases across several dimensions, namely: embedding models (Skip-Gram, CBOW, and FastText) and vector sizes, types of text (***** encyclopedic ***** text, and news vs. user-generated content), dialects (Egyptian Arabic vs. Modern Standard Arabic), and time (diachronic analyses over corpora from different time periods). | ||
| 2020.lrec-1.268 We describe the creation of such a multidisciplinary corpus and highlight the obtained findings in terms of the following features: 1) a generic conceptual formalism for scientific entities in a multidisciplinary scientific context; 2) the feasibility of the domain-independent human annotation of scientific entities under such a generic formalism; 3) a performance benchmark obtainable for automatic extraction of multidisciplinary scientific entities using BERT-based neural models; 4) a delineated 3-step entity resolution procedure for human annotation of the scientific entities via ***** encyclopedic ***** entity linking and lexicographic word sense disambiguation; and 5) human evaluations of Babelfy returned ***** encyclopedic ***** links and lexicographic senses for our entities. | ||
| N19-1362 Reasoning about implied relationships (e.g. paraphrastic, common sense, ***** encyclopedic *****) between pairs of words is crucial for many cross-sentence inference problems | ||
| phonetically | 27 | |
| L10-1295 For the annotation of the anaphoric links the corpus takes into account specific phenomena of the Italian language like incorporated clitics and ***** phonetically ***** non realized pronouns. | ||
| 2021.wnut-1.10 Recent work proposed to evaluate language models by using them to classify ground truth sentences among alternative ***** phonetically ***** similar sentences generated by a finite state transducer. | ||
| L04-1168 Experimental results reveal that the database is large and ***** phonetically ***** rich enough to get great acoustic models to be integrated in Continuous Speech Recognition Systems. | ||
| L14-1257 The speech recognition corpus is ***** phonetically ***** balanced and ***** phonetically ***** rich, and the paper also describes the methodology by which the phonetic balance was assessed. | ||
| 2020.lrec-1.874 Specifically, SHR++ uses Sanskrit Heritage Reader, a lexicon driven shallow parser for enumerating all the ***** phonetically ***** and lexically valid word splits along with their morphological analyses for a given string | ||
| Duolingo | 27 | |
| W18-0543 The ***** Duolingo ***** Shared Task on Second Language Acquisition Modeling provides students' trace data that we extensively analyze and engineer features from for the task of predicting whether a student will correctly solve a vocabulary exercise. | ||
| W18-0506 We describe a large corpus of more than 7M words produced by more than 6k learners of English, Spanish, and French using ***** Duolingo *****, a popular online language-learning app. | ||
| 2020.ngt-1.20 We describe our submission to the 2020 ***** Duolingo ***** Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE). | ||
| 2020.ngt-1.14 Experiments on a corpus constructed out of the public dataset from ***** Duolingo *****, containing some 4 million pairs of sentences, found that gFCONV is a consistent winner over c-VAE though both suffered heavily from a low recall. | ||
| W18-0524 SLAM 2018 focuses on predicting a student's mistake while using the ***** Duolingo ***** application | ||
| DistilBERT | 27 | |
| 2020.findings-emnlp.286 Our proposed method is also orthogonal to existing compact pretrained language models such as ***** DistilBERT ***** using knowledge distillation, since a further 1.79x average compression rate can be achieved on top of ***** DistilBERT ***** with zero or minor accuracy degradation. | ||
| 2020.wnut-1.56 ***** DistilBERT ***** achieves a F1 score of 0.7508 on the test set, which is the best of our submissions. | ||
| 2020.emnlp-main.493 We validate our Neural Mask Generator (NMG) on several question answering and text classification datasets using BERT and ***** DistilBERT ***** as the language models, on which it outperforms rule-based masking strategies, by automatically learning optimal adaptive maskings. | ||
| 2021.naacl-main.189 In this paper, we investigate gender and racial bias across ubiquitous pre-trained language models, including GPT-2, XLNet, BERT, RoBERTa, ALBERT and ***** DistilBERT *****. | ||
| 2021.sigdial-1.7 Unlike typical dialog models that rely on huge, complex neural network architectures and large-scale pre-trained Transformers to achieve state-of-the-art results, our method achieves comparable results to BERT and even outperforms its smaller variant ***** DistilBERT ***** on conversational slot extraction tasks | ||
| Slovenian | 27 | |
| W19-4013 The annotation campaign, specific in terms of setting, subjectivity and the multifunctionality of items under investigation, resulted in a preliminary lexicon of formulaic sequences in spoken ***** Slovenian ***** with immediate potential for future explorations in formulaic language research. | ||
| 2020.lrec-1.501 We present a collection of such datasets for the word analogy task in nine languages: Croatian, English, Estonian, Finnish, Latvian, Lithuanian, Russian, ***** Slovenian *****, and Swedish. | ||
| W19-3704 We present experiments on ***** Slovenian *****, Croatian and Serbian morphosyntactic annotation and lemmatisation between the former state-of-the-art for these three languages and one of the best performing systems at the CoNLL 2018 shared task, the Stanford NLP neural pipeline. | ||
| L10-1160 The database consists of recordings of 10 ***** Slovenian ***** native speakers. | ||
| L06-1153 The paper presents the Turdis database of spontaneous conversations in the tourist domain in the *****Slovenian***** language. | ||
| plausibility | 27 | |
| 2020.coling-main.354 In this paper, we contribute to these research questions with a number of experiments that systematically probe different lexical semantics theories for their levels of cognitive ***** plausibility ***** and of technological usefulness. | ||
| S17-2058 We proposed a method that combines neural similarity features and hand-crafted comment ***** plausibility ***** features, and we modeled inter-comments relationship using conditional random field. | ||
| W18-2801 We address this task, with competitive results, by using instead a semantic network to encode lexical semantics, thus providing further evidence for the cognitive ***** plausibility ***** of this approach to model lexical meaning. | ||
| 2020.findings-emnlp.390 We argue that these evaluations are insufficient, since they fail to indicate whether explanations support actual model behavior (faithfulness), rather than simply match what a human would say (***** plausibility *****). | ||
| E17-4006 We demonstrate the cognitive ***** plausibility ***** of the model by running it on experimental items and simulating antecedent choice and reading times of human participants | ||
| inter | 27 | |
| 2021.naacl-main.31 While cross-lingual techniques are finding increasing success in a wide range of Natural Language Processing tasks, their application to Semantic Role Labeling (SRL) has been strongly limited by the fact that each language adopts its own linguistic formalism, from PropBank for English to AnCora for Spanish and PDT-Vallex for Czech, ***** inter ***** alia. | ||
| S17-1013 Collecting spontaneous speech corpora that are open-ended, yet topically constrained, is increasingly popular for research in spoken dialogue systems and speaker state, ***** inter ***** alia. | ||
| L10-1213 The suite can be used as regression and performance evaluations both intra-c-rater® or ***** inter ***** automatic content scoring technologies. | ||
| P19-1423 The graph is constructed using various ***** inter *****- and intra-sentence dependencies to capture local and non-local dependency information. | ||
| L10-1299 The corpus consists of videos, transcripts, and annotations of the ***** inter *****action between a naive speaker and a confederate listener | ||
| resourced | 27 | |
| 2021.dravidianlangtech-1.51 Many publicly available corpora are there for research on identifying offensive text written in English language but rare for low ***** resourced ***** languages like Tamil. | ||
| 2020.lt4hala-1.15 While machine learning approaches to sequence modeling can be applied to solve the task, they typically face a severed skewness in the availability of training material, especially for lesser ***** resourced ***** languages. | ||
| L10-1063 A particular point is made for using HMM (HTS) synthesis in this case, as it seems to be very appropriate for less ***** resourced ***** languages. | ||
| L14-1231 Portuguese is a less ***** resourced ***** language in what concerns foreign language learning. | ||
| D19-1076 We show that our technique outperforms the state-of-the-art in lower ***** resourced ***** settings with an average of 3.7% improvement of precision @10 across 14 mostly low ***** resourced ***** languages | ||
| expressivity | 27 | |
| L14-1196 The indices are related to popularity, ***** expressivity ***** and singularity. | ||
| L08-1529 These usages range from unit selection speech synthesis to statistical modeling of speech phenomena like prosody or ***** expressivity *****. | ||
| L14-1581 The currently available databases do not contain full body ***** expressivity ***** and interaction patterns via avatars. | ||
| L14-1041 In this paper we propose MAPLE (MAPping Architecture based on Linguistic Evidences), an architecture and software platform that semi-automatically solves this configuration problem, by reasoning on metadata about the linguistic ***** expressivity ***** of the input ontologies, the available mediators and other components relevant to the mediation task. | ||
| K18-1021 We show that projecting the two languages onto a third, latent space, rather than directly onto each other, while equivalent in terms of ***** expressivity *****, makes it easier to learn approximate alignments | ||
| federated | 27 | |
| 2021.emnlp-main.606 To tackle these problems, we propose SEFL, a secure and efficient ***** federated ***** learning framework that (1) eliminates the need for the trusted entities; (2) achieves similar and even better model accuracy compared with existing FL designs; (3) is resilient to client dropouts. | ||
| 2020.emnlp-main.165 In this paper, we propose a privacy-preserving medical relation extraction model based on ***** federated ***** learning, which enables training a central model with no single piece of private local data being shared or exchanged. | ||
| 2020.coling-main.310 Experiments on two SLU benchmark datasets, including two tasks (intention detection and slot filling) and ***** federated ***** learning settings (horizontal ***** federated ***** learning, vertical ***** federated ***** learning and ***** federated ***** transfer learning), demonstrate the effectiveness and universality of our approach. | ||
| 2020.lrec-1.417 Data infrastructures such as CLARIN have recently embarked on the emerging frameworks for the federation of infrastructural services, such as the European Open Science Cloud and the integration of services resulting from multidisciplinary collaboration in ***** federated ***** services for the wider SSH domain | ||
| K19-1012 We propose algorithms to train production-quality n-gram language models using *****federated***** learning. | ||
| inducing | 27 | |
| R19-1090 Building representative linguistic resources and NLP tools for non-standardized languages is challenging: when spelling is not determined by a norm, multiple written forms can be encountered for a given word, ***** inducing ***** a large proportion of out-of-vocabulary words. | ||
| 2020.acl-main.73 With the help of autoencoding variational Bayes, our model improves data scalability and achieves competitive performance when ***** inducing ***** latent topics and tree structures, as compared to a prior tree-structured topic model (Blei et al., 2010). | ||
| 2020.findings-emnlp.291 We then analyze two scenarios: 1) ***** inducing ***** negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics. | ||
| 2021.acl-long.559 While previous unsupervised parsing methods mostly focus on only ***** inducing ***** one class of grammars, we introduce a novel model, StructFormer, that can induce dependency and constituency structure at the same time. | ||
| 2020.emnlp-main.168 Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity, meanwhile ***** inducing ***** accurate syntactic phrases | ||
| Pointer | 27 | |
| 2021.acl-long.307 By introducing three novel components: ***** Pointer *****, Disambiguator, and Copier, our method PDC achieves the following merits inherently compared with previous efforts: (1) ***** Pointer ***** leverages the semantic information from bilingual dictionaries, for the first time, to better locate source words whose translation in dictionaries can potentially be used; (2) Disambiguator synthesizes contextual information from the source view and the target view, both of which contribute to distinguishing the proper translation of a specific source word from multiple candidates in dictionaries; (3) Copier systematically connects ***** Pointer ***** and Disambiguator based on a hierarchical copy mechanism seamlessly integrated with Transformer, thereby building an end-to-end architecture that could avoid error propagation problems in alternative pipe-line methods. | ||
| D17-1143 Specifically, we propose a novel architecture that applies ***** Pointer ***** Network sequence-to-sequence attention modeling to structural prediction in discourse parsing tasks. | ||
| 2021.emnlp-main.825 To that end, we develop a ***** Pointer ***** Network capable of accurately generating the continuous token arrangement for a given input sentence and define a bijective function to recover the original order | ||
| D19-1390 *****Pointer***** Generators have been the de facto standard for modern summarization systems. | ||
| 2020.acl-main.629 Transition-based parsers implemented with *****Pointer***** Networks have become the new state of the art in dependency parsing, excelling in producing labelled syntactic trees and outperforming graph-based models in this task. | ||
| transitivity | 27 | |
| L08-1579 In this paper we discuss a solution to derive a bilingual dictionary by ***** transitivity ***** using existing ones and to check the generated translations in a parallel corpus. | ||
| D18-1495 This assumption is very strong when documents are long with rich topic information and do not exhibit the ***** transitivity ***** of biterms. | ||
| 2020.findings-emnlp.426 We construct a paraphrase graph from the provided sentence pair labels, and create an augmented dataset by directly inferring labels from the original sentence pairs using a ***** transitivity ***** property. | ||
| W17-0702 How do children learn a verb's argument structure when their input contains nonbasic clauses that obscure verb ***** transitivity *****? | ||
| 1998.amta-papers.40 It suggests that the ***** transitivity ***** difference is best treated with verb entries containing information of the causal relation of the expressed events | ||
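The first context above (L08-1579) derives a bilingual dictionary by ***** transitivity *****: if A translates to B and B translates to C, then A is a candidate translation of C. A minimal sketch of that pivoting step, with hypothetical function and variable names of our choosing (real systems additionally filter the noisy candidates, e.g. against a parallel corpus as in L08-1579):

```python
def compose_dictionaries(ab, bc):
    """Derive an A-to-C dictionary by pivoting through language B.
    `ab` and `bc` map each source word to a set of candidate translations."""
    ac = {}
    for a, bs in ab.items():
        # Union the C-side translations of every B-side pivot word.
        cs = set()
        for b in bs:
            cs |= bc.get(b, set())
        if cs:
            ac[a] = cs
    return ac
```

For example, composing a French-English entry `{"chien": {"dog"}}` with an English-Spanish entry `{"dog": {"perro"}}` yields `{"chien": {"perro"}}`.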
| monotonicity | 27 | |
| 2020.acl-main.543 We consider four aspects of ***** monotonicity ***** inferences and test whether the models can systematically interpret lexical and logical phenomena on different training/test splits. | ||
| S19-1027 We also find that the improvement is better on ***** monotonicity ***** inferences with lexical replacements than on downward inferences with disjunction and modification. | ||
| 2021.iwcs-1.12 However, there is hardly any work that connects dependency parsing to ***** monotonicity *****, which is an essential part of logic and linguistic semantics. | ||
| P19-1148 In this work, we ask the following question: Is ***** monotonicity ***** really a helpful inductive bias in these tasks? | ||
| 2021.naacl-main.354 General ***** monotonicity ***** does not benefit transformer multihead attention, however, we see isolated improvements when only a subset of heads is biased towards monotonic behavior | ||
| analogy | 27 | |
| 2010.jeptalnrecital-court.29 We give a special focus on objects structured as ordered and labeled trees, with an original definition of ***** analogy ***** based on optimal alignment. | ||
| 2020.lrec-1.315 Experiments show our model's performance on word ***** analogy ***** tasks, illustrating the divergent objectives of morphological vs. semantic analogies. | ||
| D17-1023 Comprehensive experiments are conducted on word ***** analogy ***** and similarity tasks. | ||
| 2021.acl-long.398 Especially, our model achieves the highest accuracy on ***** analogy ***** tasks in different language levels and significantly improves the performance on downstream tasks in the GLUE benchmark and a question answering dataset. | ||
| P19-1391 In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in ***** analogy ***** to the well-established English ISEAR emotion dataset | ||
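The word ***** analogy ***** task mentioned in several contexts above ("a is to b as c is to ?") is commonly evaluated with the vector-offset method over word embeddings. A minimal sketch under that standard formulation, using a toy hand-built embedding table rather than any of the cited models:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two dense vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def solve_analogy(emb, a, b, c):
    """Answer 'a : b :: c : ?' by forming the offset vector b - a + c and
    returning the most cosine-similar word other than a, b, c."""
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))
```

With suitable embeddings, `solve_analogy(emb, "man", "woman", "king")` is expected to return `"queen"`, the classic test case for this method.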
| toponym | 27 | |
| W19-9003 Additionally, we present the results of preliminary experiments on integrating a small amount of crowdsourced annotations to improve overall performance of ***** toponym ***** recognition in our heritage corpus. | ||
| R19-1106 Following this classification task, we use a string matching algorithm with a gazetteer to identify the exact index of a ***** toponym ***** within the sentence. | ||
| S19-2155 We present the SemEval-2019 Task 12 which focuses on ***** toponym ***** resolution in scientific articles. | ||
| S19-2229 Our systems achieved 83.03% strict macro F1, 74.50 strict micro F1, 85.92 overlap macro F1 and 78.47 overlap micro F1 in ***** toponym ***** detection subtask | ||
| S19-2230 The SemEval-2019 Task 12 is *****toponym***** resolution in scientific papers. | ||
| AMR parsing | 27 | |
| P18-1171 We evaluate our neural transition model on the ***** AMR parsing ***** task, and our parser outperforms other sequence-to-sequence approaches and achieves competitive results in comparison with the best-performing models. | ||
| 2021.starsem-1.20 One of the challenges we find in ***** AMR parsing ***** is to capture the structure of complex sentences which expresses the relation between predicates. | ||
| 2021.acl-long.73 We hope that knowledge gained while learning for English ***** AMR parsing ***** and text generation can be transferred to the counterparts of other languages. | ||
| 2021.emnlp-main.507 We provide a detailed comparison with recent progress in ***** AMR parsing ***** and show that the proposed parser retains the desirable properties of previous transition-based approaches, while being simpler and reaching the new parsing state of the art for AMR 2.0, without the need for graph re-categorization. | ||
| N18-1106 We make our data and code freely available for further research on ***** AMR parsing ***** and generation, and the relationship of AMR to syntax | ||
| negated | 27 | |
| W17-1808 The Dice coefficient for inter-annotator agreement is higher than 0.94 for negation markers and higher than 0.72 for ***** negated ***** events. | ||
| C18-1191 Our error analysis indicates that an approach that takes the information structure into account (i.e. which information is new or contrastive) may be promising, which requires looking beyond the syntactic and semantic characteristics of ***** negated ***** statements. | ||
| S18-1045 As part of the word embedding training, we also learn the distributed representations of multi-word expressions (MWEs) and ***** negated ***** forms of words. | ||
| 2021.naacl-main.346 Moreover, these ***** negated ***** or contradictory statements shift the commonsense implications of the original premise in interesting and nontrivial ways | ||
| D19-1230 Negation is a universal but complicated linguistic phenomenon, which has received considerable attention from the NLP community over the last decade, since a *****negated***** statement often carries both an explicit negative focus and implicit positive meanings. | ||
| unsegmented | 27 | |
| N19-1324 Our method does not depend on word segmentation and any human-annotated resources (e.g., word dictionaries), yet it is very effective for noisy corpora written in ***** unsegmented ***** languages such as Chinese and Japanese. | ||
| D17-1112 We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from ***** unsegmented ***** input. | ||
| I17-1094 We present a method for automatically extracting pairs of a variant word and its normal form from ***** unsegmented ***** text on the basis of a pair-wise similarity approach. | ||
| P19-1158 For ***** unsegmented ***** languages such as Japanese and Chinese, tokenization of a sentence has a significant impact on the performance of text classification. | ||
| D17-1080 To avoid these problems in learning word vectors of ***** unsegmented ***** languages, we consider word co-occurrence statistics over all possible candidates of segmentations based on frequent character n-grams instead of segmented sentences provided by conventional word segmenters | ||
| rumor | 27 | |
| S19-2195 The proposed system in this paper achieved 0.435 F-Macro in stance classification, and 0.262 F-macro and 0.801 RMSE in ***** rumor ***** verification tasks in Task7 of SemEval 2019. | ||
| P19-1498 Our Tree LSTM models employ multi-task (stance + ***** rumor *****) learning and propagate the useful stance signal up in the tree for ***** rumor ***** classification at the root node. | ||
| S19-2148 Task A is to determine a user's stance towards the source ***** rumor *****, and Task B is to detect the veracity of the ***** rumor *****: true, false or unverified. | ||
| 2021.ranlp-1.147 The deception in the text can be of different forms in different domains, including fake news, ***** rumor ***** tweets, and spam emails. | ||
| 2020.coling-main.561 Previous work for *****rumor***** resolution concentrates on exploiting time-series characteristics or modeling topology structure separately. | ||
| opinionated | 27 | |
| 2020.lrec-1.71 We introduce as a baseline an end-to-end trained self-attention decoder model trained on this data and show that it is able to generate ***** opinionated ***** responses that are judged to be natural and knowledgeable and show attentiveness. | ||
| L06-1120 This paper defines the annotations for ***** opinionated ***** materials. | ||
| 2021.alta-1.9 Despite the advancement in summarisation models, evaluation metrics for ***** opinionated ***** text summaries lag behind and still rely on lexical-matching metrics such as ROUGE. | ||
| 2021.emnlp-main.726 In this work, we introduce the Aspect Sentiment Quad Prediction (ASQP) task, aiming to jointly detect all sentiment elements in quads for a given ***** opinionated ***** sentence, which can reveal a more comprehensive and complete aspect-level sentiment structure | ||
| 2020.findings-emnlp.146 Classifying and resolving coreferences of objects (e.g., product names) and attributes (e.g., product aspects) in *****opinionated***** reviews is crucial for improving the opinion mining performance. | ||
| Authorship | 27 | |
| N19-1068 ***** Authorship ***** verification is the problem of inferring whether two texts were written by the same author. | ||
| L12-1090 *****Authorship***** verification is the task of, given a document and a candidate author, determining whether or not the document was written by the candidate author. | ||
| C18-1238 *****Authorship***** attribution typically uses all information representing both content and style, whereas attribution based only on stylistic aspects may be robust in cross-domain settings. | ||
| W17-2401 *****Authorship***** attribution is a natural language processing task that has been widely studied, often by considering small order statistics. | ||
| 2020.acl-main.203 *****Authorship***** attribution aims to identify the author of a text based on stylometric analysis. | ||
| perspectives | 27 | |
| 2021.naacl-main.344 We argue that identifying and abstracting such natural language ***** perspectives ***** from editorials is a crucial step toward studying the implicit argumentation structure in news editorials. | ||
| L10-1465 Finally, we shed light on the ***** perspectives ***** of the given work clearly outlining the challenges. | ||
| 2020.lrec-1.173 This large corpus of more than 380k annotated messages opens ***** perspectives ***** for online abuse detection and especially for context-based approaches. | ||
| N19-1053 We construct PERSPECTRUM, a dataset of claims, ***** perspectives ***** and evidence, making use of online debate websites to create the initial data collection, and augmenting it using search engines in order to expand and diversify our dataset. | ||
| D17-1165 Words associated with topics or ***** perspectives ***** follow different generative routes | ||
| treebank annotation | 27 | |
| L10-1461 Consistency in ***** treebank annotation ***** is a must for making data as error-free as possible and for providing quality assurance. | ||
| L10-1512 It is observed from the results that this semi-automated approach when carried out with experienced and trained human annotators improves the overall quality of ***** treebank annotation ***** and also speeds up the process. | ||
| L08-1213 The testsuite provides a well thought-out error classification, which enables us to compare parser output for parsers trained on treebanks with different encoding schemes and provides interesting insights into the impact of ***** treebank annotation ***** schemes on specific constructions like PP attachment or non-constituent coordination. | ||
| L08-1361 This is due to some aspects of the ***** treebank annotation ***** that to our knowledge have never before been published. | ||
| L16-1368 The cross-lingually and cross-theoretically focused survey is intended as an aid to accessing treebanks and an aid for further work on ***** treebank annotation ***** | ||
| difficulty | 27 | |
| 2021.emnlp-main.281 In SEC, the data from different language learners are naturally distributed at different ***** difficulty ***** levels (some errors made by beginners are obvious to correct while some made by fluent speakers are hard), and we expect that designing a curriculum correspondingly for model learning may also help its training and bring about better performance. | ||
| 2021.wassa-1.13 Our axes of analysis include Task ***** difficulty ***** on CL, comparing CL pacing techniques, and qualitative analysis by visualizing the movement of attention scores in the model as curriculum phases progress. | ||
| 2020.bea-1.20 In addition, ***** difficulty ***** was best predicted using signal from the item stem (the description of the clinical case), while all parts of the item were important for predicting the response time. | ||
| N19-1132 However, the evaluation remains incomplete because the task ***** difficulty ***** varies depending on the test corpus and conditions such as the proficiency levels of the writers and essay topics. | ||
| 2021.eval4nlp-1.1 Comparing systems along these ***** difficulty ***** bins enables us to produce a finer-grained analysis of their relative merits, which we illustrate on two use-cases: a comparison of systems participating in a multi-label text classification task (CLEF eHealth 2018 ICD-10 coding), and a comparison of neural models trained for biomedical entity detection (BioCreative V chemical-disease relations dataset). | ||
| frames | 27 | |
| W16-3809 This paper proposes a type system for ***** frames ***** that shows whether two ***** frames ***** are variants of a given alternation. | ||
| L14-1386 About half of them correspond to ***** frames ***** already described in FrameNet; some new ***** frames ***** were also defined and part of these might be specific to the field of the environment. | ||
| L16-1601 Given full coverage is not reachable for a relatively “new” FrameNet project, we advocate that focusing on specific notional domains allowed us to obtain full lexical coverage for the ***** frames ***** of these domains, while partially reflecting word sense ambiguities. | ||
| C16-1068 However, the neighboring words, or ***** frames *****, are rarely repeated exactly in the data. | ||
| L10-1378 Second, it can merge word senses that belong to ***** frames ***** related by specified relations. | ||
| variational autoencoders | 27 | |
| P18-1104 We investigate several conditional ***** variational autoencoders ***** training on these conversations, which allow us to use emojis to control the emotion of the generated text. | ||
| W17-4308 Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as ***** variational autoencoders *****. | ||
| D17-1043 Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as ***** variational autoencoders *****. | ||
| D19-1239 In this paper, we introduce a data augmentation approach that leverages ***** variational autoencoders ***** to learn high-quality data distributions from a large unlabeled dataset, and subsequently, to automatically generate a large labeled training set from a small set of labeled samples. | ||
| P17-1061 Unlike past work that has focused on diversifying the output of the decoder from word-level to alleviate this problem, we present a novel framework based on conditional ***** variational autoencoders ***** that capture the discourse-level diversity in the encoder. | ||
| speakers | 27 | |
| L10-1466 Second, we added different levels of annotations, recognition of named entities and annotation of personal information about ***** speakers *****. | ||
| L08-1212 The test results show that every emotion is readily recognized far above chance level for both ***** speakers *****. | ||
| P17-2080 Moreover, the dialog states for both ***** speakers ***** are modeled separately in order to reflect personal features. | ||
| W17-1606 Speakers' dialect and gender was controlled for by using videos uploaded as part of the “accent tag challenge”, where ***** speakers ***** explicitly identify their language background. | ||
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of pretrained language models for emotion recognition in conversations, which is to consider not only previous utterances, but also conversation-related information such as ***** speakers *****, speech acts and topics. | ||
| commonsense validation | 27 | |
| 2020.semeval-1.70 This paper introduces our system for ***** commonsense validation ***** and explanation. | ||
| 2020.semeval-1.52 Intuitively, ***** commonsense validation ***** requires additional knowledge beyond the given statements. | ||
| 2020.semeval-1.79 We propose a reinforcement learning model based on MTL(Multi-Task Learning) to enhance the prediction ability of ***** commonsense validation *****. | ||
| 2020.semeval-1.72 This paper presents the work of the NLP@JUST team at SemEval-2020 Task 4 competition that related to ***** commonsense validation ***** and explanation (ComVE) task. | ||
| 2020.semeval-1.50 From our experiments, we can draw the following three main conclusions: a) Neural language model fully qualified for ***** commonsense validation ***** and explanation. | ||
| recently | 27 | |
| L14-1017 The resultant data has also been ***** recently ***** used in disfluency studies across domains. | ||
| 2006.amta-papers.28 Discriminative training methods have ***** recently ***** led to significant advances in the state of the art of machine translation (MT). | ||
| E17-1019 Moreover, we explore the utilization of the ***** recently ***** proposed Word Mover's Distance (WMD) document metric for the purpose of image captioning. | ||
| C18-1283 The ***** recently ***** increased focus on misinformation has stimulated research in fact checking, the task of assessing the truthfulness of a claim. | ||
| 2021.emnlp-main.351 Pretrained language models (PLM) have ***** recently ***** advanced graph-to-text generation, where the input graph is linearized into a sequence and fed into the PLM to obtain its representation. | ||
| test suite | 27 | |
| 2020.coling-main.269 The size and detail of annotations make the ***** test suite ***** a valuable resource for natural language processing applications with syntactic and semantic tasks. | ||
| 2021.wmt-1.115 We are using a semi-automated ***** test suite ***** in order to provide a fine-grained linguistic evaluation for state-of-the-art machine translation systems. | ||
| 2020.wmt-1.38 Two systems (Tohoku and Huoshan) appear to have significantly better ***** test suite ***** accuracy than the others, although the best system of WMT20 is not significantly better than the one from WMT19 in a macro-average. | ||
| L10-1014 We found that they had twenty features in total and that seven were shared between the two models, suggesting that there is a core of feature types that may be applicable to ***** test suite ***** construction for any similar type of application. | ||
| L16-1453 For German, we use an existing ***** test suite ***** of V-Prt split constructions, while for English, we build a new and comparable ***** test suite ***** from raw data. | ||
| automatic text summarization | 27 | |
| I17-2033 An ***** automatic text summarization ***** system can automatically generate a short and brief summary that contains a main concept of an original document. | ||
| L06-1117 In this paper we present a novel method for ***** automatic text summarization ***** through text extraction, using computational semantics. | ||
| W17-4508 Recent advances in ***** automatic text summarization ***** have used deep neural networks to generate high-quality abstractive summaries, but the performance of these models strongly depends on large amounts of suitable training data. | ||
| L14-1245 We present results from an eye tracking study of ***** automatic text summarization *****. | ||
| 2021.eacl-main.160 Manual evaluation is essential to judge progress on ***** automatic text summarization *****. | ||
| characters | 27 | |
| 2021.acl-long.121 This paper presents a novel Multi-metadata Embedding based Cross-Transformer (MECT) to improve the performance of Chinese NER by fusing the structural information of Chinese ***** characters *****. | ||
| 2020.lrec-1.852 To extract ***** characters ***** from target books, manually created dictionaries of ***** characters ***** are employed because some ***** characters ***** appear as common nouns not as named entities. | ||
| N18-2106 We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between ***** characters ***** and their social relationships. | ||
| W18-5031 In this study, our goal is to collect a large number of question-answer pairs for a particular character by using role play-based question-answering in which multiple users play the roles of certain ***** characters ***** and respond to questions by online users. | ||
| W18-3709 When using inputs as simple as Chinese ***** characters *****, the ensembled system achieves a precision at 86.56% in the detection of erroneous sentences, and a precision at 51.53% in the correction of errors of Selection and Missing types. | ||
| sentiment polarity | 27 | |
| P17-1155 We show that while the scores of n-gram based automatic measures are similar for all interpretation models, SIGN's interpretations are scored higher by humans for adequacy and ***** sentiment polarity *****. | ||
| D19-1342 If a real-world sentiment classification system ignores the existence of conflict opinions when it is designed, it will incorrectly mix conflict opinions into other ***** sentiment polarity ***** categories in action. | ||
| 2021.emnlp-main.362 In the cases of aspect-based sentiment analysis, violation of the above issues may change the aspect and ***** sentiment polarity *****. | ||
| 2020.coling-main.70 Aspect-level sentiment classification (ASC) aims to detect the ***** sentiment polarity ***** of a given opinion target in a sentence. | ||
| S18-1194 The experiments show that the first framework is more effective and ***** sentiment polarity ***** is useful. | ||
| selection | 27 | |
| D19-1236 The review and ***** selection ***** process for scientific paper publication is essential for the quality of scholarly publications in a scientific field. | ||
| N18-1082 In addition, recent studies aiming at solving prepositional attachment and preposition ***** selection ***** problems depend heavily on external linguistic resources and use dataset-specific word representations. | ||
| L10-1434 Our second interest lies in the actual comparison of the models: How does a very simple distributional model compare to much more complex approaches, and which representation of ***** selection *****al preferences is more appropriate, using (i) second-order properties, (ii) an implicit generalisation of nouns (by clusters), or (iii) an explicit generalisation of nouns by WordNet classes within clusters? | ||
| C16-2028 TextPro-AL is a web-based application integrating four components: a machine learning based NLP pipeline, an annotation editor for task definition and text annotations, an incremental re-training procedure based on active learning ***** selection ***** from a large pool of unannotated data, and a graphical visualization of the learning status of the system. | ||
| 2020.scai-1.2 In this paper, we show that question rewriting (QR) of the conversational context allows to shed more light on this phenomenon and also use it to evaluate robustness of different answer ***** selection ***** approaches. | ||
| unstructured text | 27 | |
| C18-2007 This task can be automated by defining similarity among documents which is a nontrivial task since these documents are often stored in an ***** unstructured text ***** format. | ||
| W16-4404 There are some open domain question answering systems, such as IBM Waston, which take the ***** unstructured text ***** data as input, in some ways of humanlike thinking process and a mode of artificial intelligence. | ||
| S19-2222 Suggestion mining task aims to extract tips, advice, and recommendations from ***** unstructured text *****. | ||
| W19-3643 However, many medical reports available in current clinical practice system are not yet ready for analysis using either statistics or machine learning as they are in ***** unstructured text ***** format. | ||
| K18-1013 A NLU embedding model can facilitate analyzing and understanding relationships between ***** unstructured text *****s and their corresponding structured semantic knowledge, essential for both researchers and practitioners of NLU. | ||
| weakly supervised | 27 | |
| 2020.findings-emnlp.185 In addition, we propose a ***** weakly supervised ***** pretraining, where labels for text classification are obtained automatically from an existing approach. | ||
| Q13-1016 We further introduce a ***** weakly supervised ***** training procedure that estimates LSP's parameters using annotated referents for entire statements, without annotated referents for individual words or the parse structure of the statement. | ||
| 2021.dialdoc-1.10 We can leverage these signals to generate the ***** weakly supervised ***** training data for learning dialog policy and reward estimator, and make the policy take actions (generates responses) which can foresee the future direction for a successful (rewarding) conversation. | ||
| L16-1684 Although methods using resources from related languages outperform ***** weakly supervised ***** methods using just a few training examples, we can still reach a promising accuracy with methods abstaining additional resources. | ||
| 2020.signlang-1.35 With the written language translations as labels, we train a ***** weakly supervised ***** keyword search model for sign language and further improve the retrieval performance with two context modeling strategies. | ||
| automatically generated | 27 | |
| 2020.semeval-1.18 We use existing semantically annotated datasets, and propose to approximate similarity through ***** automatically generated ***** lexical substitutes in context. | ||
| 2020.latechclfl-1.5 We propose a simple concatenation approach that improves the quality of ***** automatically generated ***** title translations for artworks, by leveraging textual information extracted from Iconclass. | ||
| W19-5105 After training, we query the model for the plausibility of ***** automatically generated ***** novel combinations and verify whether the classifications are accurate. | ||
| L16-1006 The best result is obtained using the ***** automatically generated ***** Dialectal Hashtag Lexicon and the Arabic translations of the NRC Emotion Lexicon (accuracy of 66.6%). | ||
| R19-1045 This virtual dialogue content is provided in the form of answers derived from the found and selected documents split into fragments, and questions that are ***** automatically generated ***** for these answers based on the initial text. | ||
| level | 27 | |
| D18-1207 We propose two techniques to improve the ***** level ***** of abstraction of generated summaries. | ||
| 2008.amta-papers.19 We also build a cascaded translation model that dynamically shifts translation units from phrase ***** level ***** to word and morpheme phrase ***** level *****s. | ||
| 2021.emnlp-main.777 At the script ***** level *****, most existing studies only consider a single event sequence corresponding to one common protagonist. | ||
| W18-1601 For detection of stylistic variation, we use relative entropy, measuring the difference between probability distributions at different linguistic ***** level *****s (here: lexis and grammar). | ||
| L06-1198 The corpus features five different annotation layers, ranging from the annotation of morphological boundaries at the word level, over the annotation of part-of-speech tags and phrase chunks at the syntactic level to the annotation of named entities at the semantic ***** level ***** and coreferential relations at the discourse level. | ||
| ranking | 27 | |
| D19-1170 In addition, an arithmetic expression re***** ranking ***** mechanism is proposed to rank expression candidates for further confirming the prediction. | ||
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ***** ranking ***** and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context. | ||
| 2021.blackboxnlp-1.43 Rather than build a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ***** ranking ***** performance for words and senses in different frequency bands. | ||
| W18-3921 We obtained an F1 macro score of 0.836, ***** ranking ***** 5th in the task. | ||
| 2021.emnlp-main.279 Tag recommendation relies on either a ***** ranking ***** function for top-k tags or an autoregressive generation method. | ||
| neural dialogue generation | 27 | |
| N18-2008 Despite myriad efforts in the literature designing ***** neural dialogue generation ***** systems in recent years, very few consider putting restrictions on the response itself. | ||
| 2020.findings-emnlp.368 In this paper, we propose a meta-learning based semi-supervised explicit dialogue state tracker (SEDST) for ***** neural dialogue generation *****, denoted as MEDST. | ||
| P18-1138 End-to-end ***** neural dialogue generation ***** has shown promising results recently, but it does not employ knowledge to guide the generation and hence tends to generate short, general, and meaningless responses. | ||
| 2020.findings-emnlp.70 Extensive experimental results show that the proposed group-wise contrastive learning framework is suited for training a wide range of ***** neural dialogue generation ***** models with very favorable performance over the baseline training approaches. | ||
| 2021.acl-long.272 ***** Neural dialogue generation ***** models trained with the one-hot target distribution suffer from the over-confidence issue, which leads to poor generation diversity as widely reported in the literature. | ||
| adversarial domain adaptation | 27 | |
| P19-1556 Here we investigate the use of gradient reversal on ***** adversarial domain adaptation ***** to explicitly learn both shared and unshared (domain specific) representations between two textual domains. | ||
| P19-1211 In addition to ***** adversarial domain adaptation ***** (ADA), we introduce the use of artificial titles and sequential training to capture the grammatical style of the unlabeled target domain. | ||
| D19-1171 To minimize this cost, recent works thus often used alternative methods, e.g., ***** adversarial domain adaptation *****. | ||
| 2020.acl-main.681 Our work leverages the ***** adversarial domain adaptation ***** (ADA) framework to introduce domain-invariance. | ||
| 2021.eacl-main.258 We test the efficacy of three marginal alignment techniques: (i) ***** adversarial domain adaptation ***** (ADA), (ii) domain adaptive fine-tuning (DAFT), and (iii) a new instance weighting technique based on language model likelihood scores (LIW). | ||
| interactive | 27 | |
| 2020.sigdial-1.29 A total of 20 papers from the last two years are surveyed to analyze three types of evaluation protocols: automated, static, and ***** interactive *****. | ||
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an attention mechanism to capture the ***** interactive ***** semantic relations in-between to enforce our framework to be attribute comprehensive. | ||
| 2020.coling-main.13 Besides, to ***** interactive *****ly extract the inter-aspect relations for the specific aspect, an inter-aspect GCN is adopted to model the representations learned by aspect-focused GCN based on the inter-aspect graph which is constructed by the relative dependencies between the aspect words and other aspects. | ||
| C16-1258 When processing arguments in online user ***** interactive ***** discourse, it is often necessary to determine their bases of support. | ||
| 2020.intexsempar-1.4 We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition: users ***** interactive *****ly teach the system by breaking down high-level utterances describing novel behavior into low-level steps that it can understand. | ||
| cross-lingual supervision | 27 | |
| K19-1097 Recent studies have shown that CLSA can be performed in a fully unsupervised manner, without exploiting either target language supervision or ***** cross-lingual supervision *****. | ||
| D18-1024 Unsupervised MWE (UMWE) methods acquire multilingual embeddings without ***** cross-lingual supervision *****, which is a significant advantage over traditional supervised approaches and opens many new possibilities for low-resource languages. | ||
| D18-1268 Supervised methods for this problem rely on the availability of ***** cross-lingual supervision *****, either using parallel corpora or bilingual lexicons as the labeled data for training, which may not be available for many low resource languages. | ||
| 2021.mrl-1.4 Various approaches requiring only weak ***** cross-lingual supervision ***** were proposed, but current methods still fail to learn good CLWEs for languages with only a small monolingual corpus. | ||
| D17-1207 However, in order to connect the separate spaces, ***** cross-lingual supervision ***** encoded in parallel data is typically required. | ||
| adversarial perturbation | 27 | |
| 2020.acl-main.245 Despite excellent performance on many tasks, NLP systems are easily fooled by small ***** adversarial perturbations ***** of inputs. | ||
| 2020.emnlp-main.495 In particular, we propose a tree-based autoencoder to embed the discrete text data into a continuous representation space, upon which we optimize the ***** adversarial perturbation *****. | ||
| P19-1147 Experimental results show that, compared to recurrent neural models, self-attentive models are more robust against ***** adversarial perturbation *****. | ||
| N19-1314 We further use this framework to demonstrate that adding additional constraints on attacks allows for ***** adversarial perturbations ***** that are more meaning-preserving, but nonetheless largely change the output sequence. | ||
| P19-1020 A regularization technique based on ***** adversarial perturbation *****, which was initially developed in the field of image processing, has been successfully applied to text classification tasks and has yielded attractive improvements. | ||
| sentiment transfer | 27 | |
| 2020.aacl-main.33 Unsupervised style transfer in text has previously been explored through the ***** sentiment transfer ***** task. | ||
| 2020.findings-emnlp.61 We present a method for creating parallel data to train Seq2Seq neural networks for ***** sentiment transfer *****. | ||
| D19-1406 Second, starting with certain values of bilingual evaluation understudy (BLEU) between input and output and accuracy of the ***** sentiment transfer ***** the optimization of these two standard metrics diverge from the intuitive goal of the style transfer task. | ||
| 2020.coling-main.197 We use ***** sentiment transfer ***** as our case study for style transfer analysis. | ||
| 2021.acl-long.293 We demonstrate that training on unlabeled Amazon reviews data results in a model that is competitive on ***** sentiment transfer *****, even compared to models trained fully on labeled data. | ||
| public | 27 | |
| 2020.wnut-1.80 Extracting structured knowledge involving self-reported events related to the COVID-19 pandemic from Twitter has the potential to inform surveillance systems that play a critical role in ***** public ***** health. | ||
| W19-3024 This work aims to infer mental health status from ***** public ***** text for early detection of suicide risk. | ||
| 2020.lrec-1.115 DEbateNet-migr15 is a manually annotated dataset for German which covers the ***** public ***** debate on immigration in 2015. | ||
| 2021.smm4h-1.4 The global growth of social media usage over the past decade has opened research avenues for mining health related information that can ultimately be used to improve ***** public ***** health. | ||
| 2020.findings-emnlp.312 As users engage in ***** public ***** discourse, the rate of voluntarily disclosed personal information has seen a steep increase. | ||
| associative | 26 | |
| 2020.acl-main.679 Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-***** associative ***** instances than ***** associative ***** ones. | ||
| 2020.gamnlp-1.4 After comparing both game designs, the Cohen kappa of ***** associative ***** lists in various configurations is computed in order to assess likeness and differences of the data they provide. | ||
| 1963.earlymt-1.12 Several smallscale experimental ***** associative ***** networks have been built, and are briefly described in the paper; one such device will be demonstrated in the course of the oral presentation of the paper. | ||
| N18-2001 However, semantic similarity exhibited in these word embeddings is not suitable for resolving bridging anaphora, which requires the knowledge of ***** associative ***** similarity (i.e., relatedness) instead of semantic similarity information between synonyms or hypernyms. | ||
| K18-1029 Using optimal experiment design techniques, we compare a range of models varying in the type of ***** associative ***** information deployed and in level of pragmatic sophistication against human behavior. | ||
| extract | 26 | |
| P19-1136 In contrast to previous baselines, we consider the interaction between named entities and relations via a 2nd-phase relation-weighted GCN to better ***** extract ***** relations. | ||
| D19-1029 In this work, we propose a new sequence labeling framework (as well as a new tag schema) to jointly ***** extract ***** the fact and condition tuples from statement sentences. | ||
| 2012.amta-papers.16 In contrast to the conventional Hiero (Chiang, 2007) rule ***** extract *****ion algorithm, our methods ***** extract ***** compact models reducing model size by 17.8% to 57.6% without impacting translation quality across several language pairs. | ||
| 2020.findings-emnlp.72 To address the issue, we present a novel view of ABSA as an opinion triplet ***** extract *****ion task, and propose a multi-task learning framework to jointly ***** extract ***** aspect terms and opinion terms, and simultaneously parses sentiment dependencies between them with a biaffine scorer. | ||
| D19-1024 We design a novel model to better ***** extract ***** key information from textual descriptions. | ||
| automating | 26 | |
| 2020.winlp-1.35 Extreme Gradient Boost outperformed other classifiers in ***** automating ***** the task of assigning ICD-10 codes based on the three narrative text fields with an accuracy of 79%, precision of 75%, and recall of 78%. | ||
| W17-4902 However, applying stylistic variations is still by and large a manual process, and there have been little efforts towards ***** automating ***** it. | ||
| 2020.latechclfl-1.2 Since the manual creation of such annotations requires a lot of effort, ***** automating ***** the process with NLP methods would be convenient. | ||
| 2020.nlp4convai-1.9 To alleviate this problem, we explore ***** automating ***** the process of creating dialogue templates by using unsupervised methods to cluster historical utterances and selecting representative utterances from each cluster. | ||
| 2020.lrec-1.52 Despite advances in natural language processing, ***** automating ***** clinical note generation from a clinic visit conversation is a largely unexplored area of research. | ||
| Often | 26 | |
| W16-4112 ***** Often *****, relevant corpora consist only of easy-to-read texts with no rank information or empirical readability scores, making only binary approaches, such as classification, applicable. | ||
| 2020.acl-main.509 ***** Often ***** levels of agreement/disagreement are implicit in the text, and must be predicted to analyze collective opinions. | ||
| 2021.ranlp-1.42 ***** Often *****, the LI and POS tagging tasks are interdependent in the code-mixing scenario. | ||
| W17-5201 ***** Often ***** quoted as a challenge to sentiment analysis, sarcasm involves use of words of positive or no polarity to convey negative sentiment. | ||
| 2020.acl-main.399 ***** Often *****, the conclusion remains implicit, though, since it is self-evident in a discussion or left out for rhetorical reasons. | ||
| alternations | 26 | |
| L12-1080 This kind of sense ***** alternations ***** very often presents semantic underspecification between its two possible selected senses. | ||
| 2020.acl-main.679 Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender ***** alternations ***** and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. | ||
| 2020.coling-main.357 Due to the success of semi-supervised and unsupervised features, our approach can easily be transferred to further ***** alternations *****. | ||
| N18-4003 We discuss the influence that such ***** alternations ***** have on frame induction, compare several possible frame structures for verbs in the causative alternation, and propose a systematic analysis of alternating verbs that encodes their similarities as well as their differences. | ||
| L06-1164 Valences are specified for each sense of each verb, alongside with an illustrative example, possible argument ***** alternations ***** and a set of multiword expressions in which the respective verb occurs with the respective sense. | ||
| subtrees | 26 | |
| Q13-1024 Dependency cohesion refers to the observation that phrases dominated by disjoint dependency ***** subtrees ***** in the source language generally do not overlap in the target language. | ||
| W19-4504 In this paper, we investigate similarities between discourse and argumentation structures by aligning ***** subtrees ***** in a corpus containing both annotations. | ||
| 2021.emnlp-main.332 Then, we aggregate the embeddings of ***** subtrees ***** by reconstructing the split ASTs to get the representation of the complete AST. | ||
| 2021.naacl-main.373 On the task of constructing ***** subtrees ***** of English WordNet, the model achieves 66.7 ancestor F1, a 20.0% relative increase over the previous best published result on this task. | ||
| N19-1018 Instead of storing ***** subtrees ***** in a stack – i.e., a data structure with linear-time sequential access – the proposed system uses a set of parsing items, with constant-time random access. | ||
| predictors | 26 | |
| P17-1023 We present a model that learns individual ***** predictors ***** for object names that link visual and distributional aspects of word meaning during training. | ||
| 2021.eacl-srw.24 Registers, that is, text varieties such as blogs or news are one of the primary ***** predictors ***** of linguistic variation and thus affect the automatic processing of language. | ||
| 2021.wanlp-1.12 The dynamic aspect is achieved by utilizing ***** predictors ***** and features over NER algorithm results that identify which have performed better on a specific task in real-time. | ||
| D17-2005 SGNMT implements a number of search strategies for traversing the space spanned by the ***** predictors ***** which are appropriate for different predictor constellations. | ||
| 2020.acl-main.764 Experimenting on~9 different NLP tasks, we find that our ***** predictors ***** can produce meaningful predictions over unseen languages and different modeling architectures, outperforming reasonable baselines as well as human experts | ||
| formality | 26 | |
| P19-1609 We conduct experiments across various seq2seq text generation tasks including machine translation, ***** formality ***** style transfer, sentence compression and simplification. | ||
| 2020.coling-main.203 Conventional approaches for ***** formality ***** style transfer borrow models from neural machine translation, which typically requires massive parallel data for training. | ||
| 2021.nlp4convai-1.21 While these attributes are crucial for a successful dialogue, it is also desirable to simultaneously accomplish specific stylistic goals, such as response length, point-of-view, descriptiveness, sentiment, ***** formality *****, and empathy. | ||
| 2021.acl-short.62 Scarcity of parallel data causes ***** formality ***** style transfer models to have scarce success in preserving content. | ||
| 2020.coling-main.384 An important part of communication, however, takes place at the non-propositional level (e.g., politeness, ***** formality *****, emotions), and it is far from clear whether current MT methods properly translate this information | ||
| bigram | 26 | |
| N19-1050 Each of BiRD's 3,345 English term pairs involves at least one ***** bigram *****. | ||
| N19-1098 In this paper, we show how training word embeddings jointly with ***** bigram ***** and even trigram embeddings, results in improved unigram embeddings. | ||
| 2021.semeval-1.71 The ***** bigram ***** association measures were found useful, but to a limited extent. | ||
| E17-2003 Experiments on a small task show the issues raised by an unigram noise distribution, and that a context dependent noise distribution, such as the ***** bigram ***** distribution, can solve these issues and provide stable and data-efficient learning. | ||
| W19-2513 Our approach is based on count statistics from Google n-grams, which are converted into a likelihood ratio test computed from interpolated trigram and ***** bigram ***** probabilities. | ||
| outlier | 26 | |
| N19-1051 However, the problem of detecting both ***** outlier ***** types has received relatively little attention in NLP, particularly for dialog systems. | ||
| P18-2088 In this paper, we report experiments with a rank-based metric for WE, which performs comparably to vector cosine in similarity estimation and outperforms it in the recently-introduced and challenging task of ***** outlier ***** detection, thus suggesting that rank-based measures can improve clustering quality. | ||
| 2021.acl-long.273 Since the distribution of ***** outlier ***** utterances is arbitrary and unknown in the training stage, existing methods commonly rely on strong assumptions on data distribution such as mixture of Gaussians to make inference, resulting in either complex multi-step training procedures or hand-crafted rules such as confidence threshold selection for ***** outlier ***** detection. | ||
| D17-1291 Experiments conducted on two real-world textual data sets show that our method can achieve an up to 135% improvement over baselines in terms of recall at top-1% of the ***** outlier ***** ranking | ||
| 2021.emnlp-main.427 We address the sampling bias and ***** outlier ***** issues in few-shot learning for event detection, a subtask of information extraction. | ||
| CEFR | 26 | |
| 2020.lrec-1.890 It is called MALT-IT2, and it automatically classifies inputted texts according to the ***** CEFR ***** level they are more likely to belong to. | ||
| W18-0515 In this paper, we explore universal ***** CEFR ***** classification using domain-specific and domain-agnostic, theory-guided as well as data-driven features. | ||
| W17-5018 On the one hand, we describe the compilation of a learner corpus of short answers graded with ***** CEFR ***** levels by three certified Cambridge examiners. | ||
| W18-0508 The word lists used map each word to a single ***** CEFR ***** level, and the task consists of predicting ***** CEFR ***** levels for unseen words | ||
| L14-1083 In this paper we present FLELex, the first graded lexicon for French as a foreign language (FFL) that reports word frequencies by difficulty level (according to the ***** CEFR ***** scale). | ||
| parametric | 26 | |
| D19-1421 The corresponding objective function for MLE is derived from the Kullback-Leibler (KL) divergence between the empirical probability distribution representing the data and the ***** parametric ***** probability distribution output by the model. | ||
| L14-1299 The novel aspect of our method is the use of informative, ***** parametric ***** alignment models which are refined iteratively as they are tested against the data. | ||
| L16-1548 The method has since become the standard for high quality speech output by computer although much of the current research is devoted to ***** parametric ***** or hybrid methods that employ smaller amounts of data and can be more easily tunable to individual voices. | ||
| L16-1199 We find that compositional distributional models, especially ***** parametric ***** ones, perform way above non-compositional alternatives on the task. | ||
| 2021.naacl-main.468 Our results indicate that retrieve-and-read can be a viable option even in a highly constrained serving environment such as edge devices, as we show that it can achieve better accuracy than a purely ***** parametric ***** model with comparable docker-level system size | ||
| dev | 26 | |
| 2020.osact-1.18 Overall, our best classifier (that combines both CNN and RNN in a joint architecture) achieved 0.73 macro-F1 score on the ***** dev ***** set, which significantly outperforms the majority-class baseline that achieves 0.49, proving the effectiveness of our “quick and simple” approach. | ||
| 2020.emnlp-main.40 English ***** dev ***** accuracy is often uncorrelated (or even anti-correlated) with target language accuracy, and zero-shot performance varies greatly at different points in the same fine-tuning run and between different fine-tuning runs. | ||
| 2013.iwslt-evaluation.11 Demonstrating the latter method is superior in the current task, we obtained a WER of 28.16% on the ***** dev ***** set and 36.21% on the test set. | ||
| 2021.emnlp-main.459 Prior work has relied on English ***** dev ***** data to select among models that are fine-tuned with different learning rates, number of steps and other hyperparameters, often resulting in suboptimal choices. | ||
| 2020.wnut-1.51 The solution submitted to final test leaderboard is a fine tuned RoBERTa model which achieves F1 score of 90.8% and 89.4% on the ***** dev ***** and test data respectively | ||
| propositional | 26 | |
| L16-1606 Hence besides the semantic role labeling of verbs, the argument structure of 1300 unique ***** propositional ***** nouns and 300 unique ***** propositional ***** adjectives were annotated in the sentences, too. | ||
| 1998.amta-papers.11 Analysis aims to find the ***** propositional ***** structure of the input utterance without constructing a deep syntactic tree, instead it utilizes a weak interaction between syntax and semantics. | ||
| 2021.law-1.13 For this purpose, we rely primarily on the semantic representations generated by the state of the art VerbNet parser (Gung, 2020), and extract the entities (event participants) and their states, based on the semantic predicates of the generated VerbNet semantic representation, which is in ***** propositional ***** logic format. | ||
| 2021.cmcl-1.28 Regression analyses show that surprisal estimates calculated from the full parser make a significant contribution to predicting self-paced reading times over those from the parser without syntactic category information, as well as a significant contribution to predicting eye-gaze durations over those from the parser without ***** propositional ***** content information | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on ***** propositional ***** logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. | ||
| Random | 26 | |
| 2020.semeval-1.58 The system for Subtask 1: Sentence Classification is based on a transformer architecture where we use transfer learning to fine-tune a pretrained model on the downstream task, and the one for Subtask 3: Relation Classification uses a ***** Random ***** Forest classifier with handcrafted dedicated features. | ||
| W18-5522 Various features computed using these probabilities are finally used by a ***** Random ***** Forest classifier to determine the overall truthfulness of the claim. | ||
| L16-1722 It relies on a ***** Random ***** Forest algorithm and nine unsupervised corpus-based features. | ||
| 2020.wosp-1.11 We used ***** Random ***** Forest with cost-sensitive learning for classification of sentences encoded into a vector of dimension 300. | ||
| R19-1158 Our results on the test set, which has been manually labeled, show that performing morphological analysis improves the classification performance of the traditional machine learning algorithms ***** Random ***** Forest, Naive Bayes, and Support Vector Machines | ||
| Sinhala | 26 | |
| 2021.ranlp-1.82 This paper proposes a Neural Machine Translation (NMT) model to translate the ***** Sinhala *****-English code-mixed text to the ***** Sinhala ***** language. | ||
| 2021.ranlp-1.129 Using a dataset belonging to English, ***** Sinhala *****, and Tamil, which belong to three different language families, we show that these task-specific supervised distance learning metrics outperform their unsupervised counterparts, for document alignment. | ||
| 2020.lrec-1.579 Accuracy reported in terms of average BLEU score for English, ***** Sinhala ***** and Tamil languages were 22.97%, 24.49% and 20.74%, respectively. | ||
| 2020.lrec-1.231 This paper presents the first ever comprehensive evaluation of different types of word embeddings for ***** Sinhala ***** language. | ||
| W16-3718 This paper presents a new comprehensive multi-level Part-Of-Speech tag set and a Support Vector Machine based Part-Of-Speech tagger for the ***** Sinhala ***** language. | ||
| sociolinguistic | 26 | |
| L16-1547 The ***** sociolinguistic ***** context that motivates the present development is explained. | ||
| 2020.lrec-1.507 The corpus, which was built to inform a ***** sociolinguistic ***** study on language variation and code-switching, consists of 10 hours of recorded speech (87k tokens) between 45 Vietnamese-English bilinguals living in Canberra, Australia. | ||
| L12-1043 The corpus will be designed as a representation of contemporary spontaneous spoken language used in informal, real-life situations on the area of the whole Czech Republic and thus balanced in the main ***** sociolinguistic ***** categories of speakers. | ||
| 2001.mtsummit-road.9 Test data embodying geographic and ***** sociolinguistic ***** differences were obtained from a synchronous Chinese corpus of news media texts. | ||
| 2021.latechclfl-1.14 Tracing the influence of individuals or groups in social networks is an increasingly popular task in ***** sociolinguistic ***** studies. | ||
| fastText | 26 | |
| 2021.ranlp-1.34 We also investigate more modern approaches like ***** fastText *****, which makes use of subword information. | ||
| 2021.starsem-1.4 In this work, we discuss the method to represent a book as a spectrum of concepts based on the association score between its content embedding and a global embedding (i.e. ***** fastText *****) for a set of semantically linked word clusters. | ||
| 2020.coling-main.481 However, in settings with small training datasets a simple method like ***** fastText ***** coupled with domain-specific word embeddings performs equally well or better than BERT, even when pre-trained on domain-specific data. | ||
| S18-1078 Also, we have made an effort to demonstrate how ***** fastText ***** framework can be useful in case of emoji prediction. | ||
| W19-4329 Most word embedding algorithms such as word2vec or ***** fastText ***** construct two sort of vectors: for words and for contexts | ||
| cooccurrence | 26 | |
| 2012.amta-papers.15 The IBM schemes use weighted ***** cooccurrence ***** counts to iteratively improve translation and alignment probability estimates. | ||
| 2020.coling-main.106 Word embeddings are trained to predict word ***** cooccurrence ***** statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) | ||
| 1997.iwpt-1.25 Our parser uses statistical ***** cooccurrence ***** data to compute the lexical associations | ||
| W17-2313 Vector space methods that measure semantic similarity and relatedness often rely on distributional information such as ***** cooccurrence ***** frequencies or statistical measures of association to weight the importance of particular cooccurrences. | ||
| 2020.coling-main.354 Lexical semantics theories differ in advocating that the meaning of words is represented as an inference graph, a feature mapping or a ***** cooccurrence ***** vector, thus raising the question: is it the case that one of these approaches is superior to the others in representing lexical semantics appropriately? | ||
| metaphoric | 26 | |
| W18-0903 However, recent work has shown the predictive power of syntactic constructions in determining ***** metaphoric ***** source and target domains (Sullivan 2013). | ||
| K17-1036 This paper explores the information-theoretic measure entropy to detect ***** metaphoric ***** change, transferring ideas from hypernym detection to research on language change. | ||
| L16-1724 Since we are convinced, along with (Shutoff, 2015), that metaphor detection systems should be concerned only with the identification of highly ***** metaphoric ***** expressions, we believe that POM could be profitably employed by these systems to a priori exclude expressions that, due to the verb they include, can only have low degrees of ***** metaphoric *****ity | ||
| W18-0912 I attribute this increase in accuracy to the use of constructional cues, extracted from the raw text of ***** metaphoric ***** instances | ||
| 2020.lrec-1.227 This paper examines the procedure for lexico-semantic annotation of the Basic Corpus of Polish Metaphors that is the first step for annotating ***** metaphoric ***** expressions occurring in it. | ||
| expressive | 26 | |
| L14-1581 Our project aims at full body ***** expressive ***** interactions between a user and an autonomous virtual agent. | ||
| L12-1520 The aim of these corpora is to study ***** expressive ***** storytelling behaviour, and to help in designing ***** expressive ***** prosodic and co-verbal variations for the artificial storyteller. | ||
| 2021.naacl-main.259 We focus on variations that learn ***** expressive ***** prior distributions over the latent variable. | ||
| N19-1127 MTSA 1) captures both pairwise (token2token) and global (source2token) dependencies by a novel compatibility function composed of dot-product and additive attentions, 2) uses a tensor to represent the feature-wise alignment scores for better ***** expressive ***** power but only requires parallelizable matrix multiplications, and 3) combines multi-head with multi-dimensional attentions, and applies a distinct positional mask to each head (subspace), so the memory and computation can be distributed to multiple heads, each with sequential information encoded independently. | ||
| Q14-1028 The high-level idea of our approach is to harvest ***** expressive ***** phrases (as tree fragments) from existing image descriptions, then to compose a new description by selectively combining the extracted (and optionally pruned) tree fragments | ||
| Levenshtein | 26 | |
| E17-4002 An approach for spelling-variant detection is presented, where pairs of potential spelling variants are generated with ***** Levenshtein ***** distance and subsequently filtered by supervised machine learning. | ||
| 2020.sigmorphon-1.17 The overall average performance of our submission ranks the first in both average accuracy and ***** Levenshtein ***** distance from the gold inflection among all submissions including those using external resources. | ||
| W19-4226 Every participating team improved in accuracy over the baselines for the inflection task (though not ***** Levenshtein ***** distance), and every team in the contextual analysis task improved on both state-of-the-art neural and non-neural baselines. | ||
| L10-1303 We compare our system to a baseline based on ***** Levenshtein ***** distance and find that, when evaluated on single-error queries, our system performs 28% better than the baseline (overall MRR) and is twice as good at returning the correct dictionary form as the top-ranked result. | ||
| W16-4803 Computational approaches for dialectometry employed ***** Levenshtein ***** distance to compute an aggregate similarity between two dialects belonging to a single language group. | ||
| Hypernym | 26 | |
| S18-1152 This paper describes 300-sparsians's participation in SemEval-2018 Task 9: ***** Hypernym ***** Discovery, with a system based on sparse coding and a formal concept hierarchy obtained from word embeddings. | ||
| 2021.emnlp-main.179 Specifically, with the help of a knowledge base, we introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of the masked entity given the context; 2) ***** Hypernym ***** Generation, which predicts the hypernym of the entity based on the context | ||
| S18-1149 ***** Hypernym ***** Discovery is the task of identifying potential hypernyms for a given term. | ||
| S18-1148 ***** Hypernym ***** discovery aims to discover the hypernym word sets given a hyponym word and proper corpus. | ||
| 2016.jeptalnrecital-recital.4 ***** Hypernym ***** extraction from Wikipedia: The volume of available documents on the Web continues to increase, and the texts contained in these documents are rich information describing concepts and relationships between concepts specific to a particular field. | ||
| constituent | 26 | |
| C16-1041 First, a partial ***** constituent ***** tree is derived from a dependency tree with a very simple deterministic algorithm that is both language and dependency type independent. | ||
| L14-1720 The probability of a ***** constituent ***** in an unknown (or unanalysed) compound forming a combined ***** constituent ***** with either of its neighbours is estimated, with the use of data on the ***** constituent ***** structure of over 240 thousand compounds from the Database of Modern Icelandic Inflection, and word frequencies from Íslenskur orðasjóður, a corpus of approx. | ||
| P19-1457 Neural models have been investigated for sentiment classification over ***** constituent ***** trees. | ||
| P19-1230 Our parser achieves new state-of-the-art performance for both parsing tasks on Penn Treebank (PTB) and Chinese Penn Treebank, verifying the effectiveness of joint learning ***** constituent ***** and dependency structures. | ||
| 2021.conll-1.23 The most straightforward approach to joint word segmentation (WS), part-of-speech (POS) tagging, and ***** constituent ***** parsing is converting a word-level tree into a char-level tree, which, however, leads to two severe challenges. | ||
| determining | 26 | |
| 2021.louhi-1.3 Understanding the expressions of these social supports in an online COVID-19 forum is important for: (a) the forum and its members to provide the right type of support to individuals and (b) ***** determining ***** the long term effects of the COVID-19 pandemic on the well-being of the public, thereby informing interventions. | ||
| 2021.mwe-1.8 The paper's aim is to account for the properties ***** determining ***** these patterns on the basis of a corpus study on German LVCs of the type `stehen unter' NP' (`stand under NP'). | ||
| 2020.lrec-1.741 Despite the extensive use of HMMs for sign recognition, ***** determining ***** the HMM structure has still remained as a challenge, especially when the number of signs to be modeled is high. | ||
| D19-6303 This paper describes a method of inflecting and linearizing a lemmatized dependency tree by: (1) ***** determining ***** a regular expression and substitution to describe each productive wordform rule; (2) learning the dependency distance tolerance for each head-dependent pair, resulting in an edge-weighted directed acyclic graph (DAG); and (3) topologically sorting the DAG into a surface realization based on edge weight. | ||
| P19-1456 We find that as the distance between a pair of claims increases along the argument path, ***** determining ***** the relative specificity of a pair of claims becomes easier and ***** determining ***** their relative stance becomes harder | ||
| probing | 26 | |
| 2021.acl-long.145 We call this ***** probing ***** setup Worm's Eye. | ||
| 2020.acl-main.375 We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages, ***** probing ***** for two different syntactic annotation styles: Universal Dependencies (UD), prioritizing deep syntactic relations, and Surface-Syntactic Universal Dependencies (SUD), focusing on surface structure. | ||
| 2020.acl-main.698 2019, we propose two new ***** probing ***** tasks analyzing factual knowledge stored in Pretrained Language Models (PLMs). | ||
| 2021.eacl-main.295 In this work, we examine this ***** probing ***** paradigm through a case study in Natural Language Inference, showing that models can learn to encode linguistic properties even if they are not needed for the task on which the model was trained. | ||
| 2020.findings-emnlp.125 This technique reveals that in BERT, layers with high ***** probing ***** performance on downstream GLUE tasks are neither necessary nor sufficient for high accuracy on those tasks | ||
| variation | 26 | |
| L10-1574 The relationship between the abstract and the concrete, which is at the basis of the Conceptual Metaphor perspective, can be considered strictly related to the ***** variation ***** of the ontological values found in our analysis of the PNs and their belonging classes which are codified in the ItalWordNet database. | ||
| L08-1225 The principle differences between tagsets are evidenced by ***** variation ***** in categories in one corpus in the same contexts where another corpus exhibits only a single tag. | ||
| 2020.nuse-1.1 Comparing these approaches is complicated by ***** variation ***** in the systems' use of gold vs. computed labels, as well as ***** variation ***** in the document clustering pre-processing step. | ||
| P18-1142 However, the effectiveness of cross-lingual transfer can be challenged by ***** variation ***** in syntactic structures. | ||
| L08-1535 Since Japanese words often have ***** variation ***** in orthography and the vocabulary of Japanese consists of words of several different origins, it sometimes happens that more than one writing form corresponds to the same lemma and that a single writing form corresponds to two or more lemmas with different readings and/or meanings | ||
| multilingual corpus | 26 | |
| N19-1388 We report results on the publicly available TED talks ***** multilingual corpus ***** where we show that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages in 116 translation directions in a single model. | ||
| L12-1338 We designed a common questionnaire of stress-inducing and non-stress-inducing questions in English, Mandarin and Cantonese and collected a first ever, ***** multilingual corpus ***** of natural stress emotion. | ||
| 2021.acl-short.37 Our model is trained on a large ***** multilingual corpus ***** of mention pairs derived from Wikipedia hyperlinks, and performs nearest neighbor inference on an index of 700 million mentions. | ||
| 2020.lrec-1.863 The paper presents the Bulgarian MARCELL corpus, part of a recently developed ***** multilingual corpus ***** representing the national legislation in seven European countries and the NLP pipeline that turns the web crawled data into structured, linguistically annotated dataset | ||
| 2021.ranlp-1.154 We will present a ***** multilingual corpus ***** of disinformation and debunks which contains text, concept tags, images and videos as well as various methods for searching and leveraging the content. | ||
| continuous | 26 | |
| L16-1087 We applied a fast unsupervised method for learning ***** continuous ***** representations of words in vector space. | ||
| D19-3020 This paper presents a novel open-source web-based rumour analysis tool that can ***** continuous ***** learn from journalists. | ||
| 2021.eacl-demos.23 The resulting sense clusters offer uniquely detailed insights into lexical change over ***** continuous ***** intervals with model transparency and provenance. | ||
| 2014.amta-researchers.17 In contrast to other ***** continuous ***** space approach, RBM based models can easily be integrated into the decoder and are able to directly learn a hidden representation of the n-gram | ||
| P17-2058 We demonstrate that a ***** continuous ***** relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding in sequence-to-sequence (seq2seq) models. | ||
| computational argumentation | 26 | |
| 2021.acl-long.126 Approaches to ***** computational argumentation ***** tasks such as stance detection and aspect detection have largely focused on the text of independent claims, losing out on potentially valuable context provided by the rest of the collection. | ||
| E17-1017 Research on ***** computational argumentation ***** faces the problem of how to automatically assess the quality of an argument or argumentation. | ||
| W17-5106 The framework and the argument search engine are intended as an environment for collaborative research on ***** computational argumentation ***** and its practical evaluation | ||
| W19-4502 We present a model to tackle a fundamental but understudied problem in ***** computational argumentation *****: proposition extraction. | ||
| W17-5208 Mining arguments from natural language texts, parsing argumentative structures, and assessing argument quality are among the recent challenges tackled in ***** computational argumentation *****. | ||
| distributional similarity | 26 | |
| L14-1031 The second method makes use of recent advances in ***** distributional similarity ***** representation to transfer existing norms to their closest neighbors in a high-dimensional vector space. | ||
| 2020.lt4hala-1.10 For ***** distributional similarity *****, we consider the cosine similarity of PPMI vectors of Hebrew roots and also, in a somewhat novel approach, apply Word2Vec to a Biblical corpus reduced to its lexemes. | ||
| L14-1464 Our evaluation shows that ***** distributional similarity ***** as a re-ranking feature is more robust than language model scores and leads to an improved ranking of the synonym candidates. | ||
| L16-1236 Some of them are based on the computation of ***** distributional similarity ***** coefficients which identify pairs of sibling words or co-hyponyms, while others are based on asymmetric co-occurrence and identify pairs of parent-child words or hypernym-hyponym relations. | ||
| W18-5455 We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn `***** distributional similarity *****' in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. | ||
| images | 26 | |
| 2021.dravidianlangtech-1.43 As memes are in ***** images ***** forms with embedded text, it can quickly spread hate, offence and violence. | ||
| 2021.naacl-industry.31 We conduct experiments to generate radiology reports from medical ***** images ***** of chest x-rays using MIMIC-CXR. | ||
| C18-1321 Apart from textual view cued by both the semantic and syntactic information, a complimentary view extracted from ***** images ***** contained in the web-snippets is also utilized in the current framework. | ||
| L14-1272 In addition to the dataset design, we present the first results that we obtained by applying clustering methods to the annotated dataset in order to extract the entity ***** images *****. | ||
| 2020.acl-main.673 Similar problems have been studied extensively for other forms of data, such as ***** images ***** and videos. | ||
| speech corpora | 26 | |
| 2020.coling-main.519 This is orders of magnitude larger than previous ***** speech corpora ***** used for search and summarization. | ||
| L16-1210 The Mixer series of ***** speech corpora ***** were collected over several years, principally to support annual NIST evaluations of speaker recognition (SR) technologies. | ||
| L14-1059 Researchers working with ***** speech corpora ***** are often faced with multiple tools and formats, and they need to work with ever-increasing amounts of data in a collaborative way. | ||
| L14-1493 More specifically, we randomly choose transcriptions from various ***** speech corpora ***** as text stimuli with which to conduct a rating experiment on speaking style perception; then, using the features extracted from those stimuli and the rating results, we construct an estimation model of speaking style by a multi-regression analysis. | ||
| L08-1529 In this paper, we propose IrcamCorpusTools, an open and easily extensible platform for analysis, query and visualization of ***** speech corpora *****. | ||
| multilingual parsing | 26 | |
| K17-3027 We present the Open University's submission to the CoNLL 2017 Shared Task on ***** multilingual parsing ***** from raw text to Universal Dependencies. | ||
| 2020.udw-1.21 We evaluated the treebank on the dependency parsing task using a pretrained ***** multilingual parsing ***** model, and the results are comparable with other low-resourced treebanks with no training set. | ||
| K18-2021 We present the contribution of the ONLP lab at the Open University of Israel to the UD shared task on ***** multilingual parsing ***** from raw text to Universal Dependencies. | ||
| N19-1393 Our experiments on ***** multilingual parsing ***** for 40 languages show that typological information can indeed guide parsers to share information between similar languages beyond simple language identification. | ||
| K18-2024 We participated in the CoNLL 2018 Shared Task on ***** multilingual parsing ***** from raw text to universal dependencies as the BOUN team. | ||
| online forums | 26 | |
| 2021.louhi-1.3 In ***** online forums ***** focused on health and wellbeing, individuals tend to seek and give the following social support: emotional and informational support. | ||
| W18-5109 We observe that the purpose of conversations in ***** online forums ***** tend to be more constructive and informative than those in Wikipedia page edit comments which are geared more towards adversarial interactions, and that this may explain the lower levels of abuse found in our forum data than in Wikipedia comments. | ||
| P18-2025 Automatically making comments thus become a valuable functionality for ***** online forums *****, intelligent chatbots, etc. | ||
| 2021.semeval-1.137 Toxic language is often present in ***** online forums *****, especially when politics and other polarizing topics arise, and can lead to people becoming discouraged from joining or continuing conversations. | ||
| W18-5602 Existing approaches which analyze such user-generated content in ***** online forums ***** heavily rely on feature engineering of both documents and users, and often overlook the relationships between posts within a common discussion thread. | ||
| intent detection | 26 | |
| 2021.naacl-industry.38 Secondly, even with large training data, the ***** intent detection ***** models can see a different distribution of test data when being deployed in the real world, leading to poor accuracy. | ||
| 2021.eacl-main.79 We then test these models equipped with the same transformer-based encoder on the ***** intent detection ***** task, known for having a large amount of classes. | ||
| 2020.acl-main.99 Coupled with a density-based outlier detection algorithm, SEG achieves competitive results on three real task-oriented dialogue datasets in two languages for unknown ***** intent detection *****. | ||
| 2020.challengehml-1.7 Our experimental results outperformed text-only baselines as we achieved improved performances for ***** intent detection ***** with multimodal approach. | ||
| 2020.emnlp-main.411 Our extensive experiments on a large-scale multi-domain ***** intent detection ***** task show that our method achieves more stable and accurate in-domain and OOS detection accuracy than RoBERTa-based classifiers and embedding-based nearest neighbor approaches. | ||
| contextual emotion detection | 26 | |
| S19-2031 We have developed a Snapshot Ensemble of 1D Hierarchical Convolutional Neural Networks to extract features from 3-turn conversations in order to perform ***** contextual emotion detection ***** in text. | ||
| S19-2042 Our model was evaluated on the data provided by the SemEval-2019 shared task on ***** contextual emotion detection ***** in text. | ||
| S19-2036 This paper describes our transfer learning-based approach to ***** contextual emotion detection ***** as part of SemEval-2019 Task 3. | ||
| S19-2061 This paper presents our ***** contextual emotion detection ***** system in approaching the SemEval2019 shared task 3: EmoContext: Contextual Emotion Detection in Text. | ||
| R19-1091 This paper describes a new approach for the task of ***** contextual emotion detection *****. | ||
| lexical knowledge | 26 | |
| L14-1627 We discuss some easy to compute statistics to demonstrate the variation and differences in the test sets and provide some baseline experiments where we test the effect of additional ***** lexical knowledge ***** on the out-of-domain performance of two state-of-the-art dependency parsers. | ||
| L06-1270 The alignment of multilingual ***** lexical knowledge ***** sources has various applications ranging from knowledge acquisition to semantic validation of interlingual equivalence of presumably the same meaning expressed in different languages. | ||
| D18-1170 Traditional supervised methods only use labeled data (context), while missing rich ***** lexical knowledge ***** such as the gloss which defines the meaning of a word sense. | ||
| W18-0519 Our approach is based on combining multiple low-level features, such as character n-grams, with high-level semantic features that are either automatically learned using word embeddings or extracted from a ***** lexical knowledge ***** base, namely WordNet. | ||
| 2020.sigdial-1.19 This paper reports the results of our analysis on how user impression changes depending on the types of questions to acquire ***** lexical knowledge *****, that is, explicit and implicit questions, and the correctness of the content of the questions. | ||
| link prediction | 26 | |
| 2020.coling-main.153 By combining relation prediction and relevance ranking tasks with our target ***** link prediction *****, the proposed model can learn more relational properties in KGs and properly perform even when lexical similarity occurs. | ||
| D19-1522 TuckER outperforms previous state-of-the-art models across standard ***** link prediction ***** datasets, acting as a strong baseline for more elaborate models. | ||
| N19-1104 Then we conduct the ***** link prediction ***** tasks on standard data sets to evaluate GRank. | ||
| 2021.acl-long.147 We study the problem of generating data poisoning attacks against Knowledge Graph Embedding (KGE) models for the task of ***** link prediction ***** in knowledge graphs. | ||
| 2020.acl-main.209 To this end, we propose an evaluation protocol and a methodology for creating the open ***** link prediction ***** benchmark OlpBench. | ||
| papers | 26 | |
| 2010.amta-***** papers *****.6 In this paper, we present the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a standard phrase-based SMT system. | ||
| 2008.amta-***** papers *****.19 We also build a cascaded translation model that dynamically shifts translation units from phrase level to word and morpheme phrase levels. | ||
| 2020.sigdial-1.29 A total of 20 ***** papers ***** from the last two years are surveyed to analyze three types of evaluation protocols: automated, static, and interactive. | ||
| L12-1501 This will ensure that Language Resources be identified, accessed and disseminated in a unique manner, thus allowing them to be recognized with proper references in all activities concerning Human Language Technologies as well as in all documents and scientific *****papers*****. | ||
| 2020.acl-main.323 While much previous research (Sutskever et al., 2013; Duchi et al., 2011; Kingma and Ba, 2015) focuses on accelerating convergence and reducing the effects of the learning rate, comparatively few *****papers***** concentrate on the effect of batch size. | ||
| capturing discriminative | 26 | |
| S18-1168 In this paper we present three unsupervised models for ***** capturing discriminative ***** attributes based on information from word embeddings, WordNet, and sentence-level word co-occurrence frequency. | ||
| S18-1163 This paper describes BomJi, a supervised system for ***** capturing discriminative ***** attributes in word pairs (e.g. | ||
| S18-1169 This paper describes the system that we submitted for SemEval-2018 task 10: ***** capturing discriminative ***** attributes. | ||
| S18-1164 This paper presents a comparison of several approaches for ***** capturing discriminative ***** attributes and considers an impact of concatenation of several word embeddings of different nature on the classification performance. | ||
| S18-1167 We participated in the SemEval-2018 shared task on *****capturing discriminative***** attributes (Task 10) with a simple system that ranked 8th amongst the 26 teams that took part in the evaluation. | ||
| computer science | 26 | |
| W18-2401 In recent years, the journalists and ***** computer scientist *****s speak to each other to identify useful technologies which would help them in extracting useful information. | ||
| K17-1021 In this paper, we introduce a new dataset for summarisation of ***** computer science ***** publications by exploiting a large resource of author provided summaries and show straightforward ways of extending it further. | ||
| L16-1231 Experiment shows that the result of ontology learning from corpus of ***** computer science ***** can be improved via the relation instances extracted from DBpedia in the same field. | ||
| W17-3101 Automatic detection of depression has attracted increasing attention from researchers in psychology, ***** computer science *****, linguistics, and related disciplines. | ||
| W16-4904 We evaluate the proposed technique across two datasets from different domains, namely, ***** computer science ***** and English reading comprehension, that additionally vary between highschool level and undergraduate students. | ||
| latent variable | 26 | |
| D19-1124 Existing works simply assume the Gaussian priors of the ***** latent variable *****, which are incapable of representing complex ***** latent variable *****s effectively. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by deep neural networks (DNN) that can utilize the contextual information to encode the input into ***** latent variable *****s, and a decoder which is a generative model able to reconstruct the input. | ||
| 2020.coling-main.102 Recent work proposed to cooperate variational inference on a target-related ***** latent variable ***** to introduce the diversity. | ||
| 2020.emnlp-main.274 The proposed architecture is designed to model each aspect of goal-oriented dialogs using inter-connected ***** latent variable *****s and learns to generate coherent goal-oriented dialogs from the latent spaces. | ||
| 2020.spnlp-1.10 However, we observe no correlation between rankings of models across different families: (1) among non-autoregressive latent variable models, a flexible prior distribution is better at density estimation but gives worse generation quality than a simple prior, and (2) autoregressive models offer the best translation performance overall, while *****latent variable***** models with a normalizing flow prior give the highest held-out log-likelihood across all datasets. | ||
| web search | 26 | |
| 2021.acl-demo.6 Then, we incorporate rich Web content for synonym detection and concept selection via a ***** web search ***** API. | ||
| 2020.lrec-1.857 As a byproduct of our study, we create two new datasets comprised of spelling errors generated by children from hand-written essays and ***** web search ***** inquiries, which we make available to the research community. | ||
| E17-1105 We outline what natural language challenges must be faced at web scale in order to stepwise bring argument relevance to ***** web search ***** engines. | ||
| L16-1106 Our corpus can serve as a benchmark for term importance methods aimed at improving search engine quality and as an initial step toward developing a dataset of gold linguistic analysis of ***** web search ***** queries. | ||
| L06-1345 As corpus query system, we are using the Corpus Workbench developed at the University of Stuttgart together with a ***** web search ***** interface developed at Aksis, University of Bergen. | ||
| neural semantic | 26 | |
| 2020.intexsempar-1.4 We introduce a ***** neural semantic ***** parsing system that learns new high-level abstractions through decomposition: users interactively teach the system by breaking down high-level utterances describing novel behavior into low-level steps that it can understand. | ||
| D18-2002 Experiments on four different semantic parsing and code generation tasks show that our system is generalizable, extensible, and effective, registering strong results compared to existing ***** neural semantic ***** parsers. | ||
| 2020.lrec-1.714 In addition to the dataset, we propose a novel ***** neural semantic ***** parser as a strong baseline model. | ||
| 2021.nlp4convai-1.5 To evaluate the impact of this understudied problem, we propose an experimental setup for simulating changes to a ***** neural semantic ***** parser. | ||
| 2021.repl4nlp-1.24 Specially, ***** neural semantic ***** parsers (NSPs) effectively translate natural questions to logical forms, which execute on KB and give desirable answers. | ||
| standard | 26 | |
| 2010.amta-papers.6 In this paper, we present the insights gained from a detailed study of coupling a highly modular English-Hindi RBMT system with a ***** standard ***** phrase-based SMT system. | ||
| 2020.lrec-1.752 Predictably, performance was twice as good in tweets with ***** standard ***** orthography than in tweets with spelling/casing irregularities or lack of sentence separation, the effect being more marked for morphology than for syntax. | ||
| W18-5427 However, ***** standard ***** attention models are of limited interpretability for tasks that involve a series of inference steps. | ||
| 2020.lrec-1.641 The texts come from different sources: daily newspaper articles, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-***** standard ***** language segments typed into a web translator. | ||
| C16-1078 The system was evaluated using ***** standard ***** IR metrics on the new benchmark, and we saw that lexical-semantical rerankers improve significantly over a purely surface-oriented system, but must be carefully tailored for each individual construction. | ||
| privacy | 26 | |
| D19-1240 User's ***** privacy ***** concerns mandate data publishers to protect ***** privacy *****. | ||
| L08-1219 Corpora of multi-modal conversational speech are rare and frequently difficult to use due to ***** privacy ***** and copyright restrictions. | ||
| 2021.acl-long.532 Thus, we present the PrivaSeer Corpus of 1,005,380 English language website ***** privacy ***** policies collected from the web. | ||
| 2020.coling-main.79 In particular, we focus on the detection of unfair clauses in ***** privacy ***** policies and terms of service. | ||
| 2020.findings-emnlp.55 Furthermore, differential ***** privacy ***** is introduced to protect participants in the training process, in a manageable manner. | ||
| statistical language | 26 | |
| L12-1025 As an alternative, and perhaps less traditional approach, we also use surface information to build ***** statistical language ***** models of the referring expressions that are most likely to occur in the corpus, and let the model probabilities guide attribute selection. | ||
| W16-4502 A 12-gram ***** statistical language ***** model was selected as a baseline to oppose three neural network based models of different characteristics. | ||
| 2011.iwslt-evaluation.7 This framework not only allows to achieve state-of-the-art results for this language pair, but is also appealing due to its conceptual simplicity and its use of well understood ***** statistical language ***** models. | ||
| L06-1014 In this paper building ***** statistical language ***** models for Persian language using a corpus and incorporating them in Persian continuous speech recognition (CSR) system are described. | ||
| L10-1449 At the core of the system is a state of the art ***** statistical language ***** classification technology for mapping from user's text input to system responses. | ||
| customer | 26 | |
| 2004.amta-papers.27 This paper describes our experience in deploying this system and the (positive) ***** customer ***** response to the availability of machine translated articles, as well as other uses of MSR-MT either planned or underway at Microsoft. | ||
| 2020.emnlp-main.149 Take ***** customer ***** service and court debate dialogue as examples, compatible logics can be observed across different dialogue instances, and this information can provide vital evidence for utterance generation. | ||
| 2020.ecomnlp-1.4 We propose a novel way of conversational recommendation, where instead of asking questions to the user to acquire their preferences; the recommender tracks their conversation with other people, including ***** customer ***** support agents (CSA), and joins the conversation only when it is time to introduce a recommendation. | ||
| 2021.ecnlp-1.6 Specifically, in the shopping domain, ***** customer *****s tend to mention the entities implicitly (e.g., “organic milk”) rather than use the entity names explicitly, leading to a large number of candidate products. | ||
| L14-1240 This corpus is one of the first lexical resources focusing on real world applications that analyze the voice of the ***** customer ***** which is crucial for various industrial use cases. | ||
| short | 26 | |
| W18-1704 Additional tests, which take advantage of the fact that the length of compressions can be modulated, still improve ROUGE scores with ***** short *****er output sentences. | ||
| 2021.emnlp-main.200 These directed subgraphs are considered to well preserve extra but relevant content to the ***** short ***** input text, and then they are decoded by the employed pre-trained model to generate coherent long text. | ||
| 2020.lrec-1.641 The texts come from different sources: daily newspaper articles, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, ***** short *****, often non-standard language segments typed into a web translator. | ||
| 2021.acl-***** short *****.29 In this paper, we introduce a new metric UMIC, an Unreferenced Metric for Image Captioning which does not require reference captions to evaluate image captions. | ||
| 2016.amta-researchers.15 Most diacritics in Arabic represent *****short***** vowels. | ||
| noise contrastive estimation | 26 | |
| 2021.naacl-main.86 The choice of negative examples is important in ***** noise contrastive estimation *****. | ||
| 2021.eacl-main.151 Due to the discreteness of text data, we adopt ***** noise contrastive estimation ***** (NCE) to train the energy-based model. | ||
| D18-1405 *****Noise Contrastive Estimation***** (NCE) is a powerful parameter estimation method for log-linear models, which avoids calculation of the partition function or its derivatives at each training step, a computationally demanding step in many cases. | ||
| D19-1421 They are derived as power generalizations of a softmax approximated via Importance Sampling, and *****Noise Contrastive Estimation*****, for accelerated learning. | ||
| D17-1198 Specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established *****Noise Contrastive Estimation***** (NCE) based language models. | ||
| corpus pattern analysis | 26 | |
| W16-4506 This work focuses on WSD in verbs, based on two different approaches – verbal patterns based on ***** corpus pattern analysis ***** and verbal word senses from valency frames. | ||
| L12-1175 We have created and have been using VPS-30-En to explore the interannotator agreement potential of the *****Corpus Pattern Analysis*****. | ||
| C16-2023 Our approach is based on Frame Semantics and *****Corpus Pattern Analysis***** in order to provide a precise semantic interpretation of datetime expressions. | ||
| 2020.isa-1.6 This short research paper presents the results of a corpus-based metonymy annotation exercise on a sample of 101 Croatian verb entries – corresponding to 457 patterns and over 20,000 corpus lines – taken from CROATPAS (Marini & Ježek, 2019), a digital repository of verb argument structures manually annotated with Semantic Type labels on their argument slots following a methodology inspired by *****Corpus Pattern Analysis***** (Hanks, 2004 & 2013; Hanks & Pustejovsky, 2005). | ||
| L16-1137 This data set has been created to observe the interannotator agreement on PDEV patterns produced using the *****Corpus Pattern Analysis***** (Hanks, 2013). | ||
| semantic orientation | 26 | |
| 2020.lrec-1.623 It contains 6,470 entries, both single and multi-word expressions, each with tags denoting their ***** semantic orientation ***** and intensity. | ||
| L12-1656 The possibility to discriminate between objective and subjective expressions contributes to the identification of a document's ***** semantic orientation ***** and to the detection of the opinions and sentiments expressed by the authors or attributed to other participants in the document. | ||
| L10-1448 Since this test dataset is small, we conduct a further evaluation on artificial examples of amelioration and pejoration, and again find evidence that our proposed method is able to identify changes in ***** semantic orientation *****. | ||
| L06-1250 We describe and compare different methods for creating a dictionary of words with their corresponding ***** semantic orientation ***** (SO). | ||
| W19-2718 We have also extracted the subjectivity of several rhetorical relations and the results show the effect of sentiment words in relations and the influence of each relation in the *****semantic orientation***** value. | ||
| audio | 26 | |
| L06-1485 Its features include synchronized multi-channel ***** audio ***** and video playback, compatibility with several corpora, platform independence, and mixed display of capabilities and a well-defined method for layering datasets. | ||
| 2020.semeval-1.99 Information on social media comprises of various modalities such as textual, visual and ***** audio *****. | ||
| 2020.lrec-1.804 This paper presents a dataset of transcribed high-quality ***** audio ***** of English sentences recorded by volunteers speaking with different accents of the British Isles. | ||
| 2020.acl-main.215 Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed ***** audio ***** and its noisy transcription into text via automatic speech recognition. | ||
| 2020.emnlp-main.445 We introduce a method of training on silent EMG by transferring ***** audio ***** targets from vocalized to silent signals. | ||
| comparable | 26 | |
| R19-1072 Evaluation on Malayalam Wikipedia data shows that our approach is correct and the results, though not as good as Tamil, but ***** comparable *****. | ||
| 2021.nlp4posimpact-1.7 Our system achieves the highest BLEU score and ***** comparable ***** SARI score in comparison to other systems. | ||
| 2020.acl-main.277 Experimental results on three widely-used benchmark datasets show that our proposed model achieves more than 4 times speedup while maintaining ***** comparable ***** performance compared with the corresponding autoregressive model. | ||
| L12-1529 However, as parallel corpora are a scarce resource, in recent years the extraction of dictionaries using ***** comparable ***** corpora has obtained increasing attention. | ||
| P19-1118 Mining parallel sentences from *****comparable***** corpora is important. | ||
| sentiment treebank | 26 | |
| D17-1056 Experimental results show that the proposed method can improve conventional word embeddings and outperform previously proposed sentiment embeddings for both binary and fine-grained classification on Stanford *****Sentiment Treebank***** (SST). | ||
| D19-1343 Experiments on Stanford *****Sentiment Treebank***** (SST) for sentiment classification and EmoBank for regression show that the proposed method improved the performance of tree-LSTM and other neural network models. | ||
| W17-5220 Our models are experimented on both the SemEval'16 Task 4 dataset and the Stanford *****Sentiment Treebank***** and show comparative or better results against the existing state-of-the-art systems. | ||
| P19-1342 Tree-LSTMs have been used for tree-based sentiment analysis over Stanford *****Sentiment Treebank*****, which allows the sentiment signals over hierarchical phrase structures to be calculated simultaneously. | ||
| 2020.emnlp-main.600 Starting with a small set of data, our results show an increased performance with MCTS of 26% on the TREC-6 Questions dataset, and 10% on the Stanford *****Sentiment Treebank***** SST-2 dataset. | ||
| qa-srl | 26 | |
| N18-2089 A qualitative analysis demonstrates that the crowd-generated question-answer pairs cover the vast majority of predicate-argument relationships in existing datasets (including PropBank, NomBank, and *****QA-SRL*****) along with many previously under-resourced ones, including implicit arguments and relations. | ||
| 2021.eacl-main.222 We introduce a new dataset by converting the *****QA-SRL***** 2.0 dataset to a large-scale OIE dataset LSOIE. | ||
| 2021.emnlp-main.778 Our setting exploits *****QA-SRL*****, utilizing question-answer pairs to capture predicate-argument relations, facilitating laymen annotation of cross-text alignments. | ||
| 2020.acl-main.626 Question-answer driven Semantic Role Labeling (*****QA-SRL*****) was proposed as an attractive open and natural flavour of SRL, potentially attainable from laymen. | ||
| K19-1042 In this work, we examine LTAL for learning semantic representations, such as *****QA-SRL*****. | ||
| enhanced dependency | 26 | |
| W18-6012 We evaluate two cross-lingual techniques for adding *****enhanced dependencies***** to existing treebanks in Universal Dependencies. | ||
| 2021.iwpt-1.24 We describe the NUIG solution for IWPT 2021 Shared Task of *****Enhanced Dependency***** (ED) parsing in multiple languages. | ||
| 2020.iwpt-1.19 This paper presents our *****enhanced dependency***** parsing approach using transformer encoders, coupled with a simple yet powerful ensemble algorithm that takes advantage of both tree and graph dependency parsing. | ||
| 2021.iwpt-1.22 We carry out additional post-deadline experiments which include using Trankit for pre-processing, XLM-RoBERTa LARGE, treebank concatenation, and multitask learning between a basic and an *****enhanced dependency***** parser. | ||
| 2020.iwpt-1.18 This paper describes our system to predict *****enhanced dependencies***** for Universal Dependencies (UD) treebanks, which ranked 2nd in the Shared Task on Enhanced Dependency Parsing with an average ELAS of 82.60%. | ||
| generic | 26 | |
| 2021.acl-long.57 In this paper, we propose the Inverse Adversarial Training (IAT) algorithm for training neural dialogue systems to avoid *****generic***** responses and model dialogue history better. | ||
| 2021.mtsummit-asltrw.1 We address the problem of language model customization in applications where the ASR component needs to manage domain-specific terminology; although current state-of-the-art speech recognition technology provides excellent results for *****generic***** domains, the adaptation to specialized dictionaries or glossaries is still an open issue. | ||
| N18-4011 Most of the health documents, including patient education materials and discharge notes, are usually flooded with medical jargon and contain a lot of *****generic***** information about the health issue. | ||
| 2020.lrec-1.34 We introduce in this paper a *****generic***** approach to combine implicit crowdsourcing and language learning in order to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved. | ||
| 2021.acl-long.475 The availability of large-scale datasets has driven the development of neural models that create *****generic***** summaries from single or multiple documents. | ||
| data-to-text | 26 | |
| W18-6543 Neural approaches to *****data-to-text***** generation generally handle rare input items using either delexicalisation or a copy mechanism. | ||
| D19-1052 Traditionally, most *****data-to-text***** applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. | ||
| D19-1299 Recent neural models for *****data-to-text***** generation rely on massive parallel pairs of data and text to learn the writing knowledge. | ||
| C18-1082 We present an evaluation of PASS, a *****data-to-text***** system that generates Dutch soccer reports from match statistics which are automatically tailored towards fans of one club or the other. | ||
| 2021.eacl-main.61 Recent advancements in *****data-to-text***** generation largely take on the form of neural end-to-end systems. | ||
| fake | 26 | |
| 2020.findings-emnlp.216 With the epidemic of COVID-19, verifying the scientifically false online information, such as *****fake***** news and maliciously fabricated statements, has become crucial. | ||
| 2021.eacl-main.287 Nowadays, *****fake***** news is spreading in various ways, and this fake information is causing a lot of social damage. | ||
| 2021.naacl-main.462 Translated texts have been used for malicious purposes, i.e., plagiarism or *****fake***** reviews. | ||
| D18-1393 Amidst growing concern over media manipulation, NLP attention has focused on overt strategies like censorship and *****fake***** news. | ||
| 2021.nlp4if-1.1 False information spread via the internet and social media influences public opinion and user activity, while generative models enable *****fake***** content to be generated faster and more cheaply than had previously been possible. | ||
| Neural machine translation (NMT | 26 | |
| 2020.acl-main.34 *****Neural machine translation (NMT*****) encodes the source sentence in a universal way to generate the target sentence word-by-word. | ||
| 2020.emnlp-main.212 *****Neural machine translation (NMT*****) has achieved great success due to the ability to generate high-quality sentences. | ||
| W18-2712 *****Neural machine translation (NMT*****) has significantly improved the quality of automatic translation models. | ||
| 2020.lrec-1.454 *****Neural machine translation (NMT*****) needs large parallel corpora for state-of-the-art translation quality. | ||
| 2020.loresmt-1.5 *****Neural machine translation (NMT*****) is a widely accepted approach in the machine translation (MT) community, translating from one natural language to another natural language. | ||
| kappa | 25 | |
| D19-6216 Our baselines show promising results with content point accuracy and ***** kappa ***** values at 0.86 and 0.71 on the test set. | ||
| 2020.lrec-1.157 Overall, the BERT model achieved the best root mean squared error and quadratic weighted ***** kappa ***** scores. | ||
| C18-1254 Overall, across child and adult samples, including verbs and prepositions, the ***** kappa ***** score for sense is 72.6; for the number of semantic-role-bearing arguments, the ***** kappa ***** score is 77.4; for identical semantic role labels on a given argument, the ***** kappa ***** score is 91.1; and for the span of semantic role labels, the ***** kappa ***** for agreement is 93.9. | ||
| L06-1053 The reliability of tagging is evaluated by comparing the tagging among some annotators using ***** kappa ***** value. | ||
| D18-1090 In order to address this issue, we propose a reinforcement learning framework for essay scoring that incorporates quadratic weighted ***** kappa ***** as guidance to optimize the scoring system. | ||
| Track | 25 | |
| S18-1011 Though 40 participants registered for the task, only one team submitted output, achieving 0.55 F1 in ***** Track ***** 1 (parsing) and 0.70 F1 in ***** Track ***** 2 (intervals). | ||
| W18-6527 We describe in detail how the datasets were derived from the Universal Dependencies V2.0, and report on an evaluation of the Deep ***** Track ***** input quality. | ||
| 2021.wassa-1.30 In the Shared Task leaderboard, we secured the fourth rank in ***** Track ***** 1 and the second rank in ***** Track ***** 2. | ||
| S17-2014 We describe our system (DT Team) submitted at SemEval-2017 Task 1, Semantic Textual Similarity (STS) challenge for English (***** Track ***** 5). | ||
| W19-4418 We introduce the AIP-Tohoku grammatical error correction (GEC) system for the BEA-2019 shared task in ***** Track ***** 1 (Restricted ***** Track *****) and ***** Track ***** 2 (Unrestricted ***** Track *****) using the same system architecture. | ||
| propositions | 25 | |
| L08-1352 In the experiments we look for the target word in ***** propositions ***** made from the associated words thanks to 5 different resources. | ||
| 2020.lrec-1.226 Second, the system evaluation across time, with ***** propositions ***** of how a lifelong learning intelligent system should be evaluated when including human assisted learning or not. | ||
| D19-1653 However, existing research lacks empirical investigations on highly semantic aspects of elementary units (EUs), such as ***** propositions ***** for a persuasive online argument. | ||
| L14-1138 In this paper, we describe Refractive, an open-source tool to extract ***** propositions ***** from a parsed corpus based on the Hadoop variant of MapReduce. | ||
| W19-0602 Consequently, images can be judged as evidence for ***** propositions *****. | ||
| classifications | 25 | |
| 2020.acl-main.380 With the recent proliferation of the use of text ***** classifications *****, researchers have found that there are certain unintended biases in text classification datasets. | ||
| 2020.acl-main.33 Comprehensive experiments show that our model outperforms the state-of-the-art pre-trained model on both single- and multi-label ***** classifications *****, sentence and document ***** classifications *****, and ***** classifications ***** in three different languages. | ||
| 2005.mtsummit-papers.21 The traditional grammars and some of the research works have discussed the topic to some extent, particularly from the point of view of their descriptions and ***** classifications *****. | ||
| 2021.eacl-tutorials.2 There is also a growing body of recent work arguing that following the convention and training with adjudicated labels ignores any uncertainty the labellers had in their ***** classifications *****, which results in models with poorer generalisation capabilities. | ||
| 2021.ranlp-1.63 Concept normalization of clinical texts to standard medical ***** classifications ***** and ontologies is a task with high importance for healthcare and medical research | ||
| topology | 25 | |
| K19-1003 To perform element-wise cross-task embedding projection, we invent locally linear mapping which assumes and preserves the local ***** topology ***** across the semantic spaces before and after the projection. | ||
| 2021.emnlp-main.391 The ***** topology ***** of each graph models similarity relations among words, and is estimated jointly with the graph embedding. | ||
| 2021.acl-long.198 By virtue of the line graph, messages propagate more efficiently through not only connections between nodes, but also the ***** topology ***** of directed edges. | ||
| 2021.acl-short.46 However, modeling cardinality based on aggregating a set of transformations with the same ***** topology ***** has been proven more effective than going deeper or wider when increasing capacity. | ||
| C16-1274 Since existing attentive models exert attention on the sequential structure, we propose a way to incorporate attention into the tree ***** topology ***** | ||
| transcribing | 25 | |
| 2016.iwslt-1.17 This evaluation campaign focuses on ***** transcribing ***** spontaneous speech from Skype recordings. | ||
| 2021.mtsummit-at4ssl.9 We present a number of methodological recommendations concerning the online evaluation of avatars for text-to-sign translation, focusing on the structure, format and length of the questionnaire, as well as methods for eliciting and faithfully ***** transcribing ***** responses | ||
| L16-1314 This is achieved by limiting human effort to ***** transcribing ***** parts for which automatic transcription quality is insufficient. | ||
| 2021.naacl-main.149 We propose a Transformer-based sequence-to-sequence model for automatic speech recognition (ASR) capable of simultaneously ***** transcribing ***** and annotating audio with linguistic information such as phonemic transcripts or part-of-speech (POS) tags. | ||
| L08-1124 The transcription factor for ***** transcribing ***** the broadcast news data has been reduced using a method such as Quick Rich Transcription (QRTR) as well as reducing the number of quality controls performed on the data | ||
| contextualised | 25 | |
| 2020.pam-1.17 Specifically, we aim to i) investigate a recent model of polyseme sense clustering proposed by Ortega-Andres & Vicente (2019) through analysing empirical evidence of word sense grouping in human similarity judgements, ii) extend the evaluation of context-sensitive word embedding systems by examining whether they encode differences in word sense similarity and iii) compare the word sense similarities of both methods to assess their correlation and gain some intuition as to how well ***** contextualised ***** word embeddings could be used as surrogate word sense similarity judgements in linguistic experiments. | ||
| R19-1115 This paper evaluates the impact of several ***** contextualised ***** word embeddings on unsupervised STS methods and compares it with the existing supervised/unsupervised STS methods for different datasets in different languages and different domains | ||
| 2021.eacl-main.310 Results obtained using four types of probing measures with models like ELMo, BERT and some of its variants, indicate that idiomaticity is not yet accurately represented by ***** contextualised ***** models | ||
| 2021.nodalida-main.4 We present the ongoing NorLM initiative to support the creation and use of very large ***** contextualised ***** language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training. | ||
| 2020.wnut-1.35 In the first phase, we experiment with various ***** contextualised ***** word embeddings (like Flair, BERT-based) and a BiLSTM-CRF model to arrive at the best-performing architecture | ||
| trigram | 25 | |
| L08-1323 This paper presents a context sensitive spell checking system that uses mixed ***** trigram ***** models, and introduces a new empirically grounded method for building confusion sets. | ||
| P17-1184 Our results extend previous findings that character representations are effective across typologies, and we find that a previously unstudied combination of character ***** trigram ***** representations composed with bi-LSTMs outperforms most others. | ||
| 2021.blackboxnlp-1.38 Although the best performing BERT model scores 94%, and the ***** trigram ***** scores 75% classification accuracy under the traditional metric, performance drops precipitously to 38% for BERT and 30% for the ***** trigram ***** model under the ADC. | ||
| W19-3607 We developed Amharic language models of bigram and ***** trigram ***** for the training purpose. | ||
| L16-1678 Model parameters we experiment with affect the vectorial word representations used by the model; we apply different word vector initializations, defined by Word2vec and GloVe embeddings and enrich the representation of words by vectors assigned ***** trigram ***** features | ||
| clickbait | 25 | |
| 2020.coling-main.425 In this work, we model ***** clickbait ***** strength prediction as a regression problem. | ||
| 2020.acl-main.456 Through both automatic and human evaluation, we demonstrate that TitleStylist can generate relevant, fluent headlines with three target styles: humor, romance, and ***** clickbait *****. | ||
| 2020.coling-main.6 This task differs from related tasks such as summarization and ***** clickbait ***** identification by several aspects. | ||
| R17-1045 Come read the shocking truth about fake news and ***** clickbait ***** in the Bulgarian cyberspace. | ||
| 2020.ccl-1.106 Extensive experiments on two benchmark datasets show that our approach can effectively improve the performance of ***** clickbait ***** detection and consistently outperform many baseline methods. | ||
| transitive | 25 | |
| 2014.lilt-9.11 As I will show, the table has the underlying form of a syllogistic fragment and relies on a sort of generalized ***** transitive ***** reasoning. | ||
| K19-1062 The neural network automatically learns representations that account for long-term contexts to provide robust features for the structured model, while the SSVM incorporates domain knowledge such as ***** transitive ***** closure of temporal relations as constraints to make better globally consistent decisions. | ||
| 2021.eacl-main.215 To understand if and how morphosyntactic alignment affects contextual embedding spaces, we train classifiers to recover the subjecthood of mBERT embeddings in ***** transitive ***** sentences (which do not contain overt information about morphosyntactic alignment) and then evaluate them zero-shot on in***** transitive ***** sentences (where subjecthood classification depends on alignment), within and across languages. | ||
| 2020.alta-1.5 Nen verbal morphology is particularly complex; a ***** transitive ***** verb can take up to 1,740 unique forms. | ||
| 2014.lilt-9.8 The relational syllogistic is an extension of the language of Classical syllogisms in which predicates are allowed to feature ***** transitive ***** verbs with quantified objects. | ||
| emergent | 25 | |
| 2020.acl-main.407 Equipped with new ways to measure compositionality in ***** emergent ***** languages inspired by disentanglement in representation learning, we establish three main results: First, given sufficiently large input spaces, the ***** emergent ***** language will naturally develop the ability to refer to novel composite concepts. | ||
| 2020.emnlp-main.270 We argue that UGI techniques should be part of the standard toolkit for analysing ***** emergent ***** languages and release a comprehensive library to facilitate such analysis for future researchers. | ||
| W19-4811 We aggregate the behavior of these units into language-level metrics which quantify the challenges that taggers face on languages with different morphological properties, and identify links between synthesis and affixation preference and ***** emergent ***** behavior of the hidden tagger layer. | ||
| 2021.acl-srw.6 Our results suggest that noise on a speaker is one of the factors for ZLA or at least causes ***** emergent ***** languages to approach ZLA, while noise on a listener and a channel is not. | ||
| 2021.emnlp-demo.30 Our open-domain question-answering system can further act as a model for the quick development of similar systems that can be adapted and modified for other developing ***** emergent ***** domains | ||
| CBOW | 25 | |
| 2020.coling-main.608 We tackle this inefficiency by introducing the Attention Word Embedding (AWE) model, which integrates the attention mechanism into the ***** CBOW ***** model. | ||
| E17-1006 We use ***** CBOW ***** word embeddings to represent word meaning and learn a compositionality function that combines the individual constituents into a phrase representation, thus capturing the compositional attribute meaning. | ||
| C16-1073 In this paper, we present an extension to the ***** CBOW ***** model which not only improves the quality of embeddings but also makes embeddings suitable for polysemy. | ||
| 2020.lrec-1.165 I am making the corpus and the trained ***** CBOW ***** word embeddings freely available for research purposes. | ||
| S19-1003 Indeed, standard models such as ***** CBOW ***** and fasttext are specific choices along each of these axes | ||
| induced | 25 | |
| 2020.coling-main.517 Our study empirically analyses the effectiveness of the ***** induced ***** emotion lexicons by measuring translation precision and correlations with existing emotion lexicons, along with measurements on a downstream task of sentence emotion prediction. | ||
| P17-1022 While these policies are effective for many tasks, interpretation of their ***** induced ***** communication strategies has remained a challenge. | ||
| L12-1400 We then show that the use of the third language of the corpus ― Italian ― as a pivot language can improve the precision of the ***** induced ***** lexicon, without loss in terms of quality of the extracted pairs. | ||
| L16-1437 An important limitation of existing evaluation systems is that they are unable to distinguish candidate-reference differences that arise due to acceptable linguistic variation from the differences ***** induced ***** by MT errors. | ||
| L14-1345 An evaluation of the ***** induced ***** word sense clusters in a word sense disambiguation task showed that they were no better than random clusters of equivalent granularity | ||
| E2E | 25 | |
| 2020.lrec-1.556 We also explore the capability of an ***** E2E ***** system to do structured NER. | ||
| P19-1256 Experiments on the ***** E2E ***** challenge dataset show that our proposed framework can reduce more than 50% relative unaligned noise from the original data-text pairs. | ||
| 2021.acl-long.200 Our results on the MuST-C benchmark with Transformer demonstrate the effectiveness of context to ***** E2E ***** ST. | ||
| 2021.emnlp-main.345 In this work, we extend the boundaries of ***** E2E ***** learning for KGQA to include the training of an ER component. | ||
| 2020.emnlp-main.90 Existing datasets, such as WIKIBIO, WebNLG, and ***** E2E *****, basically have a good alignment between an input triple/pair set and its output text | ||
| COLING | 25 | |
| W16-3920 In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the ***** COLING ***** 2016 | ||
| L12-1383 The toolkit was previously demonstrated at ***** COLING ***** 2008, but has since seen substantial changes including: (1) incorporation of a new time expression tagger, (2)~embracement of stand-off annotation, (3) application to the medical domain and (4) introduction of narrative containers. | ||
| J74-3001 International Conference - ***** COLING ***** 76 (Dr. | ||
| 2020.vardial-1.1 This paper presents the results of the VarDial Evaluation Campaign 2020 organized as part of the seventh workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with ***** COLING ***** 2020. | ||
| R19-1089 With recent efforts in drawing attention to the task of replicating and/or reproducing results, for example in the context of ***** COLING ***** 2018 and various LREC workshops, the question arises how the NLP community views the topic of replicability in general. | ||
| typed | 25 | |
| P17-1195 A feature of the grammar is that it is extensively ***** typed ***** on the basis of a formal ontology for pre-university math. | ||
| L06-1078 Amongst its capabilities are the possibility to create lexicon structures, manipulate content and use of ***** typed ***** relations. | ||
| W19-4007 The nodes in this graph are synthesis operations and their ***** typed ***** arguments, and labeled edges specify relations between the nodes. | ||
| L10-1538 This is the reason why we propose a preliminary formal annotation model, represented with ***** typed ***** feature structures. | ||
| D19-1544 Semantic role labeling (SRL) involves extracting propositions (i.e. predicates and their ***** typed ***** arguments) from natural language sentences | ||
| stylometric | 25 | |
| 2020.lrec-1.123 We show that, for ***** stylometric ***** methods based on the most frequent words, we can do without translations. | ||
| 2020.figlang-1.36 The component models consist of an LSTM with hashtag and emoji representations; a CNN-LSTM with casing, stop word, punctuation, and sentiment representations; an MLP based on Infersent embeddings; and an SVM trained on ***** stylometric ***** and emotion-based features. | ||
| W19-2507 We develop a ***** stylometric ***** feature set for ancient Greek that enables identification of texts as prose or verse. | ||
| 2021.wassa-1.16 In this paper, we describe experiments designed to evaluate the impact of ***** stylometric ***** and emotion-based features on hate speech detection: the task of classifying textual content into hate or non-hate speech classes. | ||
| 2020.wnut-1.30 Instead, we directly embed documents in a ***** stylometric ***** space by relying on a reference set of authors and the intra-author consistency property which is one of two components in our definition of writing style | ||
| homogeneous | 25 | |
| 2021.acl-long.457 In SMedBERT, the mention-neighbour hybrid attention is proposed to learn heterogeneous-entity information, which infuses the semantic representations of entity types into the ***** homogeneous ***** neighbouring entity structure. | ||
| W16-3901 The solution is not obvious: we cannot control for all factors, and it is not clear how to best go beyond the current practice of training on ***** homogeneous ***** data from a single domain and language. | ||
| L08-1109 We consider collections of corpora that are ***** homogeneous ***** with respect to topic (i.e. about the same subject), or genre (written for the same audience or from the same source) and use a combination of stylistic and lexical features of the texts to automatically identify pieces of text in these collections that break the homogeneity. | ||
| 1997.iwpt-1.5 We show that controlled disjunctions can implement different kind of ambiguities in a consistent and ***** homogeneous ***** way. | ||
| L10-1283 After collecting these vectors we apply forms of the K-means algorithm on the resulting vector space to produce clusters of distinct senses, so that standard uses produce large ***** homogeneous ***** clusters while rare and novel uses appear in small or heterogeneous clusters | ||
| persuasive | 25 | |
| 2020.nlpcss-1.12 The higher ***** persuasive *****ness can be related to multiple aspects, including linguistic features of the comments, the user's motivation to participate, ***** persuasive ***** skills the user learns over time, and the user's identity and credibility established in the community through participation. | ||
| W19-4017 This paper investigates the use of explicitly signalled discourse relations in ***** persuasive ***** texts. | ||
| W17-5113 Our approach has been evaluated on two English corpora, the first of which contains 90 ***** persuasive ***** essays, while the second is a collection of 340 documents from user generated content. | ||
| 2021.argmining-1.15 We utilize multi-task learning to improve argument mining in ***** persuasive ***** online discussions, in which both micro-level and macro-level argumentation must be taken into consideration. | ||
| L08-1157 This paper presents resources and strategies for ***** persuasive ***** natural language processing | ||
| module | 25 | |
| 2020.coling-main.341 In each layer of DP-GCN, we employ a selection ***** module ***** to concentrate on nodes expressing the target relation by a set of binary gates, and then augment the pruned tree with a pruned semantic graph to ensure the connectivity. | ||
| D19-6401 Our results show that one is indeed able to simultaneously learn both internal ***** module ***** structure and ***** module ***** sequencing without extra supervisory signals for ***** module ***** execution sequencing. | ||
| P19-1614 In this work, we build-up on the language model based methods and augment them with a commonsense knowledge hunting (using automatic extraction from text) ***** module ***** and an explicit reasoning ***** module *****. | ||
| 2020.acl-main.539 Ablation experiments suggest that both the heterogeneous graph and the ***** module ***** network are important to obtain strong results. | ||
| 2021.emnlp-main.55 The ***** module *****s are shared across multiple programs, enabling compositionality as well as efficient learning of ***** module ***** parameters | ||
| morphological analyzers | 25 | |
| 2020.lrec-1.480 We use an existing state-of-the-art morphological disambiguation system to investigate the effects of different data sizes and different combinations of ***** morphological analyzers ***** for Modern Standard Arabic, Egyptian Arabic, and Gulf Arabic. | ||
| 2006.bcs-1.5 Xerox Arabic Finite State Morphology and Buckwalter Arabic Morphological Analyzer are two of the best known, well documented, ***** morphological analyzers ***** for Modern Standard Arabic (MSA). | ||
| L16-1410 This paper presents a semi-automatic method to derive ***** morphological analyzers ***** from a limited number of example inflections suitable for languages with alphabetic writing systems. | ||
| L16-1207 The resources include corpora for each dialect which have been morphologically annotated, and ***** morphological analyzers ***** for each dialect which are derived from these corpora | ||
| 2020.acl-main.732 Our joint models significantly outperform the baselines and are comparable to the state-of-the-art models that are more complex relying on ***** morphological analyzers ***** and/or a lot more data (e.g. | ||
| derivational morphology | 25 | |
| D17-1074 However, due to semantic, historical, and lexical considerations involved in ***** derivational morphology *****, future work will be needed to achieve performance parity with inflection-generating systems. | ||
| 2001.jeptalnrecital-poster.12 In the original Lesk's algorithm, the comparison is trivial: two words are either the same lexeme or not; our modification consists in fuzzy (weighted) comparison using a large synonym dictionary and a simple ***** derivational morphology ***** system. | ||
| L14-1068 Knowledge about ***** derivational morphology ***** has been proven useful for a number of natural language processing (NLP) tasks. | ||
| L06-1475 Arabic has a rich morphological system combining templatic and affixational paradigms for both inflectional and ***** derivational morphology *****. | ||
| L10-1041 This paper deals with the ***** derivational morphology ***** of automatic word form recognition | ||
| latent semantic | 25 | |
| 2006.jeptalnrecital-recitalposter.7 It is investigated whether this technique is also able to cluster nouns according to ***** latent semantic ***** dimensions in a reduced adjective space. | ||
| D17-1290 We measure the sensitivity of two ***** latent semantic ***** methods to the presence of different levels of document repetition. | ||
| K19-1027 Besides, a variational inference network (VIN) is proposed to constrain the corresponding sentences in two languages have the same or similar ***** latent semantic ***** code. | ||
| L12-1470 In addition, semantic word relatedness modeled by ***** latent semantic ***** analysis is also included. | ||
| L10-1434 This paper presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-order co-occurrence of predicates and complement properties; (ii) an EM-based clustering approach that models the strengths of predicate–noun relationships by ***** latent semantic ***** clusters (Rooth et al., 1999); and (iii) an extension of the ***** latent semantic ***** clusters by incorporating the MDL principle into the EM training, thus explicitly modelling the predicate–noun selectional preferences by WordNet classes (Schulte im Walde et al., 2008) | ||
| subword regularization | 25 | |
| 2020.acl-main.170 Using BPE-dropout during training and the standard BPE during inference improves translation quality up to 2.3 BLEU compared to BPE and up to 0.9 BLEU compared to the previous ***** subword regularization *****. | ||
| 2021.naacl-main.40 First, we demonstrate empirically that applying existing ***** subword regularization ***** methods (Kudo, 2018; Provilkov et al., 2020) during fine-tuning of pre-trained multilingual representations improves the effectiveness of cross-lingual transfer. | ||
| 2020.acl-main.755 Results show that our proposed metrics reveal a clear trend of improved robustness to perturbations when ***** subword regularization ***** methods are used. | ||
| P18-1007 We present a simple regularization method, ***** subword regularization *****, which trains the model with multiple subword segmentations probabilistically sampled during training | ||
| 2021.americasnlp-1.25 Our best neural machine translation systems used multilingual pretraining, ensembling, finetuning, training on parts of the development data, and ***** subword regularization *****. | ||
| hope speech | 25 | |
| 2021.ltedi-1.22 This paper proposes a bidirectional long short-term memory (BiLSTM) with the attention-based approach, in solving the ***** hope speech ***** detection problem. | ||
| 2021.ltedi-1.9 In the second phase, we build a classifier to detect ***** hope speech *****, non ***** hope speech *****, or not lang labels. | ||
| 2021.ltedi-1.20 The goal of this task is to predict the presence of ***** hope speech *****, along with the presence of samples that do not belong to the same language in the dataset. | ||
| 2021.ltedi-1.11 Any tools and methods developed for detection, analysis, and generation of ***** hope speech ***** will be beneficial. | ||
| 2021.ltedi-1.8 This paper reports on the shared task of ***** hope speech ***** detection for Tamil, English, and Malayalam languages. | ||
| social sciences | 25 | |
| 2020.acl-srw.12 To achieve this, I plan to research methods using natural language processing and deep learning while employing models and using analysis concepts from the ***** social sciences *****, where researchers have studied media bias for decades. | ||
| 2021.latechclfl-1.18 Despite the increasing popularity of NLP in the humanities and ***** social sciences *****, advances in model performance and complexity have been accompanied by concerns about interpretability and explanatory power for sociocultural analysis. | ||
| 2021.emnlp-main.788 Understanding differences of viewpoints across corpora is a fundamental task for computational ***** social sciences *****. | ||
| 2020.nlpcss-1.2 Qualitative content analysis is a systematic method commonly used in the ***** social sciences ***** to analyze textual data from interviews or online discussions. | ||
| W18-4506 Qualitative and quantitative evaluations of this technique are shown as applied to two cultural datasets of interest to researchers across the humanities and ***** social sciences *****. | ||
| language engineering | 25 | |
| L12-1406 The annotation tool is implemented as a component of the Ellogon ***** language engineering ***** platform, exploiting its extensive annotation engine, its cross-platform abilities and its linguistic processing components, if such a need arises. | ||
| 1999.mtsummit-1.51 The corpus is aimed as a widely-distributable dataset for ***** language engineering ***** and for translation and terminology studies. | ||
| L12-1634 The premise was not only to compare the results of two quite different methods for our own interest, but also to enable other researchers to choose whichever reclassification better suited their purpose (one being grounded purely in theoretical linguistics and the other in practical ***** language engineering *****). | ||
| L14-1526 The annotation tool is implemented as a component of the Ellogon ***** language engineering ***** platform, exploiting its extensive annotation engine, its cross-platform abilities and its linguistic processing components, if such a need arises. | ||
| L12-1046 We lay ground for a ***** language engineering ***** process by gathering and defining a set of textual characteristics we consider valuable with respect to building natural language processing systems. | ||
| interpretation | 25 | |
| 2020.lrec-1.74 We highlight how thinking aloud affects ***** interpretation ***** of dialogue acts in our setting and how to best capture that information. | ||
| 2021.semeval-1.121 We used a stacked generalisation ensemble of five component models, with two distinct ***** interpretation *****s of the task. | ||
| L14-1460 In addition, improvements in the search possibilities and the display of the results have been implemented, which are especially relevant in the ***** interpretation ***** of the results of complex multi-tier searches. | ||
| P17-1155 We show that while the scores of n-gram based automatic measures are similar for all ***** interpretation ***** models, SIGN's ***** interpretation *****s are scored higher by humans for adequacy and sentiment polarity. | ||
| P17-1115 Accurate identification and ***** interpretation ***** of metonymy can be directly beneficial to various NLP applications, such as Named Entity Recognition and Geographical Parsing. | ||
| paraphrase identification | 25 | |
| Q16-1019 How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS), ***** paraphrase identification ***** (PI) and textual entailment (TE). | ||
| D18-1479 This task enables many potential applications such as question answering and ***** paraphrase identification *****. | ||
| L14-1002 We analyze in this paper a number of data sets proposed over the last decade or so for the task of ***** paraphrase identification *****. | ||
| D18-1210 In our benchmarks on four different tasks, including ontology classification, sentiment analysis, answer sentence selection, and ***** paraphrase identification *****, our proposed model, a modified CNN with context-sensitive filters, consistently outperforms the standard CNN and attention-based CNN baselines. | ||
| D19-1382 For example, PAWS (Paraphrase Adversaries from Word Scrambling) consists of challenging English ***** paraphrase identification ***** pairs from Wikipedia and Quora. | ||
| identifying and categorizing | 25 | |
| S19-2110 OffensEval addresses the problem of ***** identifying and categorizing ***** offensive language in social media in three subtasks; whether or not a content is offensive (subtask A), whether it is targeted (subtask B) towards an individual, a group, or other entities (subtask C). | ||
| N19-1322 As the performance of data-driven approaches for G2P conversion depend largely on pronunciation lexicon on which the system is trained, in this paper, we investigate on developing an improved training lexicon by ***** identifying and categorizing ***** the critical cases in Bangla language and include those critical cases in training lexicon for developing a robust G2P conversion system in Bangla language. | ||
| S19-2011 In the shared task of ***** identifying and categorizing ***** offensive language in social media, we preprocess the dataset according to the language behaviors on social media, and then adapt and fine-tune the Bidirectional Encoder Representation from Transformer (BERT) pre-trained by Google AI Language team. | ||
| 2020.emnlp-main.626 In this paper, we propose the novel modeling approach MedFilter, which addresses these insights in order to increase performance at ***** identifying and categorizing ***** task-relevant utterances, and in so doing, positively impacts performance at a downstream information extraction task. | ||
| I17-1036 This paper tackles the task of event detection, which involves ***** identifying and categorizing ***** events. | ||
| agent | 25 | |
| 1998.amta-papers.15 Such problems include ambiguous attachment of participles, ambiguous scope in coordination, and ambiguous attachment of the ***** agent ***** phrase for double passives. | ||
| 2020.coling-main.96 We introduce Situated Interactive MultiModal Conversations (SIMMC) as a new direction aimed at training ***** agent *****s that take multimodal actions grounded in a co-evolving multimodal input context in addition to the dialog history. | ||
| 2020.sigdial-1.41 The ***** agent ***** has a temporary belief state for the dialogue, and a persistent knowledge store represented as an extensive-form game tree. | ||
| 2021.dialdoc-1.10 We simulate the dialogue between an ***** agent ***** and a user (modelled similar to an ***** agent ***** with supervised learning objective) to interact with each other. | ||
| I17-3015 The proposed framework provides a pioneering example of on-demand knowledge validation in dialog environment to address such needs in AI ***** agent *****s/chatbots. | ||
| multilingual translation | 25 | |
| L08-1579 Recently the LATL has undertaken the development of a ***** multilingual translation ***** system based on a symbolic parsing technology and on a transfer-based translation model. | ||
| 2001.mtsummit-papers.29 This system has the potential to become a framework for ***** multilingual translation ***** systems. | ||
| 2021.iwslt-1.19 We aim at improving ***** multilingual translation ***** and zero-shot performance in the constrained setting (without using any extra training data) through methods that encourage transfer learning and larger capacity modeling with advanced neural components. | ||
| 2020.coling-main.387 We present a method for completing ***** multilingual translation ***** dictionaries. | ||
| 2003.mtsummit-papers.27 This paper describes a framework for ***** multilingual translation ***** using existing translation engines. | ||
| written language | 25 | |
| W18-3910 However, linguistic research suggests that spoken language often differs from ***** written language *****. | ||
| 2020.signlang-1.35 By utilizing such data for the task of keyword search, this work aims to enable information retrieval from sign language with the queries from the translated ***** written language *****. | ||
| 2021.acl-long.247 Secondly, we present a comprehensive taxonomy of labels for annotating misogyny in natural ***** written language *****, and finally, we introduce a high-quality dataset of annotated posts sampled from social media posts. | ||
| L08-1406 ASV Toolbox is a modular collection of tools for the exploration of ***** written language ***** data, both for scientific and educational purposes. | ||
| 2020.bea-1.9 In this paper we present an NLP-based approach for tracking the evolution of ***** written language ***** competence in L2 Spanish learners using a wide range of linguistic features automatically extracted from students' written productions. | ||
| masked language | 25 | |
| 2020.emnlp-main.497 In this paper, we show that careful masking strategies can bridge the knowledge gap of ***** masked language ***** models (MLMs) about the domains more effectively by allocating self-supervision where it is needed. | ||
| 2020.blackboxnlp-1.13 We explore the imprint of two specific linguistic alternations, namely passivization and negation, on the representations generated by neural models trained with two different objectives: ***** masked language ***** modeling and translation. | ||
| 2020.emnlp-main.498 We present BAE, a black box attack for generating adversarial examples using contextual perturbations from a BERT ***** masked language ***** model. | ||
| 2021.semeval-1.16 In our experiments, we used a neural system based on the XLM-R, a pre-trained transformer-based ***** masked language ***** model, as a baseline. | ||
| 2020.emnlp-main.699 To tackle cases when no parallel source–target pairs are available, we train ***** masked language ***** models (MLMs) for both the source and the target domain. | ||
| application | 25 | |
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct ***** application *****s: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing behaviors easily missed by human experts. | ||
| P17-1028 We evaluate a suite of methods across two different ***** application *****s and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. | ||
| 2021.emnlp-main.643 Here, we introduce the ***** application ***** of balancing loss functions for multi-label text classification. | ||
| W17-3106 We then illustrate the future possibility of this work with an example of an exposure scenario authored with our ***** application *****. | ||
| L08-1311 First, a general description of the ***** application ***** demands emerging from the eParticipation and eGovernment sectors is offered. | ||
| representational similarity analysis | 25 | |
| W19-4820 ReStA is a variant of the popular ***** representational similarity analysis ***** (RSA) in cognitive neuroscience. | ||
| 2020.acl-main.381 We use two commonly applied analytical techniques, diagnostic classifiers and ***** representational similarity analysis *****, to quantify to what extent neural activation patterns encode phonemes and phoneme sequences. | ||
| 2020.coling-main.151 We present a new approach for detecting human-like social biases in word embeddings using ***** representational similarity analysis *****. | ||
| 2021.blackboxnlp-1.32 To answer these questions, we present a novel experimental design based on ***** representational similarity analysis ***** (RSA) to analyze acoustic word embeddings (AWEs)—vector representations of variable-duration spoken-word segments. | ||
| 2021.insights-1.9 Through *****Representational Similarity Analysis*****, we conclude that more data for fine-tuning yields greater change of the model's representations and thus reduces the influence of initialization. | ||
| academic | 25 | |
| R19-1021 The ability to produce high-quality publishable material is critical to ***** academic ***** success but many Post-Graduate students struggle to learn to do so. | ||
| L16-1072 We have seen that many resources exist which are useful for MT and similar work, but the majority are for (***** academic *****) research or educational use only, and as such not available for commercial use. | ||
| 2020.latechclfl-1.20 However, the TL-Explore is easily extended to other works of literature and is not limited to type of texts, such as ***** academic ***** manuscripts or constitutional documents to name a few. | ||
| L12-1330 This is particularly true in ***** academic ***** research where it has suddenly become possible to collect (high-quality) annotations rapidly without the need of an expert. | ||
| W19-4731 In this study, we propose to focus on understanding the evolution of a specific scientific concept, that of Circular Economy (CE), by analysing how the language used in ***** academic ***** discussions has changed semantically. | ||
| information bottleneck | 25 | |
| 2020.emnlp-main.153 In this paper, we show that it is possible to better manage the trade-off between concise explanations and high task accuracy by optimizing a bound on the *****Information Bottleneck***** (IB) objective. | ||
| 2021.blackboxnlp-1.39 In this paper, we address this gap by leveraging the recently developed *****information bottlenecks***** for attribution (IBA) framework. | ||
| 2021.newsum-1.10 We use the *****Information Bottleneck***** principle to jointly train the extraction and abstraction in an end-to-end fashion. | ||
| 2021.acl-long.112 Our model combines a careful choice of training objective with a principled *****information bottleneck*****, to induce a latent encoding space that disentangles meaning and form. | ||
| D19-1389 The principle of the *****Information Bottleneck***** (Tishby et al., 1999) produces a summary of information X optimized to predict some other relevant information Y. | ||
| representational similarity | 25 | |
| 2021.insights-1.9 Through *****Representational Similarity***** Analysis, we conclude that more data for fine-tuning yields greater change of the model's representations and thus reduces the influence of initialization. | ||
| P19-1283 Here we present two methods based on *****Representational Similarity***** Analysis (RSA) and Tree Kernels (TK) which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by symbolic structures such as syntax trees. | ||
| W19-4820 ReStA is a variant of the popular *****representational similarity***** analysis (RSA) in cognitive neuroscience. | ||
| 2020.coling-main.325 We introduce an approach to address this question using *****Representational Similarity***** Analysis (RSA). | ||
| W19-4330 The representations are produced using a dual-encoder based model trained to maximize the *****representational similarity***** between sentence pairs drawn from parallel data. | ||
| Transformer-based | 25 | |
| 2021.acl-long.152 This paper studies the relative importance of attention heads in ***** Transformer-based ***** models to aid their interpretability in cross-lingual and multi-lingual tasks. | ||
| 2020.acl-main.216 This paper presents an audio-visual automatic speech recognition (AV-ASR) system using a ***** Transformer-based ***** architecture. | ||
| 2021.emnlp-main.127 ***** Transformer-based ***** models have gained increasing popularity, achieving state-of-the-art performance in many research fields including speech translation. | ||
| 2020.findings-emnlp.49 ***** Transformer-based ***** models have brought a radical change to neural machine translation. | ||
| 2021.eacl-main.157 In this work, we consider the problem of uncertainty estimation for ***** Transformer-based ***** models. | ||
| Aiming | 24 | |
| L14-1434 ***** Aiming ***** to solve that problem, this work investigates how some semantic relations can be automatically extracted from Portuguese texts. | ||
| 2020.wmt-1.100 ***** Aiming ***** at finding better semantic representation for semantic MT evaluation, we first test YiSi-2 with contextual embed- dings extracted from different layers of two different pretrained models, multilingual BERT and XLM-RoBERTa. | ||
| 2021.acl-long.409 ***** Aiming ***** to further close this gap, we propose a model of semantic memory for WSD in a meta-learning setting. | ||
| 2020.findings-emnlp.393 ***** Aiming ***** at these goals, in this paper, we present a new distractor generation scheme with multi-tasking and negative answer training strategies for effectively generating multiple distractors. | ||
| C16-1136 ***** Aiming ***** to resolve high complexities of event descriptions, previous work (Huang and Riloff, 2013) proposes multi-faceted event recognition and a bootstrapping method to automatically acquire both event facet phrases and event expressions from unannotated texts | ||
| semantic annotations | 24 | |
| W17-3529 The first uses existing ***** semantic annotations ***** to generate new questions. | ||
| 2021.emnlp-main.399 Furthermore, public datasets have not considered these complications and the general ***** semantic annotations ***** are lacking which may result in zero-shot problem. | ||
| 2020.isa-1.10 As most information retrieval and extraction tasks are resource intensive, very little work has been done on Kannada NLP, with almost no efforts in discourse analysis and dataset creation for representing events or other ***** semantic annotations ***** in the text. | ||
| L08-1233 Special emphasis is laid on ***** semantic annotations ***** in terms of a large amount of biomedical named entities (almost 100 entity types), semantic relations, as well as discourse phenomena, reference relations in particular. | ||
| L08-1593 We end by explaining how any user who wants to do serious studies using the corpus can collaborate in enhancing the corpus and making their ***** semantic annotations ***** widely available as well | ||
| interoperability | 24 | |
| 2020.iwltp-1.15 We devise five different levels (of increasing complexity) of platform ***** interoperability ***** that we suggest to implement in a wider federation of AI/LT platforms. | ||
| D19-3015 We provide a web-based interface for annotation visualization and document ranking, with a modular backend to support ***** interoperability ***** with existing annotation tools. | ||
| 2020.lrec-1.120 The project focuses on the ***** interoperability ***** and accessibility of data, with particular respect to reusability in the sense of the FAIR Data Principles. | ||
| W16-5210 While an ***** interoperability ***** framework is useful in certain cases, some types of users will not select the framework due to the learning cost and design restrictions. | ||
| L12-1548 The paper discusses advantages of this approach, in particular with respect to ***** interoperability ***** and queriability, which are illustrated for the MASC corpus, an open multi-layer corpus of American English (Ide et al., 2008) | ||
| analyze | 24 | |
| 2020.semeval-1.186 In this paper, we present the task, ***** analyze ***** the results, and discuss the system submissions and the methods they used. | ||
| W16-4603 Specifically, we translate Japanese recipes into English, ***** analyze ***** errors in the translated recipes, and discuss available room for improvements. | ||
| 2021.acl-demo.16 We present Dodrio, an open-source interactive visualization tool to help NLP researchers and practitioners ***** analyze ***** attention mechanisms in transformer-based models with linguistic knowledge. | ||
| L12-1137 This can be significantly improved by applying innovative audio and video processing algorithms, which ***** analyze ***** the recordings and provide automated annotations. | ||
| P18-1249 The use of attention makes explicit the manner in which information is propagated between different locations in the sentence, which we use to both ***** analyze ***** our model and propose potential improvements | ||
| judgements | 24 | |
| 2021.acl-long.521 Compared with seven existing metrics in three common NLG tasks, MARS not only achieves higher correlation with human reference ***** judgements *****, but also differentiates well-formed candidates from adversarial samples to a larger degree. | ||
| L12-1322 We outline a more rigorous approach to collecting human annotations, using as our example a study designed to capture ***** judgements ***** on the meaning of hedge words in medical records. | ||
| L12-1084 To this end, the NKI-CCRT corpus with individual listener ***** judgements ***** on the intelligibility of recordings of 55 speakers treated for cancer of the head and neck will be made available for restricted scientific use. | ||
| W19-0606 Compared with existing resources underpinned by word-level embeddings alone, and frame embeddings built upon pre-trained vectors, our proposed frame embeddings obtained better performance against ***** judgements ***** of an RE expert. | ||
| P17-2065 Applied to images of Shakespeare's First Folio, our model predicts attributions that agree with the manual ***** judgements ***** of bibliographers with an accuracy of 87%, even on text that is the output of OCR | ||
| EMNLP | 24 | |
| D19-1236 Our results show that the authors of a paper can be inferred with accuracy as high as 87% on ACL and 78% on ***** EMNLP ***** for the top 100 most prolific authors. | ||
| W18-6209 In this paper we present our approach to tackle the Implicit Emotion Shared Task (IEST) organized as part of WASSA 2018 at ***** EMNLP ***** 2018. | ||
| 2020.acl-main.261 We examine this question with respect to a paper on automatic legal sentencing from ***** EMNLP ***** 2019 which was a source of some debate, in asking whether the paper should have been allowed to be published, who should have been charged with making such a decision, and on what basis. | ||
| W18-6414 This paper describes the statistical machine translation system built by the MLLP research group of Universitat Politëcnica de Valëncia for the German→English news translation shared task of the ***** EMNLP ***** 2018 Third Conference on Machine Translation (WMT18). | ||
| W18-5917 This paper describes the systems developed for 1st and 2nd tasks of the 3rd Social Media Mining for Health Applications Shared Task at ***** EMNLP ***** 2018 | ||
| cohesion | 24 | |
| 2020.iwdp-1.2 Disregarding dependencies across sentences will harm translation quality especially in terms of coherence, ***** cohesion *****, and consistency. | ||
| 2020.readi-1.14 Thus, studies on the impact of simplification in text ***** cohesion ***** are lacking. | ||
| L12-1456 The results show that the most common errors are absent ***** cohesion ***** or context and various types of broken or missing anaphoric references. | ||
| 2020.lrec-1.144 This system modifies the coreference chains, which are markers of text ***** cohesion *****, by using rules. | ||
| L16-1656 The comparison shows that knowledge bases not only have coverage gaps; they also do not account for semantic relations that are manifested in particular contexts only, yet still play an important role for text ***** cohesion ***** | ||
| subgraphs | 24 | |
| 2021.emnlp-main.222 In each iteration, we first construct a keyword graph, so the task of assigning pseudo labels is transformed to annotating keyword ***** subgraphs *****. | ||
| 2021.emnlp-main.714 To train most AMR parsers, one needs to segment the graph into ***** subgraphs ***** and align each such subgraph to a word in a sentence; this is normally done at preprocessing, relying on hand-crafted rules. | ||
| 2021.emnlp-main.200 These directed ***** subgraphs ***** are considered to well preserve extra but relevant content to the short input text, and then they are decoded by the employed pre-trained model to generate coherent long text. | ||
| 2020.findings-emnlp.207 To utilize these unexploited graph-level knowledge, we propose an approach to model ***** subgraphs ***** in a medical KG. | ||
| K17-1005 To deal with this problem, we propose graph merging, a new perspective, for building flexible dependency graphs: Constructing complex graphs via constructing simple ***** subgraphs ***** | ||
| align | 24 | |
| D19-1120 Past models ***** align ***** topics across languages by implicitly assuming the documents in different languages are highly comparable, often a false assumption. | ||
| 2020.cmcl-1.7 We found that, across development, children ***** align ***** consistently to adults above chance and that adults ***** align ***** consistently more to children than vice versa (even controlling for language production abilities). | ||
| C18-1123 Attention-based sequence-to-sequence neural network models learn to jointly ***** align ***** and translate. | ||
| L14-1332 A cross-linguistic comparison of English to Chinese and Czech AMRs reveals both cases where the AMRs for the language pairs ***** align ***** well structurally and cases of linguistic divergence. | ||
| 2020.nlpcss-1.14 We show that uncalibrated classifiers (i.e. where the `raw' scores are used) ***** align ***** poorly with human evaluations | ||
| WikiSQL | 24 | |
| 2020.coling-main.31 The proposed method can achieve competitive results on ***** WikiSQL *****, suggesting it being a promising direction for text-to-SQL. | ||
| 2020.emnlp-main.561 Experiments were conducted on two cross-domain datasets, the ***** WikiSQL ***** and the more complex Spider, with five state-of-the-art parsers. | ||
| 2020.acl-main.398 We additionally find that transfer learning, which is trivial in our setting, from ***** WikiSQL ***** to WikiTQ, yields 48.7 accuracy, 4.2 points above the state-of-the-art. | ||
| 2020.aacl-srw.16 ***** WikiSQL ***** and Spider, the large-scale cross-domain text-to-SQL datasets, have attracted much attention from the research community | ||
| D19-1624 Most deep learning approaches for text-to-SQL generation are limited to the ***** WikiSQL ***** dataset, which only supports very simple queries over a single table. | ||
| radiology | 24 | |
| L16-1725 For this purpose, we handcrafted some linguistic patterns from on a subset of our ***** radiology ***** report corpora. | ||
| 2021.naacl-main.416 On two open ***** radiology ***** report datasets, our system substantially improved the F1 score of a clinical information extraction performance by +22.1 (Delta +63.9%). | ||
| 2021.bionlp-1.35 We also employed a source-specific ensembling technique to accommodate for distinct writing styles from different ***** radiology ***** report sources. | ||
| 2020.findings-emnlp.110 However, past work has shown that typical abstractive methods tend to produce fluent, but clinically incorrect ***** radiology ***** reports | ||
| R17-1025 Radiology reports express the results of a ***** radiology ***** study and contain information about anatomical entities, findings, measures and impressions of the medical doctor. | ||
| Classical | 24 | |
| 2020.lrec-1.98 However, there is no freely available open-source corpus of these histories, making ***** Classical ***** Chinese low-resource. | ||
| L12-1540 In order to construct an annotated diachronic corpus of Japanese, we propose to create a new dictionary for morphological analysis of Early Middle Japanese (***** Classical ***** Japanese) based on UniDic, a dictionary for Contemporary Japanese. | ||
| L12-1091 This is initial work on a long-term research project to produce annotation schemes, language resources, algorithms, and applications for ***** Classical ***** and Modern Standard Arabic. | ||
| 2020.lrec-1.385 The Calfa project gathers existing resources and updates, enriches and enhances their content to offer the richest database for ***** Classical ***** Armenian today. | ||
| L12-1376 Work related to LAMP includes recent efforts for annotating other ***** Classical ***** languages, such as Ancient Greek and Latin (Bamman, Mambrini and Crane, 2009), as well as commercial systems (e.g. Logos Bible study) that provide access to syntactic tagging for the Hebrew Bible and Greek New Testament (Brannan, 2011) | ||
| feedforward | 24 | |
| 2021.eacl-main.208 These constraints are effective even when using a Deep Averaging Network, a simple ***** feedforward ***** encoding architecture that allows for scaling to large corpora while remaining sufficiently expressive. | ||
| W18-5450 Modern neural architectures go way beyond simple ***** feedforward ***** and recurrent models: they are complex pipelines that perform soft, differentiable computation instead of discrete logic. | ||
| E17-2111 In this work we propose a multimodal approach to topic labelling using a simple ***** feedforward ***** neural network. | ||
| 2021.wmt-1.116 Experiments on top of English-German models, which already have state-of-the-art speed and size, show that two-thirds of ***** feedforward ***** connections can be removed with 0.2 BLEU loss. | ||
| P18-1174 The AL strategy is then learned with a ***** feedforward ***** network, mapping situations to most informative query datapoints | ||
| holistic | 24 | |
| P19-1046 Instead of directly fusing features at ***** holistic ***** level, we conduct fusion hierarchically so that both local and global interactions are considered for a comprehensive interpretation of multimodal embeddings. | ||
| E17-1017 This paper presents the first ***** holistic ***** work on computational argumentation quality in natural language. | ||
| W19-5935 Results showed that interjections and fillers each improved users' ***** holistic ***** ratings, an improvement that further increased if the system used both manipulations. | ||
| W17-5010 We propose that this shifted emphasis be reflected in a new name for the task: `***** holistic ***** error correction' (HEC). | ||
| N18-3015 We introduce a novel ***** holistic ***** approach to post-processing that relies on machine translation | ||
| digitized | 24 | |
| L16-1570 In this paper, we present the experiments we made to recover the original page layout structure into two columns from layout damaged ***** digitized ***** files. | ||
| 2021.nodalida-main.3 In this work, we show the process of building a large-scale training set from digital and ***** digitized ***** collections at a national library. | ||
| 2020.lrec-1.110 In this paper, we report a multilingual ***** digitized ***** version of thousands of such documents searchable through some well-established corpus infrastructures. | ||
| L10-1297 While cabinet protocols are often available in ***** digitized ***** form, so far the only method to access their information content is by keyword-based search, which often returns sub-optimal results. | ||
| 2020.lrec-1.108 The database combines formally interpreted and richly interlinked onomastic data with ***** digitized ***** versions of the medieval manuscripts from which the data originate and information on the tokens' context | ||
| modular | 24 | |
| L10-1575 We share the view that currently used methods for including lexical and terminological information in such hierarchical networks of concepts are not satisfactory, and thus put forward ― as a preliminary step to our annotation goal ― a model for ***** modular ***** representation of conceptual, terminological and linguistic information within knowledge representation systems. | ||
| 2020.emnlp-main.617 We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning ***** modular ***** language and task representations. | ||
| P18-2039 We present a novel ***** modular ***** framework that divides the knowledge into four categories according to the depth of knowledge they convey. | ||
| C16-1026 Our method is particularly attractive for ***** modular ***** systems that make use of a syntax parser anyway, e.g. as part of an understanding pipeline where predictive parsing improves language modeling at no additional cost. | ||
| 2010.amta-papers.19 In this paper, we present a novel ***** modular ***** approach that utilises state-of-the-art sub-tree alignment and SMT techniques to turn the fuzzy matches from a TM into near-perfect translations | ||
| bias | 24 | |
| 2020.emnlp-main.56 While prior text-to-text natural language generation (NLG) approaches can be used to address this problem, neglecting the confounding ***** bias ***** from the data generation mechanism can limit the model performance, and the ***** bias ***** may pollute the learning outcomes. | ||
| D19-1309 Despite the effectiveness of previous work based on generative models, there remain problems with exposure ***** bias ***** in recurrent neural networks, and often a failure to generate realistic sentences. | ||
| 2021.blackboxnlp-1.16 Our results show that text degeneration is likely to be partly caused by exposure ***** bias *****. | ||
| 2020.socialnlp-1.1 This paper aims to correlate the type of knowledge presupposed in a news article to the ***** bias ***** present in it. | ||
| 2021.acl-long.148 A common factor in ***** bias ***** measurement methods is the use of hand-curated seed lexicons, but there remains little guidance for their selection | ||
| Intent | 24 | |
| D19-1214 ***** Intent ***** detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. | ||
| 2020.aacl-main.57 Slot-filling, Translation, ***** Intent ***** classification, and Language identification, or STIL, is a newly-proposed task for multilingual Natural Language Understanding (NLU). | ||
| N18-2050 ***** Intent ***** detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. | ||
| 2020.emnlp-main.411 ***** Intent ***** detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill. | ||
| 2021.acl-long.190 ***** Intent ***** classification is a major task in spoken language understanding (SLU). | ||
| Frame | 24 | |
| W18-3813 Much interest in ***** Frame ***** Semantics is fueled by the substantial extent of its applicability across languages. | ||
| 2021.eacl-main.206 ***** Frame ***** identification is one of the key challenges for frame-semantic parsing. | ||
| C16-1334 ***** Frame ***** semantics is a theory of linguistic meanings, and is considered to be a useful framework for shallow semantic analysis of natural language. | ||
| 2021.eacl-demos.30 In this paper, we introduce FrameForm, an open-source annotation tool designed to accommodate predicate annotations based on ***** Frame ***** Semantics. | ||
| L06-1354 In the field of Computational Linguistics, many lexical resources have been developed which aim at encoding complex lexical semantic information according to different linguistic models (WordNet, ***** Frame ***** Semantics, Generative Lexicon, etc.). | ||
| hierarchical clustering | 24 | |
| 2020.lrec-1.303 Specifically, this is a hard ***** hierarchical clustering ***** with a fixed-width beam that employs bi-grams and greedily minimizes global mutual information loss. | ||
| 2021.rocling-1.25 Our method also attempts to reduce redundancy through ***** hierarchical clustering ***** and arrange selected sentences on the proposed orderBERT. | ||
| C16-1251 SenticNet 4 overcomes such limitations by leveraging on conceptual primitives automatically generated by means of ***** hierarchical clustering ***** and dimensionality reduction. | ||
| W17-4216 The framework consists of three parts: 1) extracting important terms, 2) generating concordance for each term with stipulative definitions and explanations, and 3) agglomerating similar information of the term by ***** hierarchical clustering *****. | ||
| 2021.crac-1.11 We present a new approach of joint coreference, including (1) a formal cost function inspired by Dasgupta's cost for ***** hierarchical clustering *****, and (2) a representation for uncertainty of clustering of event and entity mentions, again based on a hierarchical structure | ||
| explicit | 24 | |
| 2020.emnlp-main.695 Given a new sentence, our end-to-end algorithm proposes and scores each mention span against ***** explicit ***** entity representations created from the earlier document context (if any). | ||
| 2020.coling-main.11 With this paper, we propose to make such interpretations of events ***** explicit *****, following theories of cognitive appraisal of events, and show their potential for emotion classification when being encoded in classification models. | ||
| 2021.naacl-main.284 AMBER is trained on additional parallel data using two ***** explicit ***** alignment objectives that align the multilingual representations at different granularities. | ||
| 2021.naacl-main.97 However, these approaches tend to not be interpretable because they do not make the intermediate reasoning steps ***** explicit *****. | ||
| 2021.ranlp-1.99 We also investigate different methods of distinguishing between ***** explicit ***** and implicit abuse and show lexicon-based approaches either over- or under-estimate the proportion of ***** explicit ***** abuse in data sets | ||
| interactions | 24 | |
| 2014.amta-researchers.20 We use mixed-effects models to verify that the ASR errors that compose the WER metric do not contribute equally to translation quality and that ***** interactions ***** exist between ASR errors that cumulatively affect a SMT system's ability to translate an utterance. | ||
| L12-1365 In the study we focus on the most frequently occurring feedback expressions in the ***** interactions ***** and on feedback-related head movements and facial expressions. | ||
| D18-1069 The expressions of agreement/disagreement usually rely on argumentative expressions in text as well as ***** interactions ***** between participants in debates. | ||
| W17-8010 For instance, grapefruit has ***** interactions ***** with several drugs, because its active ingredients inhibit enzymes involved in the drugs metabolism and can then cause an excessive dosage of these drugs. | ||
| 2020.lrec-1.59 Based on the analysis of two ***** interactions *****, two questions are asked: (1) Does smile frame humor | ||
| definitions | 24 | |
| D19-1357 To utilize implicit semantic relations in ***** definitions *****, we use unsupervisedly obtained pattern-based word-pair embeddings that represent semantic relations of word pairs. | ||
| L16-1269 As a result we obtain a multilingual corpus of textual ***** definitions ***** featuring over 38 million ***** definitions ***** in 263 languages, and we make it freely available at http://lcl.uniroma1.it/disambiguated-glosses. | ||
| 2020.computerm-1.8 We showcase the usefulness of the tool on examples from the karstology domain, where in the first use case we visualize the domain knowledge as represented in a manually annotated corpus of domain ***** definitions *****, while in the second use case we show the power of visualization for domain understanding by visualizing automatically extracted knowledge in the form of triplets extracted from the karstology domain corpus. | ||
| W16-5323 While structured resources that can express those relationships in a formal way, such as ontologies, are still scarce, a large number of linguistic resources gathering dictionary ***** definitions ***** is becoming available, but understanding the semantic structure of natural language ***** definitions ***** is fundamental to make them useful in semantic interpretation tasks. | ||
| L16-1398 The key points are clear and freely available ***** definitions *****, accessible documentation and easily usable facilities and guidelines for the metadata creators | ||
| acoustic | 24 | |
| L10-1173 The intended use of the database is ***** acoustic ***** and lexical modeling of these phonetic variations. | ||
| 2019.icon-1.10 Changes in the ***** acoustic ***** features are compared in the five vowels' regions of the English language. | ||
| L14-1443 We present a dataset of telephone conversations in English and Czech, developed for training ***** acoustic ***** models for automatic speech recognition (ASR) in spoken dialogue systems (SDSs). | ||
| 2020.acl-main.1 The results suggest that this is at least partially due to linguistic rather than ***** acoustic ***** properties of the two registers, as we see the same pattern when looking at models trained on ***** acoustic *****ally comparable synthetic speech | ||
| 2013.iwslt-papers.8 In this paper we describe our work on unsupervised adaptation of the ***** acoustic ***** model of our simultaneous lecture translation system. | ||
| edited | 24 | |
| L12-1360 As we will see, the task requires the revisiting of the initial steps of NLP processing, since UGC (micro-blog, blog, and, generally, Web 2.0 user texts) presents a number of non-standard communicative and linguistic characteristics, and is in fact much closer to oral and colloquial language than to ***** edited ***** text. | ||
| 2020.semeval-1.138 I participated in both of the sub-tasks: Sub-Task 1 “Regression” and Sub-task 2 “Predict the funnier of the two ***** edited ***** versions of an original headline”. | ||
| 2020.semeval-1.142 The target of this task is to assess the funniness changes of news headlines after minor editing and is divided into two subtasks: Subtask 1 is a regression task to detect the humor intensity of the sentence after editing; and Subtask 2 is a classification task to predict the funnier of the two ***** edited ***** versions of an original headline. | ||
| 2020.semeval-1.141 This paper describes our participation in SemEval 2020 Task 7 on assessment of humor in ***** edited ***** news headlines, which includes two subtasks, estimating the humor of micro-edited news headlines (subtask A) and predicting the more humorous of the two ***** edited ***** headlines (subtask B). | ||
| 2020.semeval-1.105 The goal of Subtask 1 is to predict the mean funniness of the ***** edited ***** headline given the original and the ***** edited ***** headline | ||
| topic segmentation | 24 | |
| D17-1139 Due to the absence of training data, previous work mainly adopts unsupervised methods to rank semantic coherence between paragraphs for ***** topic segmentation *****. | ||
| L06-1301 We report an evaluation about ***** topic segmentation ***** showing that the results obtained with the filtered network are the same as those obtained with the initial network, although the first one is significantly smaller than the second one. | ||
| 2021.sigdial-1.18 Dialogue ***** topic segmentation ***** is critical in several dialogue modeling problems. | ||
| W16-5412 As a case study for the corpus, we describe a method combined with LCSeg and TopicTiling for a ***** topic segmentation ***** task. | ||
| 2007.jeptalnrecital-long.1 Our investigation reports on the use of various lexical, acoustic and syntactic features, and makes a comparison of how these features influence performance of automatic ***** topic segmentation *****. | ||
| entity coreference | 24 | |
| L14-1568 In this paper we present CROMER (CROss-document Main Events and entities Recognition), a novel tool to manually annotate event and ***** entity coreference ***** across clusters of documents. | ||
| 2021.nuse-1.4 However, document-level event extraction is a challenging task as it requires the extraction of event and ***** entity coreference *****, and capturing arguments that span across different sentences. | ||
| 2021.codi-sharedtask.2 Our team ranked second for ***** entity coreference ***** resolution, first for bridging resolution, and first for discourse deixis resolution. | ||
| 2021.emnlp-main.106 Performing event and ***** entity coreference ***** resolution across documents vastly increases the number of candidate mentions, making it intractable to do the full n^2 pairwise comparisons. | ||
| W18-6536 The linguistic features are designed to capture information related to named entity recognition, word case, and ***** entity coreference ***** resolution | ||
| morphological tagging | 24 | |
| 2021.calcs-1.10 In this paper, we explore a number of ways of implementing a language-aware ***** morphological tagging ***** method and present our approach for integrating language IDs into a transformer-based framework for CS ***** morphological tagging *****. | ||
| N19-1155 Error analysis indicates that joint ***** morphological tagging ***** and lemmatization is especially helpful in low-resource lemmatization and languages that display a larger degree of morphological complexity. | ||
| D17-1078 Even for common NLP tasks, sufficient supervision is not available in many languages – ***** morphological tagging ***** is no exception. | ||
| W19-4211 Our core approach focuses on the ***** morphological tagging ***** task; part-of-speech tagging and lemmatization are treated as secondary tasks. | ||
| 2021.nodalida-main.2 We first describe the EstBERT pretraining process and then present the models' results based on the finetuned EstBERT for multiple NLP tasks, including POS and ***** morphological tagging *****, dependency parsing, named entity recognition and text classification. | ||
| text corpus | 24 | |
| L06-1266 Korean, and point out a second order machine learning algorithm to unveil term similarity from a given raw ***** text corpus *****. | ||
| I17-2039 Relation Discovery discovers predicates (relation types) from a ***** text corpus ***** relying on the co-occurrence of two named entities in the same sentence. | ||
| 2021.sigmorphon-1.8 We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw ***** text corpus ***** into paradigms. | ||
| W17-5031 We present a very simple model for text quality assessment based on a deep convolutional neural network, where the only supervision required is one corpus of user-generated text of varying quality, and one contrasting ***** text corpus ***** of consistently high quality. | ||
| W18-0540 We developed solutions following three approaches: (i) a feature engineering method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large ***** text corpus ***** to produce a contextualized word vector. | ||
| lexical features | 24 | |
| 2020.semeval-1.192 Our architectures are built using embeddings from BERT in combination with additional ***** lexical features ***** and extensive label post-processing. | ||
| 2021.internlp-1.2 Through our pilot study, we analyze task success and compare the ***** lexical features ***** of user input. | ||
| D17-1010 Word embeddings improve generalization over ***** lexical features ***** by placing each word in a lower-dimensional space, using distributional information obtained from unlabeled data. | ||
| P17-2003 We show that if coreference resolvers mainly rely on ***** lexical features *****, they can hardly generalize to unseen domains. | ||
| W17-2203 In this paper, we focus on one particular class of ***** lexical features *****, namely emotion information, and investigate the hypothesis that emotion-related information correlates with particular genres. | ||
| web services | 24 | |
| L10-1110 We present the problem of categorizing ***** web services ***** according to a shallow ontology for presentation on a specialist portal, using their WSDL and associated textual documents found by a crawler. | ||
| 2020.iwltp-1.10 Several ***** web services ***** for various natural language processing (NLP) tasks (“NLP-as-a-service” or NLPaaS) have recently been made publicly available. | ||
| L14-1734 Unlike several related on-line processing environments, which predominantly instantiate a distributed architecture of ***** web services *****, LAP achieves scalability to potentially very large data volumes through integration with the Norwegian national e-Infrastructure, and in particular job submission to a capacity compute cluster. | ||
| 2020.coling-main.405 Finally, we report on a ***** web services ***** deployment, along with a web interface which helps users enter morphologically complex words and which retrieves corresponding entries from the lexicon. | ||
| L14-1630 Unlike some other language based infrastructures CLARIN-DK is not solely a repository for upload and storage of data, but also a platform of ***** web services ***** permitting the user to process data in various ways. | ||
| linguistic research | 24 | |
| W18-3910 However, ***** linguistic research ***** suggests that spoken language often differs from written language. | ||
| L12-1302 This corpus will be of paramount importance for socio***** linguistic research ***** and normalisation studies. | ||
| 2020.aacl-main.84 A collection of 16,207 qualified SMAWs are obtained using this technique along with an annotated corpus of more than 200,000 sentences for ***** linguistic research ***** and applicable inquiries. | ||
| 2021.codi-main.8 Cross-***** linguistic research ***** on discourse structure and coherence marking requires discourse-annotated corpora and connective lexicons in a large number of languages. | ||
| L10-1007 Although corpora opportunities are very useful, there is a need for another kind of software for further improvement of ***** linguistic research *****, as it is impossible to process huge amounts of linguistic data manually. | ||
| semantic tagging | 24 | |
| D19-6128 Our results are that, on average, transductive auxiliary task self-training improves absolute accuracy by up to 9.56% over the pure multi-task model for dependency relation tagging and by up to 13.03% for ***** semantic tagging *****. | ||
| 2005.mtsummit-posters.2 This paper presents a Thai word segmentation system using semantic corpus which is composed of four steps: generating all possible candidates, proper noun consideration, ***** semantic tagging ***** and semantic checking. | ||
| D18-1526 We investigate the effects of multi-task learning using the recently introduced task of ***** semantic tagging *****. | ||
| L06-1116 The annotation consists of comprehensive morphological marking, syntactic tagging in the form of a complete dependency tree, and ***** semantic tagging ***** within a restricted semantic dictionary. | ||
| C16-1333 We propose a novel ***** semantic tagging ***** task, semtagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). | ||
| content selection | 24 | |
| 2020.aacl-main.52 In this paper, we present empirical results showing that the performance of a cascaded pipeline that separately identifies important content pieces and stitches them together into a coherent text is comparable to or outranks that of end-to-end systems, whereas a pipeline architecture allows for flexible ***** content selection *****. | ||
| 2020.emnlp-main.339 This also holds true for Rotowire table-to-text generation, where our models surpass previously reported metrics for ***** content selection *****, planning and ordering, highlighting the strength of stepwise modeling. | ||
| 2020.acl-main.551 In this paper, we argue that elementary discourse unit (EDU) is a more appropriate textual unit of ***** content selection ***** than the sentence unit in abstractive summarization. | ||
| L10-1406 The current system comprises a series of classifiers that implement major Document Planning subtasks (namely, data interpretation, ***** content selection *****, within- and between-sentence structuring), and a small surface realisation grammar of Brazilian Portuguese. | ||
| 2020.nl4xai-1.3 This paper describes a ***** content selection ***** module for the generation of explanations in a dialogue system designed for the customer care domain. | ||
| dependency structure | 24 | |
| P17-2068 In this work, we construct a corpus that ensures consistency between ***** dependency structure *****s and MWEs, including named entities. | ||
| L06-1070 The ***** dependency structure ***** patterns are generated by using two operations: combining and interpolation, which utilize ***** dependency structure *****s in the searched corpus. | ||
| 2000.amta-papers.5 The approach relies on canonical predicate-argument structures (or ***** dependency structure *****s), which provide a suitable pivot representation for the handling of structural divergences and the recovery of dropped arguments. | ||
| L10-1483 We propose a hierarchical ***** dependency structure ***** annotation schema that is more detailed and more flexible than the known annotation schemata. | ||
| L16-1263 Therefore we converted a phrase structure to a ***** dependency structure ***** after establishing an MWE as a single subtree. | ||
| nested named entity | 24 | |
| N18-1079 We propose a novel recurrent neural network-based approach to simultaneously handle ***** nested named entity ***** recognition and nested entity mention detection. | ||
| P19-1510 We describe NNE—a fine-grained, ***** nested named entity ***** dataset over the full Wall Street Journal portion of the Penn Treebank (PTB). | ||
| P19-1527 We propose two neural network architectures for ***** nested named entity ***** recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label. | ||
| 2020.acl-main.571 In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for ***** nested named entity ***** recognition (NER), which contains two subgraph modules: a flat NER module for outermost entities and a graph module for all the entities located in inner layers. | ||
| 2021.acl-long.275 This paper presents a novel method for ***** nested named entity ***** recognition. | ||
| surface realisation | 24 | |
| 2020.lrec-1.845 Instead of exchanging only individual words, the complete ***** surface realisation ***** of a sentence is altered while still preserving its meaning and function in a conversation. | ||
| L10-1406 The current system comprises a series of classifiers that implement major Document Planning subtasks (namely, data interpretation, content selection, within- and between-sentence structuring), and a small ***** surface realisation ***** grammar of Brazilian Portuguese. | ||
| D19-6301 We report results from the SR'19 Shared Task, the second edition of a multilingual ***** surface realisation ***** task organised as part of the EMNLP'19 Workshop on Multilingual Surface Realisation. | ||
| W18-6508 This paper presents SimpleNLG-NL, an adaptation of the SimpleNLG ***** surface realisation ***** engine for the Dutch language. | ||
| L10-1494 In Natural Language Generation (NLG), template-based ***** surface realisation ***** is an effective solution to the problem of producing surface strings from a given semantic representation, but many applications may not be able to provide the input knowledge in the required level of detail, which in turn may limit the use of the available NLG resources. | ||
| adverse drug | 24 | |
| W19-1903 In this paper we describe an evaluation of the potential of classical information extraction methods to extract drug-related attributes, including ***** adverse drug ***** events, and compare to more recently developed neural methods. | ||
| 2020.findings-emnlp.306 An ***** adverse drug ***** event (ADE) is an injury resulting from medical intervention related to a drug. | ||
| 2021.acl-long.488 Our experimental results show that the framework is highly effective, achieving new state-of-the-art results in two different benchmark datasets: BioRelEx (binding interaction detection) and ADE (***** adverse drug ***** event extraction). | ||
| W19-3207 The goals of the first two tasks are to classify whether a tweet contains mentions of ***** adverse drug ***** reactions (ADR) and extract these mentions, respectively. | ||
| P19-2058 In this work, we focus on extracting information about ***** adverse drug ***** reactions from various sources of biomedical text-based information, including biomedical literature and social media. | ||
| irony | 24 | |
| 2021.ranlp-1.88 In addition, considering emoji position can further improve the performance for the ***** irony ***** detection task compared to the emoji label prediction. | ||
| S18-1105 The system takes as starting point emotIDM, an ***** irony ***** detection model that explores the use of affective features based on a wide range of lexical resources available for English, reflecting different facets of affect. | ||
| E17-1025 Informed by linguistic theories, we propose for the first time a multi-layered annotation schema for ***** irony ***** and its application to a corpus of French, English and Italian tweets. | ||
| S18-1096 We create a targeted feature set and analyse how different features are useful in the task of ***** irony ***** detection, achieving an F1-score of 0.5914. | ||
| P18-2122 This paper addresses the issue of false-alarm hashtags in the self-labeled data for ***** irony ***** detection. | ||
| short text | 24 | |
| 2020.lrec-1.191 The main contribution of this paper is to evaluate the quality of the recently developed “Spanish Database for cyberbullying prevention” for the purpose of training classifiers on detecting abusive ***** short text *****s. | ||
| R19-1102 Very ***** short text *****s, such as tweets and invoices, present challenges in classification. | ||
| D19-1488 Most existing studies focus on long texts and achieve unsatisfactory performance on ***** short text *****s due to the sparsity and limited labeled data. | ||
| 2020.aacl-main.74 Considering the problem of information ambiguity and incompleteness for ***** short text *****, two kinds of knowledge, factual knowledge graph and conceptual knowledge graph, are introduced to provide additional knowledge for the semantic matching between candidate entity and mention context. | ||
| L12-1052 We performed a user study where subjects read ***** short text *****s translated by three MT systems and one human translation, while we gathered eye tracking data. | ||
| parallel text | 24 | |
| L10-1549 Statistical Machine Translation (MT) systems have achieved impressive results in recent years, due in large part to the increasing availability of ***** parallel text ***** for system training and development. | ||
| 2003.mtsummit-papers.37 This paper reports the results of experiments with resources collected in ten days; about 1.3 million words of ***** parallel text ***** from five types of sources and a bilingual term list with about 20,000 term pairs. | ||
| L12-1531 Since these methods search for possible translation equivalences in a greedy manner, they are unable to consider all possible ***** parallel text *****s in comparable documents. | ||
| 2001.mtsummit-papers.31 The result shows that bilingual term entries extracted from 2,000 pairs of ***** parallel text *****s which share a specific domain with the input texts introduce more improvements than a technical term dictionary with 38,000 entries which covers a broader domain. | ||
| L12-1529 The extraction of dictionaries from ***** parallel text ***** corpora is an established technique. | ||
| claims | 24 | |
| W19-0503 This allows us to conclude that, despite prior ***** claims *****, truth-theoretic models are good candidates for building graded lexical representations of meaning. | ||
| 2021.starsem-1.9 Specifically, ***** claims ***** are extracted from sentences that are carefully selected to be more informative. | ||
| 2020.lrec-1.611 In this paper we present the Vaccination Corpus, a corpus of texts related to the online vaccination debate that has been annotated with three layers of information about perspectives: attribution, ***** claims ***** and opinions. | ||
| D19-1216 Unfortunately, the number of ***** claims ***** that need to be fact-checked is several orders of magnitude larger than what humans can handle manually. | ||
| W17-5113 Motivated by the importance of topic identification in manual annotation, we examine whether topic modeling can be used for performing unsupervised detection of argumentative sentences, and to what extent topic modeling can be used to classify sentences as ***** claims ***** and premises. | ||
| detection of hate | 24 | |
| S19-2070 This paper describes the GSI-UPM system for SemEval-2019 Task 5, which tackles multilingual ***** detection of hate ***** speech on Twitter. | ||
| S19-2075 We tackled subtask A - “a binary classification where systems have to predict whether a tweet with a given target (women or immigrants) is hateful or not hateful”, a part of task 5 “Multilingual ***** detection of hate ***** speech against immigrants and women in Twitter (hatEval)”. | ||
| 2021.wassa-1.18 The automatic ***** detection of hate *****/offensive speech remains challenging (with .53 F1). | ||
| S19-2077 This paper describes our contribution to the SemEval-2019 Task 5 on the ***** detection of hate ***** speech against immigrants and women in Twitter (hatEval). | ||
| S19-2008 In this article, we describe our participation in HatEval, a shared task aimed at the ***** detection of hate ***** speech against immigrants and women. | ||
| science | 24 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social ***** science *****. | ||
| 2020.coling-main.235 The novel framework shows an interesting perspective on machine reading comprehension and cognitive ***** science *****. | ||
| 2021.emnlp-main.845 Given this, we present a formalization of and study into the problem of exaggeration detection in ***** science ***** communication. | ||
| 2020.acl-srw.12 To achieve this, I plan to research methods using natural language processing and deep learning while employing models and using analysis concepts from the social ***** science *****s, where researchers have studied media bias for decades. | ||
| 2021.emnlp-main.784 Certainty and uncertainty are fundamental to ***** science ***** communication. | ||
| semantic dependency | 24 | |
| D18-1075 Different from conventional text generation tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level ***** semantic dependency *****, a relation between the whole meanings of inputs and outputs. | ||
| 2020.emnlp-main.663 We describe a method for developing broad-coverage ***** semantic dependency ***** parsers for languages for which no semantically annotated resource is available. | ||
| S19-2014 The system is applied to the CONLLU format of the input data and is best suited for ***** semantic dependency ***** parsing. | ||
| 2021.eacl-main.66 We illustrate MTI with a system that performs part-of-speech tagging, syntactic dependency parsing and ***** semantic dependency ***** parsing. | ||
| P18-2106 Previous approaches to multilingual ***** semantic dependency ***** parsing treat languages independently, without exploiting the similarities between semantic structures across languages. | ||
| neural text | 24 | |
| D18-1001 We study a specific type of attack: an attacker eavesdrops on the hidden representations of a ***** neural text ***** classifier and tries to recover information about the input text. | ||
| 2020.findings-emnlp.159 In ***** neural text ***** editing, prevalent sequence-to-sequence based approaches directly map the unedited text either to the edited text or the editing operations, in which the performance is degraded by the limited source text encoding and long, varying decoding steps. | ||
| 2020.aacl-main.52 We finally discuss how we can take advantage of a cascaded pipeline in ***** neural text ***** summarization and shed light on important directions for future research. | ||
| 2021.emnlp-main.504 The largest available dataset for enthymemes (Habernal et al., 2018) consists of 1.7k samples, which is not large enough to train a ***** neural text ***** generation model. | ||
| N18-1204 We introduce an approach to ***** neural text ***** generation that explicitly represents entities mentioned in the text. | ||
| neural sequence | 24 | |
| 2021.naacl-main.353 We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that ***** neural sequence ***** models outperform conventional methods applied to this task so far. | ||
| D19-1422 For ***** neural sequence ***** labeling, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. | ||
| N18-1057 While most previous approaches introduce perturbations using features computed from local context windows, we instead develop error generation processes using a ***** neural sequence ***** transduction model trained to translate clean examples to their noisy counterparts. | ||
| 2020.intexsempar-1.4 Unfortunately, existing methods either rely on grammars which parse sentences with limited flexibility, or ***** neural sequence *****-to-sequence models that do not learn efficiently or reliably from individual examples. | ||
| P18-4013 NCRF++ is designed for quick implementation of different ***** neural sequence ***** labeling models with a CRF inference layer. | ||
| negative | 24 | |
| 2020.semeval-1.159 To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for positive, ***** negative ***** and neutral category predictions. | ||
| W19-5105 We train compositional models on observed compounds, more specifically the composed distributed representations of their constituents across a time-stamped corpus, while giving it corrupted instances (where head or modifier are replaced by a random constituent) as ***** negative ***** evidence. | ||
| 2020.coling-main.357 This is possible because existing alternation datasets contain positive, but no ***** negative ***** instances and are not comprehensive. | ||
| W18-5430 We propose a method that extends an existing human-elicited semantic property dataset with gold ***** negative ***** examples using crowd judgments. | ||
| 2021.acl-short.29 Based on Vision-and-Language BERT, we train UMIC to discriminate ***** negative ***** captions via contrastive learning. | ||
| neural question generation | 24 | |
| 2020.coling-main.249 Empirical evaluation shows our model to outperform the single-hop ***** neural question generation ***** models on both automatic evaluation metrics such as BLEU, METEOR, and ROUGE and human evaluation metrics for quality and coverage of the generated questions. | ||
| 2020.coling-main.509 We therefore employ the transformation rules to generate a large set of sentence-question-answer triples and train a ***** neural question generation ***** model on them to obtain both systematic question type coverage and robustness. | ||
| 2020.acl-main.355 We consider neural table-to-text generation and ***** neural question generation ***** (NQG) tasks for text generation from structured and unstructured data, respectively. | ||
| D17-1219 When incorporated into an existing ***** neural question generation ***** system, the resulting end-to-end system achieves state-of-the-art performance for paragraph-level question generation for reading comprehension. | ||
| 2021.bea-1.17 The state-of-the-art in ***** neural question generation ***** has advanced greatly, due in part to the availability of large datasets of question-answer pairs. | ||
| mathematical | 24 | |
| 2021.sigmorphon-1.2 In doing so, we provide a foundation for future ***** mathematical *****ly grounded investigations of the syntax-prosody interface. | ||
| 1991.iwpt-1.26 This work is currently being applied in the interpretation of hand-sketched ***** mathematical ***** expressions and structured flowcharts on notebook computers and interactive worksurfaces. | ||
| 2021.semspace-1.8 The first considers the question of whether an explicit ***** mathematical ***** representation can be successful using only tools from within linear algebra, or whether other ***** mathematical ***** tools are needed. | ||
| 2020.sdp-1.16 Therefore, a collaboration of natural language processing and formula analyses, so-called ***** mathematical ***** language processing, is necessary to enable computers to understand and retrieve information from the documents. | ||
| 2020.lrec-1.269 Extending machine reading approaches to extract ***** mathematical ***** concepts and their descriptions is useful for a variety of tasks, ranging from ***** mathematical ***** information retrieval to increasing accessibility of scientific documents for the visually impaired. | ||
| kg completion | 24 | |
| W18-3017 Knowledge Graph (KG) embedding projects entities and relations into low dimensional vector space, which has been successfully applied to the ***** KG completion ***** task. | ||
| 2020.emnlp-main.131 This work proposes an adaptive attentional network for few-shot ***** KG completion ***** by learning adaptive entity and reference representations. | ||
| D19-1268 Most existing ***** KG completion ***** methods only consider the direct relation between nodes and ignore the relation paths which contain useful information for link prediction. | ||
| 2021.emnlp-main.639 To foster further research, we provide the first unified open-source framework for temporal ***** KG completion ***** models with full composability, where temporal embeddings, score functions, loss functions, regularizers, and the explicit modeling of reciprocal relations can be combined arbitrarily. | ||
| 2020.findings-emnlp.290 Experiments on the basis of five real-world language-specific KGs show that, by effectively identifying and leveraging complementary knowledge, KEnS consistently improves state-of-the-art methods on ***** KG completion *****. | ||
| vqa dataset | 24 | |
| 2020.nlpbt-1.6 Discriminative decoders were designed to achieve such decomposition, and the method was experimentally implemented on ***** VQA dataset *****s containing full-sentence answers. | ||
| 2020.findings-emnlp.44 We exploit ConceptNet KG for encoding the common sense knowledge and evaluate our methodology on the Outside Knowledge-VQA (OK-VQA) and ***** VQA dataset *****s. | ||
| D19-1596 We demonstrate that our CTM-based training improves the consistency of VQA models on the Con-***** VQA dataset *****s and is a strong baseline for further research. | ||
| 2020.coling-main.169 We exploit ConceptNet as the source of general knowledge and evaluate the performance of our model on the challenging OK-***** VQA dataset *****. | ||
| 2021.naacl-main.192 Crucially, only information that is readily available in any ***** VQA dataset ***** is used to compute its scores. | ||
| bidirectional rnn | 24 | |
| D18-1532 We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using ***** bidirectional RNN *****s with character-level and word-level embeddings. | ||
| W17-4104 We use a simple ***** bidirectional RNN ***** with LSTM nodes and achieve accuracy of 90% or higher. | ||
| C16-1044 We investigate the ***** Bidirectional RNN ***** and the inclusion of external information (for instance low level information from Part-Of-Speech tags) in the RNN to train a more complex tagger (for instance, a multilingual super sense tagger). | ||
| I17-1027 The model improves identification of clause-level coordination using ***** bidirectional RNN *****s incorporating two properties as features. | ||
| 2020.iwslt-1.11 All systems were neural based, including a fully-connected neural network for speech activity detection, a Kaldi factorized time delay neural network with recurrent neural network (RNN) language model rescoring for speech recognition, a ***** bidirectional RNN ***** with attention mechanism for sentence segmentation, and transformer networks trained with OpenNMT and Marian for machine translation. | ||
| automatic speech recognition (ASR) | 24 | |
| L14-1442 This paper presents a plugin that adds *****automatic speech recognition (ASR)***** functionality to the WaveSurfer sound manipulation and visualisation program. | ||
| 2020.autosimtrans-1.3 In many practical applications, neural machine translation systems have to deal with the input from *****automatic speech recognition (ASR)***** systems which may contain a certain number of errors. | ||
| 2021.nodalida-main.10 In this paper, we propose spectral modification by sharpening formants and by reducing the spectral tilt to recognize children's speech by *****automatic speech recognition (ASR)***** systems developed using adult speech. | ||
| 2018.iwslt-1.28 A spoken language translation (ST) system consists of at least two modules: an *****automatic speech recognition (ASR)***** system and a machine translation (MT) system. | ||
| L08-1496 Texts generated by *****automatic speech recognition (ASR)***** systems have some specificities, related to the idiosyncrasies of oral productions or the principles of ASR systems, that make them more difficult to exploit than more conventional natural language written texts. | ||
| statistical machine translation (SMT) | 24 | |
| 2008.iwslt-evaluation.17 This paper gives a description of the *****statistical machine translation (SMT)***** systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT'08 evaluation campaign. | ||
| 2012.amta-tutorials.6 Several studies have recently reported significant productivity gains by human translators when besides translation memory (TM) matches they do also receive suggestions from a *****statistical machine translation (SMT)***** engine. | ||
| 2014.iwslt-evaluation.22 This work describes the *****statistical machine translation (SMT)***** systems of RWTH Aachen University developed for the evaluation campaign International Workshop on Spoken Language Translation (IWSLT) 2014. | ||
| 2008.amta-papers.19 The continuous emergence of new technical terms and the difficulty of keeping up with neologisms in parallel corpora deteriorate the performance of *****statistical machine translation (SMT)***** systems. | ||
| 2011.iwslt-papers.5 The increasing popularity of *****statistical machine translation (SMT)***** systems is introducing new domains of translation that need to be tackled. | ||
| rich | 24 | |
| W17-1719 We are developing a broad-coverage deep semantic lexicon for a system that parses sentences into a logical form expressed in a *****rich***** ontology that supports reasoning. | ||
| 2020.clinicalnlp-1.11 Clinical notes contain *****rich***** information, which is relatively unexploited in predictive modeling compared to structured data. | ||
| L12-1411 We present a novel tool for morphological analysis of Serbian, which is a low-resource language with *****rich***** morphology. | ||
| D19-1508 Most previous research focuses on investigating standard electronic medical records for symptom diagnosis, while the dialogues between doctors and patients that contain more *****rich***** information are not well studied. | ||
| L12-1480 Thanks to their *****rich***** morphology, Italian and Spanish allow pro-drop pronouns, i.e., non-lexically-realized subject pronouns. | ||
| Named entity recognition (NER) | 24 | |
| P19-1510 *****Named entity recognition (NER*****) is widely used in natural language processing applications and downstream tasks. | ||
| 2020.ccl-1.86 *****Named entity recognition (NER*****) aims to identify text spans that mention named entities and classify them into pre-defined categories. | ||
| P19-1585 *****Named entity recognition (NER*****) is one of the best studied tasks in natural language processing. | ||
| L14-1339 *****Named entity recognition (NER*****) is a knowledge-intensive information extraction task that is used for recognizing textual mentions of entities that belong to a predefined set of categories, such as locations, organizations and time expressions. | ||
| P19-1587 *****Named entity recognition (NER*****) is the backbone of many NLP solutions. | ||
| e-commerce | 24 | |
| L14-1309 In recent years we have observed two parallel trends in computational linguistics research and *****e-commerce***** development. | ||
| 2021.eacl-demos.8 With the increasing number of user comments in diverse domains, including comments on online journalism and *****e-commerce***** websites, the manual content analysis of these comments becomes time-consuming and challenging. | ||
| 2021.mtsummit-research.20 Product reviews provide valuable feedback from customers; however, they are available today only in English on most of the *****e-commerce***** platforms. | ||
| W17-2004 In this paper, we study how humans perceive the use of images as an additional knowledge source to machine-translate user-generated product listings in an *****e-commerce***** company. | ||
| E17-1091 The cataloging of product listings through taxonomy categorization is a fundamental problem for any *****e-commerce***** marketplace, with applications ranging from personalized search recommendations to query understanding. | ||
| NLP datasets | 23 | |
| 2020.acl-main.244 We systematically measure out-of-distribution (OOD) generalization for seven ***** NLP datasets ***** by constructing a new robustness benchmark with realistic distribution shifts. | ||
| 2021.emnlp-main.457 Experiments on noisy and corrupted ***** NLP datasets ***** show that proposed instance-adaptive training frameworks help increase the noise-robustness provided by such losses, promoting the use of the frameworks and associated losses in NLP models trained with noisy data. | ||
| 2021.naacl-main.88 Since there's only a handful of datasets for any NLP problem, meta-learners tend to overfit their adaptation mechanism and, since ***** NLP datasets ***** are highly heterogeneous, many learning episodes have poor transfer between their support and query sets, which discourages the meta-learner from adapting. | ||
| N19-4010 The framework also implements standard model training and hyperparameter selection routines, as well as a data fetching module that can download publicly available ***** NLP datasets ***** and convert them into data structures for quick set up of experiments. | ||
| 2021.gem-1.11 Nevertheless, the adoption of standard documentation practices across the field of NLP promotes more accessible and detailed descriptions of ***** NLP datasets ***** and models, while supporting researchers and developers in reflecting on their work. | ||
| transliterations | 23 | |
| L10-1013 We describe ScriptTranscriber, an open source toolkit for extracting ***** transliterations ***** in comparable corpora from languages written in different scripts. | ||
| 2004.amta-papers.20 We use a seed list of proper names and ***** transliterations ***** to train a Machine Transliteration Model. | ||
| L08-1584 HeiNER contains 1,547,586 disambiguated English Named Entities together with translations and ***** transliterations ***** to 15 languages. | ||
| 2021.emnlp-main.384 We present models which complete missing text given ***** transliterations ***** of ancient Mesopotamian documents, originally written on cuneiform clay tablets (2500 BCE - 100 CE). | ||
| 2020.aacl-main.40 In addition to our system, we also release a new English-to-Chinese dataset and propose a novel evaluation metric which considers multiple possible ***** transliterations ***** given a source name. | ||
| fragment | 23 | |
| 2020.acl-main.641 To address this concern, we propose to explicitly segment target text into ***** fragment ***** units and align them with their data correspondences. | ||
| 2021.semeval-1.177 The goal of the modifier classification task was to determine whether an associated text ***** fragment ***** served to indicate range, tolerance, mean value, etc. of a quantity. | ||
| P17-1114 Subsequently, a simple feedforward neural network (FFNN) is learned to either reject or predict entity label for each individual text ***** fragment *****. | ||
| 1997.iwpt-1.17 First, we show that an experimental parser runs polynomially in practice on a realistic ***** fragment ***** of Japanese by eliminating spurious ambiguity and excluding genuine ambiguities. | ||
| 2020.semeval-1.187 The purpose of the TC task was to identify an applied propaganda technique given a propaganda text ***** fragment *****. | ||
| repository | 23 | |
| L08-1035 Finally, we report on our experiences with the application of model, API and ***** repository ***** when developing web applications for collection managers in cultural heritage institutions. | ||
| 2012.amta-government.4 The data production effort began with a survey of Dari documents catalogued in a government ***** repository ***** of material obtained from the field in Afghanistan. | ||
| L12-1483 We explain the underlying motivation for such a distributed ***** repository ***** for metadata storage and give a detailed overview on the META-SHARE application and its various components. | ||
| L16-1223 Finally, we also present the web ***** repository ***** that will make the corpus available to different types of users, and will allow its exploitation for research purposes and other applications (e.g. teaching of LSE or design of tasks for signed language assessment). | ||
| R19-1019 The testing was done using outpatient records from a nation-wide ***** repository ***** available for the period 2011-2016. | ||
| lexicalization | 23 | |
| L12-1259 On the one hand, it should help refine existing compound classifications and better explain ***** lexicalization ***** in both languages. | ||
| W89-0235 Then we show how a general Earley-type TAG parser (Schabes and Joshi, 1988) can take advantage of ***** lexicalization *****. | ||
| N19-1377 We also investigated the effect of ***** lexicalization ***** on language generation, and found that ***** lexicalization ***** schemes that give priority to content words have certain advantages over those focusing on dependency relations. | ||
| Q19-1005 Our experiments show that unlexicalized models systematically achieve higher results than lexicalized models, and provide additional empirical evidence that ***** lexicalization ***** is not necessary to achieve strong parsing results. | ||
| W18-5528 Many approaches to automatically recognizing entailment relations have employed classifiers over hand-engineered lexicalized features, or deep learning models that implicitly capture ***** lexicalization ***** through word embeddings. | ||
| validating | 23 | |
| 2021.ranlp-1.76 After ***** validating ***** the ranking, we propose methods to use it to quantify the magnitude of bias in political news articles. | ||
| 2021.acl-long.566 MT evaluations in recent papers tend to copy and compare automatic metric scores from previous work to claim the superiority of a method or an algorithm without confirming neither exactly the same training, ***** validating *****, and testing data have been used nor the metric scores are comparable. | ||
| S19-2128 We obtain significantly better results in the leader-board for Sub-task B and decent results for Sub-task A and Subtask C ***** validating ***** the fact that the proposed models can be used for automating the offensive post-detection task in social media. | ||
| W16-5405 The inter-agreement calculated by Cohen's Kappa among raters after ***** validating ***** is 0.685. | ||
| L08-1084 The result of the listening test was compared with an earlier test ***** validating ***** a part of the French corpus. | ||
| Intuitively | 23 | |
| 2020.coling-main.475 ***** Intuitively *****, the news from trusted and authoritative sources or shared by many users with a good reputation is more reliable than other news. | ||
| 2020.coling-main.416 ***** Intuitively *****, the closer they are to natural languages, the higher the gains from pretraining on them should be. | ||
| 2021.emnlp-main.766 ***** Intuitively *****, the pointer-generator should outperform neural-retrieval, and retrieve-and-edit should perform best. | ||
| 2021.fever-1.2 ***** Intuitively *****, the retrieval of the correct evidence plays a crucial role in this process. | ||
| I17-2026 ***** Intuitively *****, the learned knowledge of one task should inform the other learning task. | ||
| Ideally | 23 | |
| 2021.naacl-main.262 ***** Ideally *****, an AI agent should answer a human-like reply and validate the correctness of any answer. | ||
| D17-1206 ***** Ideally *****, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. | ||
| D19-1500 ***** Ideally *****, we would like to empower users to inform themselves about the issues that matter to them, and enable them to selectively explore these issues. | ||
| L06-1165 ***** Ideally *****, the student might benefit the most from having a human expert a teacher or trainer at hand throughout, but human expertise remains a scarce resource. | ||
| 2021.splurobonlp-1.7 ***** Ideally *****, people who navigate together in a complex indoor space share a mental model that facilitates explanation. | ||
| logographic | 23 | |
| 2021.cl-3.16 Taxonomies of writing systems since Gelb (1952) have classified systems based on what the written symbols represent: if they represent words or morphemes, they are ***** logographic *****; if syllables, syllabic; if segments, alphabetic; and so forth. | ||
| W17-4109 Chinese script is ***** logographic ***** and many Chinese logograms are composed of common substructures that provide semantic, phonetic and syntactic hints. | ||
| D18-1320 In this work, we propose a multimodal approach to predict the pronunciation of Cantonese ***** logographic ***** characters, using neural networks with a geometric representation of logographs and pronunciation of cognates in historically related languages. | ||
| W18-6303 This study focuses on these differences and uses a simple approach to improve the performance of NMT systems utilizing decomposed sub-character level information for ***** logographic ***** languages. | ||
| D19-6404 Chinese characters are unique in their *****logographic***** nature, which inherently encodes world knowledge through thousands of years of evolution. | ||
| rerank | 23 | |
| 2020.sustainlp-1.14 Researchers have proposed simple yet effective techniques for the retrieval problem based on using BERT as a relevance classifier to ***** rerank ***** initial candidates from keyword search. | ||
| S17-2050 B. The objective is to ***** rerank ***** questions obtained in web forum as per their similarity to original question. | ||
| P19-1306 By including query likelihood scores as extra features, our model effectively learns to ***** rerank ***** the retrieved documents by using a small number of relevance labels for low-resource language pairs. | ||
| 2021.naacl-main.363 We specifically emphasize on the importance of retrieving evidence jointly by showing several comparative analyses to other methods that retrieve and ***** rerank ***** evidence sentences individually. | ||
| 2021.acl-long.317 Specifically, we first select the candidate answers relevant to the question or the image, then we ***** rerank ***** the candidate answers by a visual entailment task, which verifies whether the image semantically entails the synthetic statement of the question and each candidate answer. | ||
| iterations | 23 | |
| 2020.aacl-main.45 The self-supervised refinement achieves most machine translation gains in the first iteration, but following ***** iterations ***** further improve its intrinsic evaluation. | ||
| L06-1272 The paper presents a hybrid sentence aligner that has two alignment ***** iterations *****. | ||
| P18-2048 In this approach, a weight is assigned to each sentence based on the measured difference between the training costs of two ***** iterations *****. | ||
| 2021.mtsummit-research.11 We stop the training of the Denoising UNMT model after a pre-decided number of ***** iterations ***** and resume the training for the remaining ***** iterations ***** (also a pre-decided number) using the original sentence as input without adding any noise. | ||
| 2021.acl-srw.23 We show that tokens do not require the same amount of ***** iterations ***** and that difficult or crucial tokens for the task are subject to more ***** iterations *****. | ||
| preserving | 23 | |
| 2021.ranlp-1.27 The style transfer task (here style is used in a broad “authorial” sense with many aspects including register, sentence structure, and vocabulary choice) takes text input and rewrites it in a specified target style ***** preserving ***** the meaning, but altering the style of the source text to match that of the target. | ||
| 2020.findings-emnlp.400 We present and investigate two self-supervised objectives: ***** preserving ***** latent consistency and modeling conversational behavior. | ||
| L10-1064 This resource, combined with a syntactic parser, a semantic disambiguator and some derivational patterns, helps to reformulate an original sentence while keeping the initial meaning in a convincing manner. This approach has been evaluated in three different ways: the precision of the derivatives produced from a lemma; its ability to provide well-formed reformulations from an original sentence, ***** preserving ***** the initial meaning; and its impact on the results coping with a real issue, i.e., a question answering task. | ||
| 2020.acl-main.138 Extensive experiments on English and German named entity recognition benchmarks confirmed that NAT consistently improved robustness of popular sequence labeling models, ***** preserving ***** accuracy on the original input. | ||
| 2020.privatenlp-1.5 We study the feasibility of learning a language model which is simultaneously high-quality and privacy ***** preserving ***** by tuning a public base model on a private corpus. | ||
| FEVEROUS | 23 | |
| 2021.fever-1.12 In this paper, we tackle the ***** FEVEROUS ***** (Fact Extraction and VERification Over Unstructured and Structured information) challenge which consists of an open source baseline system together with a benchmark dataset containing 87,026 verified claims. | ||
| 2021.fever-1.1 Unlike FEVER 2018, ***** FEVEROUS ***** requires partial evidence to be returned for NotEnoughInfo claims, and the claims are longer and thus more complex. | ||
| 2021.fever-1.7 The ***** FEVEROUS ***** shared task introduces a benchmark for fact verification, in which a system is challenged to verify the given claim using the extracted evidential elements from Wikipedia documents. | ||
| 2021.fever-1.10 This paper describes a method for retrieving evidence and predicting the veracity of factual claims on the *****FEVEROUS***** dataset. | ||
| 2021.fever-1.3 This paper presents an end-to-end system for fact extraction and verification using textual and tabular evidence, the performance of which we demonstrate on the *****FEVEROUS***** dataset. | ||
| coding | 23 | |
| L16-1287 We wished to examine whether human annotators could agree on ***** coding ***** this difficult task and whether Machine Learning (ML) could be applied reliably to replicate the ***** coding ***** process on a much larger scale. | ||
| 2021.ccl-1.106 We then perform Part-of-Speech attention ***** coding ***** for the character-level embedding and semantic Graph Convolutional Network ***** coding ***** for the spliced character-word embedding. | ||
| L12-1441 Based on our experience with iterative guideline refinement we propose to carefully characterize the thematic scope of the annotation by positive and negative ***** coding ***** lists and allow for alternative, short vs. long mention span annotations. | ||
| L10-1131 Dybkjaer and Bernsen (2002), point out that ***** coding ***** schemes for multimodal data are used solely by their creators. | ||
| 2020.emnlp-main.289 Targeting this issue, we regard the task as a sequence labeling problem and propose a novel tagging scheme with ***** coding ***** the distance between linked components into the tags, so that emotions and the corresponding causes can be extracted simultaneously | ||
| XLM | 23 | |
| 2021.ranlp-1.27 First, we show that adding “content embeddings” to the ***** XLM ***** which capture human-specified groupings of subject matter can improve performance over the baseline model. | ||
| 2020.multilingualbio-1.5 The corpus has been evaluated with three methods using the BERT model applied to Spanish: Multilingual BERT, BETO and ***** XLM *****. | ||
| 2021.naacl-main.10 We find that BERT and ***** XLM ***** models successfully predict a range of eye tracking features. | ||
| 2021.bsnlp-1.13 In it we evaluate various pre-trained language models on the NER task using three open-source NLP toolkits: a character-level language model with Stanza, language-specific BERT-style models with SpaCy, and Adapter-enabled *****XLM*****-R with Trankit. | ||
| 2021.semeval-1.98 We experiment with *****XLM***** RoBERTa for Word in Context Disambiguation in the Multilingual and Cross-Lingual setting so as to develop a single model having knowledge about both settings. | ||
| analytic | 23 | |
| D18-2007 In this work, we present a flexible visualization library for creating customized visual ***** analytic ***** environments, in which the user can investigate and interrogate the relationships among the input, the model internals (i.e., attention), and the output predictions, which in turn shed light on the model decision-making process. | ||
| 2010.amta-government.4 This approach will accelerate HLT development, contain sustainment cost, minimize training, and brings the MT, OCR, ASR, audio/video, entity extraction, ***** analytic ***** tools and database under one umbrella, thus reducing the total cost of ownership. | ||
| 2021.sigtyp-1.2 Instead, ***** analytic ***** and synthetic must be viewed as two poles of a continuum, and languages may show a mix of ***** analytic ***** and synthetic strategies to different degrees. | ||
| W19-4433 We first propose and formalize two novel ***** analytic *****al assessment tasks: ***** analytic ***** score prediction and justification identification, and then provide the first dataset created for ***** analytic ***** short answer scoring research. | ||
| 2021.eacl-srw.17 We expect this research project's outcomes could contribute to the research domains of NLP and AI but also the healthcare field by providing a more accessible and affordable sleep treatment solution and an automated ***** analytic ***** system to lessen the burden of human experts. | ||
| diacritic | 23 | |
| N18-4008 In this work, we applied embedding models to the ***** diacritic ***** restoration task and compared their performances to those of n-gram models. | ||
| W18-4004 Our results indicate that the projected models not only outperform the trained ones on the semantic-based tasks of analogy, word-similarity, and odd-word identifying, but they also achieve enhanced performance on the ***** diacritic ***** restoration with learned ***** diacritic ***** embeddings. | ||
| L12-1440 This article presents the problem of ***** diacritic ***** restoration (or diacritization) in the context of spell-checking, with the focus on an orthographically rich language such as Spanish. | ||
| 2020.wanlp-1.13 In NER, our architecture incorporates ***** diacritic ***** and POS embeddings alongside word and character embeddings. | ||
| L08-1361 The Arabic Treebank (ATB), released by the Linguistic Data Consortium, contains multiple annotation files for each source file, due in part to the role of *****diacritic***** inclusion in the annotation process. | ||
| delexicalized | 23 | |
| 2020.coling-industry.10 We present a neural model for paraphrasing and train it to generate ***** delexicalized ***** sentences. | ||
| C16-1012 This paper studies cross-lingual transfer for dependency parsing, focusing on very low-resource settings where ***** delexicalized ***** transfer is the only fully automatic option. | ||
| 2021.emnlp-main.558 We propose Group Learning, a knowledge and model distillation approach for fact verification in which multiple student models have access to different ***** delexicalized ***** views of the data, but are encouraged to learn from each other through pair-wise consistency losses. | ||
| 2020.lrec-1.850 These datasets are ***** delexicalized ***** using two methods: one which replaces the lexical entities in an overlap-aware manner, and a second, which additionally incorporates semantic lifting of nouns and verbs to their WordNet hypernym synsets | ||
| 2020.vardial-1.13 We observe that parsing models trained on Occitan dialects achieve better results than a ***** delexicalized ***** model trained on other Romance languages despite the latter training corpus being much larger (20K vs. 900K tokens). | ||
| quadratic | 23 | |
| P18-1107 This transformation at most doubles the grammar's rank and cubes its size, but we show that in practice the size increase is only ***** quadratic *****. | ||
| 2020.lrec-1.157 Overall, the BERT model achieved the best root mean squared error and ***** quadratic ***** weighted kappa scores. | ||
| 2021.emnlp-main.127 However, Transformer's ***** quadratic ***** complexity with respect to the input sequence length prevents its adoption as is with audio signals, which are typically represented by long sequences. | ||
| D18-1330 Existing works typically solve a ***** quadratic ***** problem to learn a orthogonal matrix aligning a bilingual lexicon, and use a retrieval criterion for inference. | ||
| D18-1090 In order to address this issue, we propose a reinforcement learning framework for essay scoring that incorporates ***** quadratic ***** weighted kappa as guidance to optimize the scoring system. | ||
| audiovisual | 23 | |
| L14-1190 The project aims to collect, share and reuse ***** audiovisual ***** language resources from broadcasters and subtitling companies to develop large vocabulary continuous speech recognisers in specific domains and new languages, with the purpose of solving the automated subtitling needs of the media industry. | ||
| L16-1736 In the paper, we report the methodology of collecting and processing analog ***** audiovisual ***** material, constructing the corpus and describe the properties of the data. | ||
| 2021.triton-1.15 Results show that NMT engines offer good-quality translations, which in turn may benefit translators working with ***** audiovisual ***** entertainment resources. | ||
| L12-1600 The eye-tracking analysis reveals significant changes in the gaze behavior of the human subjects; participants reduce their focus of attention in the ***** audiovisual ***** condition mainly to the region of the face of the politician and scan the upper body, including hands and arms, in the video only condition | ||
| 2009.jeptalnrecital-court.34 Semantic access to multimedia content in *****audiovisual***** archives is to a large extent dependent on the quantity and quality of the metadata, and particularly the content descriptions that are attached to the individual items. | ||
| span | 23 | |
| 2020.findings-emnlp.398 In this paper, we propose a novel joint model of syntactic and semantic parsing on both ***** span ***** and dependency representations, which incorporates syntactic information effectively in the encoder of neural network and benefits from two representation formalisms in a uniform way. | ||
| D19-5903 When event mentions are allowed to cover many tokens, annotators may disagree on their ***** span *****, which means that overlapping annotations may then refer to the same event or to different events. | ||
| 2021.conll-1.18 Furthermore, our results show that for certain linguistic phenomena which are not limited to one or two words (such as word ambiguity or gender) but ***** span ***** over several words or even entire phrases (such as negation or relative clause), disagreements do not necessarily represent “errors” or “noise” but are rather inherent to the evaluation process. | ||
| 2021.emnlp-main.395 Using a relatively small number of ***** span ***** constraints we can substantially improve the output from DIORA, an already competitive unsupervised parsing system. | ||
| 2021.emnlp-main.686 However, this decoupling of ***** span ***** detection and classification is problematic from a modelling perspective and ignores global structural correspondences between sentence-level and word-level information present in a given text. | ||
| multimodal corpora | 23 | |
| 2021.emnlp-main.633 To this purpose, we propose a method to build synthetic ***** multimodal corpora ***** enabling to train multimodal components for a data-QuestEval metric. | ||
| 2020.framenet-1.4 This paper reports on research aiming to determine appropriate methodology and develop a computational tool to annotate ***** multimodal corpora ***** according to a principled structured semantic representation of events, relations and entities: FrameNet. | ||
| L10-1501 To handle these kinds of data sets, this paper introduces the Ariadne Corpus Management System that allows researchers to manage and create ***** multimodal corpora ***** from multiple heterogeneous data sources. | ||
| L10-1058 This paper presents the ***** multimodal corpora ***** that are being collected and annotated in the Nordic NOMCO project. | ||
| L04-1166 To investigate the temporal relationships of these knowledge sources, we have collected and are annotating several ***** multimodal corpora ***** with time-aligned features. | ||
| interpersonal | 23 | |
| D19-1179 Stylistic variation in text needs to be studied with different aspects including the writer's personal traits, ***** interpersonal ***** relations, rhetoric, and more. | ||
| 2020.sigdial-1.25 We found that such filtering does help in observing convergence suggesting that studies on ***** interpersonal ***** dynamics should consider such high level dialogue activity types and their related NLP topics as important ingredients of their toolboxes. | ||
| W17-2908 In this paper, we present a novel approach to study and measure ***** interpersonal ***** influence in daily interactions. | ||
| 2021.acl-long.54 Moreover, we formulate a sequential structure prediction task, and propose an α-β-γ strategy to incrementally parse SocAoG for the dynamic inference upon any incoming utterance: (i) an α process predicting attributes and relations conditioned on the semantics of dialogues, (ii) a β process updating the social relations based on related attributes, and (iii) a γ process updating individual's attributes based on ***** interpersonal ***** social relations. | ||
| L14-1225 In this paper, we present a method for extracting and generating sequences of non-verbal signals expressing ***** interpersonal ***** attitudes. | ||
| settings | 23 | |
| N18-1080 We further adapt to ***** settings ***** without mention-level annotation by jointly training to predict named entities and adding a corpus of weakly labeled data. | ||
| L16-1221 However, many studies do not include tests in real ***** settings *****, because data collection in this domain is very expensive and challenging and because of the few available data sets. | ||
| W18-1201 Neural machine translation has achieved impressive results in the last few years, but its success has been limited to ***** settings ***** with large amounts of parallel data. | ||
| 2020.wmt-1.64 It is hard to predict in which ***** settings ***** it will be effective, and what limits performance compared to a fully supervised system. | ||
| P19-1095 The majority of the studies focus on ***** settings ***** where both modalities are available in training and evaluation. | ||
| coreference resolvers | 23 | |
| P17-1009 While joint models have been developed for many NLP tasks, the vast majority of event ***** coreference resolvers *****, including the top-performing resolvers competing in the recent TAC KBP 2016 Event Nugget Detection and Coreference task, are pipeline-based, where the propagation of errors from the trigger detection component to the event coreference component is a major performance limiting factor. | ||
| 2020.codi-1.16 This is because the CoNLL benchmark fails to evaluate the ability of ***** coreference resolvers ***** that requires linking novel mentions unseen at train time. | ||
| W19-3819 As an example, Google AI Language team recently released a gender-balanced dataset and showed that performance of these ***** coreference resolvers ***** is significantly limited on the dataset. | ||
| L14-1570 The system is developed to provide starting ground for further experiments and generate a reference baseline to be compared with more advanced rule-based and machine learning based future ***** coreference resolvers ***** | ||
| P17-2003 We show that if ***** coreference resolvers ***** mainly rely on lexical features, they can hardly generalize to unseen domains. | ||
| theoretical | 23 | |
| K18-1029 Simple reference games are of central ***** theoretical ***** and empirical importance in the study of situated language use. | ||
| L16-1243 Treebanks are important resources for researchers in natural language processing, speech recognition, ***** theoretical ***** linguistics, etc. | ||
| L10-1245 In section 2, we address ***** theoretical ***** and practical issues, emphasizing the outstanding features of the Creagest Project. | ||
| W18-5819 In contrast, ***** theoretical ***** linguistics usually works with deterministic generalizations. | ||
| L16-1368 The results indicate that there is considerable variation in treatments across treebanks and thereby also, to some extent, across languages and across ***** theoretical ***** frameworks. | ||
| layers | 23 | |
| 2020.findings-emnlp.389 Peeking into the inner workings of BERT has shown that its ***** layers ***** resemble the classical NLP pipeline, with progressively more complex tasks being concentrated in later ***** layers *****. | ||
| 2020.lrec-1.825 This is done by introducing 2-dimensional convolutional self-attention into the first ***** layers ***** of the encoder. | ||
| D17-1191 Contrary to popular belief that ResNet only works well for very deep networks, we found that even with 9 ***** layers ***** of CNNs, using identity mapping could significantly improve the performance for distantly-supervised relation extraction. | ||
| 2020.acl-srw.10 We propose a modification to the Transformer model to combine subword-level representations into word-level ones in the first ***** layers ***** of the encoder, reducing the effective length of the sequences in the following ***** layers ***** and providing a natural point to incorporate extra word-level information. | ||
| 2021.naacl-main.162 We first take into consideration all the linguistic information embedded in the past ***** layers ***** and then take a further step to engage the future information which is originally inaccessible for predictions. | ||
| dense | 23 | |
| S19-2004 We propose a combined approach that employs two different types of vector representations: ***** dense ***** representations from hidden layers of a masked language model, and sparse representations based on substitutes for the target word in the context. | ||
| 2021.emnlp-main.78 However, neural models' ***** dense ***** representations are more suitable for re-ranking, due to their inefficiency. | ||
| 2021.wmt-1.116 Because the network is still ***** dense *****, efficient matrix multiply routines are still used and only minimal software changes are required to support variable layer sizes. | ||
| 2021.acl-long.518 In this work, we show for the first time that we can learn ***** dense ***** representations of phrases alone that achieve much stronger performance in open-domain QA. | ||
| 2021.emnlp-main.148 In this paper, we present a novel approach to zero-shot slot filling that extends ***** dense ***** passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models | ||
| length | 23 | |
| L14-1417 Regression models were built to predict ***** length ***** from cognitive abilities and user satisfaction from ***** length *****. | ||
| 2020.tacl-1.11 Across both soft and hard attention, we show strong theoretical limitations of the computational abilities of self-attention, finding that it cannot model periodic finite-state languages, nor hierarchical structure, unless the number of layers or heads increases with input ***** length *****. | ||
| 2020.coling-main.319 The average translation results using our ***** length ***** prediction model were also better than another baseline method using input ***** length *****s for the ***** length ***** constraints. | ||
| W16-4124 To obtain the estimates for data ***** length ***** tending to infinity, we use an extrapolation function given by an ansatz. | ||
| W18-1704 Additional tests, which take advantage of the fact that the ***** length ***** of compressions can be modulated, still improve ROUGE scores with shorter output sentences. | ||
| noise | 23 | |
| 2021.nlp4convai-1.7 By leveraging cross-***** noise ***** robustness transfer, i.e. training on one ***** noise ***** type to improve robustness on another ***** noise ***** type, we design aggregate data-augmentation approaches that increase the model performance across all seven ***** noise ***** types by +10.8% for IC accuracy and +15 points for SL F1 on average. | ||
| 2021.emnlp-main.448 We then show that, in some languages, ***** noise ***** mediates the two forms of generalization: ***** noise ***** applied to input tokens encourages syntactic generalization, while ***** noise ***** in history representations encourages lexical generalization. | ||
| 2021.semeval-1.186 However, the major obstacles to using crowd-sourced labels are ***** noise ***** and errors from non-expert annotations. | ||
| C18-1239 Compared to news titles, full documents can contain more potentially helpful information, but also ***** noise ***** compared to events and sentences, which has been less investigated in previous work. | ||
| 2020.semeval-1.145 At the same time, a new loss function is designed to ensure that the model is not affected by input ***** noise ***** which will improve the robustness of the model | ||
| generate | 23 | |
| N19-2027 We propose the use of a ***** generate *****, filter, and rank framework, in which candidate responses are first filtered to eliminate unacceptable responses, and then ranked to select the best response. | ||
| I17-1079 We show that this model can be used to (a) ***** generate ***** plausible reviews and estimate nuanced reactions; (b) provide personalized rankings of existing reviews; and (c) recommend existing products more effectively. | ||
| 2021.acl-long.238 Whereas the prior work has mostly focused on proposing QA models for this dataset, our aim is to retrieve as well as ***** generate ***** explanation for a given (question, correct answer choice, incorrect answer choices) tuple from this dataset. | ||
| D19-1307 Human evaluation experiments show that, compared to the state-of-the-art supervised-learning systems and ROUGE-as-rewards RL summarisation systems, the RL systems using our learned rewards during training ***** generate ***** summaries with higher human ratings. | ||
| 2021.wassa-1.6 In social media analysis, this problem surfaces for demographic user classes such as language, topic, or gender, which influence the ***** generate ***** text to a substantial extent | ||
| understanding tasks | 23 | |
| 2021.calcs-1.20 Multilingual language models have shown decent performance in multilingual and cross-lingual natural language ***** understanding tasks *****. | ||
| 2021.eacl-main.159 We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language ***** understanding tasks *****. | ||
| C18-1187 Further, the use of Aff2Vec representations outperforms baseline embeddings in downstream natural language ***** understanding tasks ***** including sentiment analysis, personality detection, and frustration prediction. | ||
| 2021.emnlp-main.51 We compare our approach, CAL (Contrastive Active Learning), with a diverse set of acquisition functions in four natural language ***** understanding tasks ***** and seven datasets. | ||
| 2020.lrec-1.583 We conduct an in-depth evaluation of nine well known natural language ***** understanding tasks ***** with SentEval. | ||
| computational models | 23 | |
| W18-1401 The challenge for ***** computational models ***** of spatial descriptions for situated dialogue systems is the integration of information from different modalities. | ||
| 2021.cmcl-1.20 In this work we present three ***** computational models ***** to predict clause final verbs in Hindi given its prior arguments. | ||
| 2020.lrec-1.629 Along with a detailed description of the dataset, we evaluate several ***** computational models ***** trained and tested on this data. | ||
| 2020.coling-main.402 However, a large-scale theory-based corpus and corresponding ***** computational models ***** are missing. | ||
| P19-1350 We test what impact task difficulty has on continual learning, and whether the order in which a child acquires question types facilitates ***** computational models *****. | ||
| linguistic structure | 23 | |
| D19-1275 But does this mean that the representations encode ***** linguistic structure ***** or just that the probe has learned the linguistic task? | ||
| 2021.blackboxnlp-1.13 To test what BERT knows about metaphors, we challenge it on a new dataset that we designed to test various aspects of this phenomenon such as variations in ***** linguistic structure *****, variations in conventionality, the boundaries of the plausibility of a metaphor and the interpretations that we attribute to metaphoric expressions. | ||
| P19-4001 They are appealing for two main reasons: they allow incorporating structural bias during training, leading to more accurate models; and they allow discovering hidden ***** linguistic structure *****, which provides better interpretability. | ||
| 2020.findings-emnlp.67 We show that our annotation captures important ***** linguistic structure *****s including predicate-argument structure, modification and ellipsis. | ||
| D18-1457 Advanced neural machine translation (NMT) models generally implement encoder and decoder as multiple layers, which allows systems to model complex functions and capture complicated ***** linguistic structure *****s. | ||
| biomedical translation | 23 | |
| 2020.wmt-1.95 This paper describes the machine translation systems developed by the University of Sheffield (UoS) team for the ***** biomedical translation ***** shared task of WMT20. | ||
| W18-6446 The systems were used for our participation in the WMT18 ***** biomedical translation ***** task and in the shared task on machine translation of news. | ||
| W19-5363 The first two were made with a baseline translator, trained on clean data for the WMT 2019 ***** biomedical translation ***** task. | ||
| W19-5422 This paper describes the machine translation systems developed by the Barcelona Supercomputing (BSC) team for the ***** biomedical translation ***** shared task of WMT19. | ||
| 2021.wmt-1.22 For the ***** biomedical translation ***** task, we have developed resource-heavy systems for the English-French language pair, using both out-of-domain and in-domain corpora. | ||
| sigmorphon | 23 | |
| 2020.***** sigmorphon *****-1.26 Tone is a prosodic feature used to distinguish words in many languages, some of which are endangered and scarcely documented. | ||
| 2020.***** sigmorphon *****-1.1 Most systems, however, are developed using data from just one language such as English. | ||
| 2021.***** sigmorphon *****-1.20 Suggested by previous literature, this class of languages should approach the characterization of natural language word sets. | ||
| 2020.***** sigmorphon *****-1.14 One of the primary goals of our paper was to study the contribution varied components described above towards the performance of our system and perform an analysis on the same. | ||
| 2020.***** sigmorphon *****-1.29 In light of this fact, they are currently incapable of modelling simultaneous phonological processes that would require different tiers. | ||
| introduction | 23 | |
| L10-1476 After a short ***** introduction ***** and a description of related work, we illustrate the annotation process, including a description of the annotation methodology and the developed tool for the annotation process. | ||
| W19-6143 Named Entity Recognition (NER) has greatly advanced by the ***** introduction ***** of deep neural architectures. | ||
| 2006.bcs-1.7 After providing a brief ***** introduction ***** to the transliteration problem, and highlighting some issues specific to Arabic to English translation, a three phase algorithm is introduced as a computational solution to the problem. | ||
| N19-5001 In this tutorial, we provide a gentle ***** introduction ***** to the foundation of deep adversarial learning, as well as some practical problem formulations and solutions in NLP. | ||
| 2020.alta-1.15 Our results show that the probability of the occurrence of repetitive loops is significantly reduced by ***** introduction ***** of an extra neural decoder output. | ||
| knowledge base question | 23 | |
| N19-1299 However, most existing embedding-based methods for ***** knowledge base question ***** answering (KBQA) ignore the subtle inter-relationships between the question and the KB (e.g., entity types, relation paths and context). | ||
| P19-1616 Relation detection is a core step in many natural language process applications including ***** knowledge base question ***** answering. | ||
| 2021.naacl-demos.3 We present NAMER, an open-domain Chinese ***** knowledge base question ***** answering system based on a novel node-based framework that better grasps the structural mapping between questions and KB queries by aligning the nodes in a query with their corresponding mentions in question. | ||
| S18-2007 The first stage of every ***** knowledge base question ***** answering approach is to link entities in the input question. | ||
| 2021.acl-demo.39 We present Retriever-Transducer-Checker (ReTraCk), a neural semantic parsing framework for large scale ***** knowledge base question ***** answering (KBQA). | ||
| logistic regression | 23 | |
| 2021.semeval-1.151 We utilize a fusion of ***** logistic regression *****, decision tree, and fine-tuned DistilBERT for tackling subtask 1. | ||
| D17-1163 We present a newly collected police fatality corpus, which we release publicly, and present a model to solve this problem that uses EM-based distant supervision with ***** logistic regression ***** and convolutional neural network classifiers. | ||
| W18-4418 To further boost the performance, we combine this neural net with three ***** logistic regression ***** classifiers trained on character and word n-grams, and hand-picked syntactic features. | ||
| 2021.vardial-1.3 At the methodological level, we provide ***** logistic regression ***** as a framework to perform bottom-up feature selection in order to quantify differences across language varieties. | ||
| 2021.smm4h-1.14 We investigated various machine learning algorithms (***** logistic regression *****, SVM and Neural Networks) to address text classification. | ||
| automatic construction | 23 | |
| 2020.lrec-1.770 In this article we will describe preliminary work on the TArC semi-***** automatic construction ***** process and some of the first analyses we developed on TArC. | ||
| 2020.lrec-1.827 As a final contribution, we show the usefulness of our ***** automatic construction ***** approach by running state-of-the-art summarizers on the corpora and through a manual evaluation with human annotators. | ||
| L12-1297 The extracted semantic relation is-a is used for the ***** automatic construction ***** of a thesaurus for image indexing and retrieval. | ||
| 2021.bea-1.6 This paper investigates ***** automatic construction ***** of these character sets. | ||
| 2020.lt4hala-1.2 In this paper, we suggest a methodology for ***** automatic construction ***** of Aramaic-Hebrew translation Lexicon. | ||
| language interfaces | 23 | |
| I17-1091 To this end, we introduce a large dataset extracted from the Stack Exchange Data Explorer website, which can be used for training neural natural ***** language interfaces ***** for databases. | ||
| 2020.coling-main.31 Text-to-SQL systems offer natural ***** language interfaces ***** to databases, which can automatically generate SQL queries given natural language questions. | ||
| 2021.hcinlp-1.2 Given the more widespread nature of natural ***** language interfaces *****, it is increasingly important to understand who are accessing those interfaces, and how those interfaces are being used. | ||
| 2005.mtsummit-swtmt.2 However the implementation of natural ***** language interfaces ***** faces often the problem of lack of linguistic and world-knowledge, especially when the application domain is not very specific. | ||
| L10-1567 Voice over IP and cloud computing are poised to greatly reduce this impediment to research on spoken ***** language interfaces ***** in many domains. | ||
| constituent parsing | 23 | |
| E17-1118 This article introduces a novel transition system for discontinuous lexicalized ***** constituent parsing ***** called SR-GAP. | ||
| C18-1011 We investigate two conceptually simple local neural models for ***** constituent parsing *****, which make local decisions to constituent spans and CFG rules, respectively. | ||
| D17-1072 Finally, we propose three benchmark approaches by casting MWS as ***** constituent parsing ***** and sequence labeling. | ||
| D18-1161 We introduce novel dynamic oracles for training two of the most accurate known shift-reduce algorithms for ***** constituent parsing *****: the top-down and in-order transition-based parsers. | ||
| C16-1057 A chosen baseline dependency parsing model performs only on `carved' sequences at the second stage, which are transformed from coarse ***** constituent parsing ***** outputs at the first stage. | ||
| response selection | 23 | |
| 2021.acl-long.137 We study the learning of a matching model for dialogue ***** response selection *****. | ||
| 2021.naacl-main.264 Instead of relying on more general pretraining objectives from prior work (e.g., language modeling, ***** response selection *****), ConVEx's pretraining objective, a novel pairwise cloze task using Reddit data, is well aligned with its intended usage on sequence labeling tasks. | ||
| K18-1048 Experimental results show that our model outperforms all other state-of-the-art methods for ***** response selection ***** in multi-turn conversations. | ||
| D19-1205 This paper proposes an end-to-end multi-task model for conversation modeling, which is optimized for two tasks, dialogue act prediction and ***** response selection *****, with the latter being the task of interest. | ||
| P19-1006 To address these issues, we propose a Spatio-Temporal Matching network (STM) for ***** response selection *****. | ||
| entity extraction | 23 | |
| 2020.wnut-1.39 This paper presents our team's work on WNUT 2020 shared task 1, wet lab entity extraction, for which we studied several models, including a BiLSTM-CRF model and a BERT-based model that can be used to complete wet lab ***** entity extraction *****. | ||
| E17-1109 In this paper, we propose a novel approach of feature selection for ***** entity extraction ***** that exploits the concept of deep learning and Particle Swarm Optimization (PSO). | ||
| C16-1225 In this paper, we propose a Korean spatial ***** entity extraction ***** model and a spatial relation extraction model; the spatial ***** entity extraction ***** model uses word vectors to alleviate the over generation and the spatial relation extraction mod-el uses dependency parse labels to find the proper arguments in relations. | ||
| 2021.acl-long.486 Specifically, we design a component to predict potential relations, which constrains the following ***** entity extraction ***** to the predicted relation subset rather than all relations; then a relation-specific sequence tagging component is applied to handle the overlapping problem between subjects and objects; finally, a global correspondence component is designed to align the subject and object into a triple with low-complexity. | ||
| 2020.emnlp-main.678 Consequently, while the methods proposed in literature perform well for generic date-time extraction from texts, they don't fare as well on task specific date-time ***** entity extraction ***** where only a subset of the date-time entities present in the text are pertinent to solving the task. | ||
| semantic knowledge | 23 | |
| W18-3003 (iii) Projecting the vector space using Linear Discriminant Analysis, which eliminates the expanded dimension(s) with ***** semantic knowledge *****. | ||
| 2020.acl-main.83 To bridge the gap, we proposed a novel Frame-based Sentence Representation (FSR) method, which employs frame ***** semantic knowledge ***** to facilitate sentence modelling. | ||
| S18-2009 In this work, we expand an existing emotion lexicon, DepecheMood, by leveraging ***** semantic knowledge ***** from English WordNet (EWN). | ||
| 2021.conll-1.31 In addition, we documented a synergy between these two mechanisms, where their alternation allows the model to converge on more balanced ***** semantic knowledge *****. | ||
| E17-5006 VoxML goes beyond the limitations of existing 3D visual markup languages by allowing for the encoding of a broad range of ***** semantic knowledge ***** that can be exploited by a simulation platform such as VoxSim. VoxSim (Krishnaswamy and Pustejovsky, 2016a; Krishnaswamy and Pustejovsky, 2016b) uses object and event ***** semantic knowledge ***** to generate animated scenes in real time without a complex animation interface. | ||
| automatic text simplification | 23 | |
| R19-1131 We use the state-of-the-art ***** automatic text simplification ***** (ATS) system for lexically and syntactically simplifying source sentences, which are then translated with two state-of-the-art English-to-Serbian MT systems, the phrase-based MT (PBMT) and the neural MT (NMT). | ||
| 2018.jeptalnrecital-court.34 Lexical complexity detection is an important step for ***** automatic text simplification ***** which serves to make informed lexical substitutions. | ||
| L12-1113 The corpus is intended for building ***** automatic text simplification ***** for adult readers. | ||
| 2021.newsum-1.16 Furthermore, we present experiments on ***** automatic text simplification ***** with the pretrained multilingual mBART and a modified version thereof that is more memory-friendly, using both our new data set and existing simplification corpora. | ||
| 2021.nlp4posimpact-1.7 To build automated simplification systems, corpora of complex sentences and their simplified versions are the first step to understand sentence complexity and enable the development of ***** automatic text simplification ***** systems. | ||
| semantic classification | 23 | |
| L12-1044 We analyse the fundamental distinction between (a) the coding of surface features; (b) form-related ***** semantic classification *****; and (c) semantic annotation in terms of dialogue acts, supported by experimental studies of (a) and (b). | ||
| 2021.americasnlp-1.15 One problem in the task of automatic ***** semantic classification ***** is the problem of determining the level on which to group lexical items. | ||
| L08-1161 We contrasted SUMO with an existing ***** semantic classification ***** which resulted in a further refined and extended SUMO geared for the description of adjectives. | ||
| 2020.lrec-1.558 In this work we present ScienceExamCER, a densely-labeled ***** semantic classification ***** corpus of 133k mentions in the science exam domain where nearly all (96%) of content words have been annotated with one or more fine-grained semantic class labels including taxonomic groups, meronym groups, verb/action groups, properties and values, and synonyms. | ||
| I17-1100 We propose to unify a variety of existing ***** semantic classification ***** tasks, such as semantic role labeling, anaphora resolution, and paraphrase detection, under the heading of Recognizing Textual Entailment (RTE). | ||
| generating natural | 23 | |
| W19-4103 Sequence-to-Sequence (Seq2Seq) models have witnessed a notable success in ***** generating natural ***** conversational exchanges. | ||
| 2020.inlg-1.20 In this work, we introduce a new dataset and present a neural model for automatically ***** generating natural ***** language summaries for charts. | ||
| W19-4113 In our work, we contribute to the under-explored area of ***** generating natural ***** language explanations for general phenomena. | ||
| N18-1139 In this work, we focus on the task of ***** generating natural ***** language descriptions from a structured table of facts containing fields (such as nationality, occupation, etc) and values (such as Indian, actor, director, etc). | ||
| 2020.tacl-1.2 Abstract meaning representation (AMR)-to-text generation is the challenging task of ***** generating natural ***** language texts from AMR graphs, where nodes represent concepts and edges denote relations. | ||
| social science | 23 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational ***** social science *****. | ||
| 2020.acl-srw.12 To achieve this, I plan to research methods using natural language processing and deep learning while employing models and using analysis concepts from the ***** social science *****s, where researchers have studied media bias for decades. | ||
| 2021.latechclfl-1.18 Despite the increasing popularity of NLP in the humanities and ***** social science *****s, advances in model performance and complexity have been accompanied by concerns about interpretability and explanatory power for sociocultural analysis. | ||
| 2021.nlp4posimpact-1.2 In contrast, it is of enormous prominence in various ***** social science ***** disciplines, and some of that work follows the ”text-as-data” paradigm, seeking to employ quantitative methods for analyzing large amounts of CC-related text. | ||
| P18-1067 Previous works in computer science, as well as political and ***** social science *****, have shown correlation in text between political ideologies and the moral foundations expressed within that text. | ||
| multilingual text | 23 | |
| 2021.acl-long.350 In recent years, we have seen a colossal effort in pre-training ***** multilingual text ***** encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. | ||
| P17-2009 Language identification (LID) is a critical first step for processing ***** multilingual text *****. | ||
| 2021.naacl-industry.16 Transformer-based methods are appealing for ***** multilingual text ***** classification, but common research benchmarks like XNLI (Conneau et al., 2018) do not reflect the data availability and task variety of industry applications. | ||
| I17-4024 We present All-In-1, a simple model for ***** multilingual text ***** classification that does not require any parallel data. | ||
| 2020.coling-main.543 In this paper, we approach the ***** multilingual text ***** classification task in the context of the epidemiological field. | ||
| personality | 23 | |
| 2021.wassa-1.26 We explicitly examine the impact of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and ***** personality ***** detection. | ||
| P18-1205 Chit-chat models are known to have several problems: they lack specificity, do not display a consistent ***** personality ***** and are often not very captivating. | ||
| C18-1156 Also, since the sarcastic nature and form of expression can vary from person to person, CASCADE utilizes user embeddings that encode stylometric and ***** personality ***** features of users. | ||
| 2020.emnlp-main.531 In Psychology, persona has been shown to be highly correlated to ***** personality *****, which in turn influences empathy. | ||
| W16-4310 Detecting depression or ***** personality ***** traits, tutoring and student behaviour systems, or identifying cases of cyber-bulling are a few of the wide range of the applications, in which the automatic detection of emotion is a crucial element. | ||
| image caption generation | 23 | |
| 2020.coling-main.280 Standard ***** image caption generation ***** systems produce generic descriptions of images and do not utilize any contextual information or world knowledge. | ||
| C16-1005 Automatic video description generation has recently been getting attention after rapid advancement in ***** image caption generation *****. | ||
| P19-4006 This introductory tutorial addresses the advances in deep Bayesian learning for natural language with ubiquitous applications ranging from speech recognition to document summarization, text classification, text segmentation, information extraction, ***** image caption generation *****, sentence generation, dialogue control, sentiment classification, recommendation system, question answering and machine translation, to name a few. | ||
| D18-1149 We extensively evaluate the proposed model on machine translation (En-De and En-Ro) and ***** image caption generation *****, and observe that it significantly speeds up decoding while maintaining the generation quality comparable to the autoregressive counterpart. | ||
| W18-2803 We propose that such insights could generalize to other models with similar architecture, including some used in computational linguistics for language modeling, machine translation and ***** image caption generation *****. | ||
| margin loss | 23 | |
| P19-2023 First, we empirically show limitations of two popular loss (sum and max-*****margin loss*****) widely used in training text-image embeddings and propose a trade-off: a kNN-*****margin loss***** which 1) utilizes information from hard negatives and 2) is robust to noise as all K-most hardest samples are taken into account, tolerating pseudo negatives and outliers. | ||
| C16-1289 We introduce two max-*****margin losses***** to train the ConvBRNN model: one for the phrase structure inference and the other for the semantic similarity model. | ||
| P19-1548 We use bidirectional long short-term memory (BiLSTM) network with the *****margin loss***** as the feature extractor. | ||
| 2021.acl-long.391 To alleviate this problem, we apply the angular *****margin loss*****, and perform Gaussian linear transformation to achieve balanced label angle variances, i.e., the variance of label angles of texts within the same label. | ||
| 2020.repl4nlp-1.12 In this work, we address this sentence classification task from a representation learning perspective, using both a bidirectional LSTM and BERT optimized with the following metric learning loss functions: contrastive loss, triplet loss, center loss, congenerous cosine loss and additive angular *****margin loss*****. | ||
| feature attribution | 23 | |
| 2021.acl-long.71 Earlier, cooperative game theory inspired axiomatic methods only borrowed axioms from solution concepts (such as Shapley value) for individual *****feature attributions***** and introduced their own extensions to model interactions. | ||
| 2021.deelio-1.13 In this work, we consider Contextual Decomposition (CD) – a Shapley-based input *****feature attribution***** method that has been shown to work well for recurrent NLP models – and we test the extent to which it is useful for models that contain attention operations. | ||
| 2021.emnlp-main.645 Experiments for explanation faithfulness across five datasets, show that models trained with SaLoss consistently provide more faithful explanations across four different *****feature attribution***** methods compared to vanilla BERT. | ||
| P19-1631 *****Feature attribution***** methods, proposed recently, help users interpret the predictions of complex models. | ||
| 2021.acl-demo.30 This includes (1) gradient-based *****feature attribution***** for natural language generation (2) hidden states and their evolution between model layers (3) convenient access and examination tools for neuron activations in the under-explored Feed-Forward Neural Network sublayer of Transformer layers. | ||
| multilingual dependency | 23 | |
| K17-3011 This paper describes the system of the Team Orange-Deskiñ, used for the CoNLL 2017 UD Shared Task in *****Multilingual Dependency***** Parsing. | ||
| K17-3024 Our submitted parsing system is the grandchild of the first transition-based neural network dependency parser, which was the University of Geneva's entry in the CoNLL 2007 *****multilingual dependency***** parsing shared task, with some improvements to speed and portability. | ||
| K17-3025 We present a *****multilingual dependency***** parser with a bidirectional-LSTM (BiLSTM) feature extractor and a multi-layer perceptron (MLP) classifier. | ||
| N19-1393 Our work investigates the use of high-level language descriptions in the form of typological features for *****multilingual dependency***** parsing. | ||
| K17-3015 For this year's *****multilingual dependency***** parsing shared task, we developed a pipeline system, which uses a variety of features for each of its components. | ||
| hybrid | 23 | |
| 2020.iwpt-1.9 We present new experiments that transfer techniques from Probabilistic Context-free Grammars with Latent Annotations (PCFG-LA) to two grammar formalisms for discontinuous parsing: linear context-free rewriting systems and *****hybrid***** grammars. | ||
| L12-1231 In recent years, machine translation (MT) research has focused on investigating how hybrid machine translation as well as system combination approaches can be designed so that the resulting *****hybrid***** translations show an improvement over the individual component translations. | ||
| C18-1042 In this paper, we propose a *****hybrid***** technique for semantic question matching. | ||
| 2021.emnlp-main.494 We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly *****hybrid***** contexts. | ||
| D19-1135 Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks (RNNs) outperforms both individual architectures, while not much is known about why the *****hybrid***** models work. | ||
| task-oriented dialog | 23 | |
| D19-1010 Dialog policy decides what and how a *****task-oriented dialog***** system will respond, and plays a vital role in delivering effective conversations. | ||
| 2021.acl-short.83 Slot-filling is an essential component for building *****task-oriented dialog***** systems. | ||
| 2021.naacl-main.238 For *****task-oriented dialog***** systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. | ||
| 2020.acl-demos.39 Traditionally, industry solutions for building a *****task-oriented dialog***** system have relied on helping dialog authors define rule-based dialog managers, represented as dialog flows. | ||
| 2021.dialdoc-1.5 We apply the modular dialog system framework to combine open-domain question answering with a *****task-oriented dialog***** system. | ||
| grammatical error correction (GEC) | 23 | |
| 2020.nlptea-1.1 There are several problems in applying *****grammatical error correction (GEC)***** to a writing support system. | ||
| 2020.coling-main.199 Most recent works in the field of *****grammatical error correction (GEC)***** rely on neural machine translation-based models. | ||
| 2020.coling-main.573 We propose a reference-less metric trained on manual evaluations of system outputs for *****grammatical error correction (GEC)*****. | ||
| 2021.bea-1.8 Document-level context can provide valuable information in *****grammatical error correction (GEC)*****, which is crucial for correcting certain errors and resolving inconsistencies. | ||
| Q16-1013 The field of *****grammatical error correction (GEC)***** has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. | ||
| fillers | 22 | |
| L04-1245 Issues specific to IE evaluation include: how leniently to assess inexact identification of filler boundaries, the possibility of multiple ***** fillers ***** for a slot, and how the counting is performed. | ||
| D17-1068 We use a syntax-based DSM to build a prototypical representation of verb-specific roles: for every verb, we extract the most salient second order contexts for each of its roles (i.e. the most salient dimensions of typical role ***** fillers *****), and then we compute thematic fit as a weighted overlap between the top features of candidate ***** fillers ***** and role prototypes. | ||
| 2021.latechclfl-1.2 An analysis of the most frequent role ***** fillers ***** show that olfactory descriptions pertain to some typical domains such as religion, food, nature, ancient past, poor sanitation, all supporting the creation of a stereotypical imagery related to Italy. | ||
| L16-1461 Our model specifies verb polarity frames that capture the polarity effects on the ***** fillers ***** of the verb's arguments given a sentence with that verb frame. | ||
| W18-6005 In addition to shorter utterances, both parsers perform better on normalized transcriptions including basic markers of prosody and excluding disfluencies, discourse markers and ***** fillers ***** | ||
| narratives | 22 | |
| W17-1801 Findings provide insights into how readers process roles and emotion in ***** narratives *****. | ||
| W19-5928 In this paper, we perform a quantitative analysis of patients' ***** narratives ***** of their experience with heart failure and explore the different topics that patients talk about. | ||
| 2021.naacl-main.342 In the pursuit of natural language understanding, there has been a long standing interest in tracking state changes throughout ***** narratives *****. | ||
| D19-1509 Despite being considered a necessary component of AI-complete systems, few resources have been developed for evaluating counterfactual reasoning in ***** narratives *****. | ||
| 2020.lrec-1.87 Moreover, responsive utterances can express empathy to ***** narratives ***** and showing an appropriate degree of empathy to ***** narratives ***** is significant for enhancing speaker's motivation | ||
| multilingualism | 22 | |
| P19-2025 Code-Mixing, a progeny of ***** multilingualism ***** is a way in which multilingual people express themselves on social media by using linguistics units from different languages within a sentence or speech context. | ||
| 2020.coling-tutorials.3 In this tutorial, we will cover the latest advances in NMT approaches that leverage ***** multilingualism *****, especially to enhance low-resource translation. | ||
| 2021.acl-long.131 Our survey will be a step towards an outcome of mutual benefit for computational scientists and linguists with a shared interest in ***** multilingualism ***** and C-S. | ||
| L16-1260 Code-Switching (CS) between two languages is extremely common in communities with societal ***** multilingualism ***** where speakers switch between two or more languages when interacting with each other. | ||
| 1999.mtsummit-1.3 The 21st century will be the age of ***** multilingualism ***** and multiculturalism, when various languages and cultures are dynamically exchanged on a global scale | ||
| conjunctions | 22 | |
| W19-5353 We concentrate on the English conjunction “but” and its French equivalent “mais” which can be translated into two different German ***** conjunctions *****. | ||
| R19-1111 We evaluate specialised test sets focused on the translation of these two ***** conjunctions *****. | ||
| 2020.emnlp-main.661 Reasoning about conjuncts in conjunctive sentences is important for a deeper understanding of ***** conjunctions ***** in English and also how their usages and semantics differ from conjunctive and disjunctive boolean logic. | ||
| W19-2705 The findings also suggest a complex relationship between the relation types and syntactic categories of discourse markers (subordinating and coordinating ***** conjunctions *****). | ||
| L10-1220 We discuss the tagset design and motivate our classification of Afrikaans word forms, in particular we focus on the categorization of verbs and ***** conjunctions ***** | ||
| Github | 22 | |
| 2020.acl-main.553 The code will be released on ***** Github *****. | ||
| N19-1279 We make our data and code available on ***** Github *****. | ||
| C16-1110 The augmented versions of METEOR, using vector representations, are made available on our ***** Github ***** page. | ||
| 2020.smm4h-1.21 Code of our proposed framework is made available on ***** Github ***** | ||
| 2021.acl-demo.34 We not only released an online platform at the website but also make our evaluation tool an API with MIT Licence at ***** Github ***** and PyPi that allows users to conveniently assess their models offline | ||
| variants | 22 | |
| L08-1052 Our method maps the noun is-a hierarchy of WordNet to Wikipedia categories, identifies the NEs present in the latter and extracts different information from them such as written ***** variants *****, definitions, etc. | ||
| L16-1296 Our toolkit for automatic evaluation showcases quick and detailed comparison of MT system ***** variants ***** through automatic metrics and n-gram feedback, along with manual evaluation via edit-distance, error annotation and task-based feedback. | ||
| 2020.lrec-1.463 We trained and compared system ***** variants ***** on data prepared with the main casing methods available, namely translation of raw data without case normalisation, lowercasing with recasing, truecasing, case factors and inline casing. | ||
| 2021.emnlp-main.803 Specifically, we score the quality of a translation by conditioning on ***** variants ***** of the source that provide contrastive disambiguation cues. | ||
| 2020.coling-main.8 We further investigate effects of four attention ***** variants ***** in generating contextual semantic representations | ||
| limitations | 22 | |
| 2020.iwdp-1.5 Previous neural approaches achieve significant progress for Chinese word segmentation (CWS) as a sentence-level task, but it suffers from ***** limitations ***** in real-world scenarios. | ||
| W19-3006 Our results suggest that ***** limitations ***** imposed by heterogeneity inherent to ASD and from developmental change with age can be (at least partially) overcome using domain knowledge, such as understanding spoken language development from childhood through adulthood. | ||
| L12-1104 The most common sources of failure were ***** limitations ***** on inference, errors in coreference (particularly with nominal anaphors), and errors in named entity recognition. | ||
| W16-3718 The currently available tag set for Sinhala has two ***** limitations *****: the unavailability of tags to represent some word classes and the lack of tags to capture inflection based grammatical variations of words. | ||
| P18-1097 Most of the neural sequence-to-sequence (seq2seq) models for grammatical error correction (GEC) have two ***** limitations *****: (1) a seq2seq model may not be well generalized with only limited error-corrected data; (2) a seq2seq model may fail to completely correct a sentence with multiple errors through normal seq2seq inference | ||
| discriminators | 22 | |
| D19-1319 RevGAN utilizes the combination of three novel components, including self-attentive recursive autoencoders, conditional ***** discriminators *****, and personalized decoders. | ||
| 2020.emnlp-main.281 Instead of adopting the classic student-teacher learning of forcing the output of a student network to exactly mimic the soft targets produced by the teacher networks, we introduce two ***** discriminators ***** as in generative adversarial network (GAN) to transfer knowledge from two teachers to the student. | ||
| L10-1416 In this paper, we investigate the use of evolutionary algorithms to learn classifiers to discriminate between definitional and non-definitional sentences in non-technical texts, and show how effective grammar-based definition ***** discriminators ***** can be automatically learnt with minor human intervention. | ||
| D19-1313 In this paper, we propose a novel approach with two ***** discriminators ***** and multiple generators to generate a variety of different paraphrases. | ||
| L14-1131 High frequency qalqalah content words are also found to be statistically significant ***** discriminators ***** or keywords when comparing Meccan and Medinan chapters in the Qur'an using a state-of-the-art Visual Analytics toolkit: Semantic Pathways | ||
| LIWC | 22 | |
| W17-5225 Linguistic Inquiry and Word Count (***** LIWC *****) is a rich dictionary that map words into several psychological categories such as Affective, Social, Cognitive, Perceptual and Biological processes. | ||
| L12-1657 We extract different sets of features using external sources such as ***** LIWC ***** and SentiWordNet as well as using our own written scripts. | ||
| W19-3024 We use a convolutional neural network to incorporate ***** LIWC ***** information at the Reddit post level about topics discussed, first-person focus, emotional experience, grammatical choices, and thematic style. | ||
| D19-5512 In this paper, we compared the outcomes of (1) ***** LIWC *****, (2) machine learning, and (3) a human baseline. | ||
| W18-0617 A correlation analysis was performed to identify the relationship between ***** LIWC ***** variables and number of days prior to suicide | ||
| subgraph | 22 | |
| D19-1612 Our query-focused method constructs length and lexically constrained compressions in linear time, by growing a ***** subgraph ***** in the dependency parse of a sentence. | ||
| W17-2315 Our key contributions are: (1) an empirical validation of our hypothesis that an event is a ***** subgraph ***** of the AMR graph, (2) a neural network-based model that identifies such an event ***** subgraph ***** given an AMR, and (3) a distant supervision based approach to gather additional training data. | ||
| 2020.coling-main.369 In this work, we study several ***** subgraph ***** construction methods and compare their performance across the recommendation task. | ||
| D19-1242 After the ***** subgraph ***** is complete, another graph CNN is used to extract the answer from the ***** subgraph *****. | ||
| C18-1057 To improve the computation efficiency, we approximately perform graph convolution on a ***** subgraph ***** of adjacent entity mentions instead of those in the entire text | ||
| lexeme | 22 | |
| L12-1297 This fact allows us to predict that ***** lexeme *****s with the highest weight are the closest hypernyms of the defined ***** lexeme ***** in the dictionary. | ||
| L12-1211 In SA, documents normally are polarity classified by running them through classifiers trained on document vectors constructed from ***** lexeme ***** features, i.e., words. | ||
| 2020.coling-main.256 Our model exploits the basic linguistic intuition that the ***** lexeme ***** is the key lexical unit of meaning, while inflectional morphology provides additional syntactic information. | ||
| D19-1090 Note the ***** lexeme ***** itself, hablar, is relatively common. | ||
| L06-1116 Semantic tagging is based on an inventory of semantic features (descriptors) and a dictionary comprising about 3,000 entries, with a set of tags assigned to each ***** lexeme ***** and its argument slots | ||
| instances | 22 | |
| 2021.dash-1.7 We introduce Ziva, an interface for supporting domain knowledge from domain experts to data scientists in two ways: (1) a concept creation interface where domain experts extract important concepts of the domain and (2) five kinds of justification elicitation interfaces that solicit explanations of how the domain concepts are expressed in data ***** instances *****. | ||
| 2021.acl-long.220 Most current methods to ED rely heavily on training ***** instances *****, and almost ignore the correlation of event types. | ||
| L12-1045 We present a 3-step framework that learns categories and their ***** instances ***** from natural language text based on given training examples. | ||
| 2021.eacl-main.69 On both the E2E and Weather benchmarks, we show that this weakly supervised training paradigm is an effective approach under low resource scenarios with as little as 10 data ***** instances *****, and outperforming benchmark systems on both datasets when 100% of the training data is used. | ||
| P19-1481 For a new language, such training ***** instances ***** are hard to obtain making the QG problem even more challenging | ||
| correlated | 22 | |
| 2020.clssts-1.1 Discerning which linguistic parameters ***** correlated ***** with overall performance enabled the evaluation of progress when different languages were measured, and also was an important factor in determining the most effective CLIR pipeline design, customized to handle language-specific properties deemed necessary to address. | ||
| 2021.ranlp-1.54 In this paper, we demonstrate how these methods can be used to display ***** correlated ***** topic models on social media texts using SocialVisTUM, our proposed interactive visualization toolkit. | ||
| P18-2083 We evaluate our method by a quantitative experiment and a human study, showing the ***** correlated ***** topic modeling on phrases is a good and practical way to interpret the underlying themes of a corpus. | ||
| 2016.iwslt-1.8 We experiment with three different scenarios using, i) French, as a source language un***** correlated ***** to the target language, ii) Ukrainian, as a source language ***** correlated ***** to the target one and finally iii) English as a source language un***** correlated ***** to the target language using a relatively large amount of data in respect to the other two scenarios. | ||
| D18-1468 Statistical phylogenetic models have allowed the quantitative analysis of the evolution of a single categorical feature and a pair of binary features, but *****correlated***** evolution involving multiple discrete features is yet to be explored. | ||
| projective | 22 | |
| L14-1378 Our proposed algorithm is able to create both ***** projective ***** and non-***** projective ***** dependency trees. | ||
| P18-1248 We thus obtain the first implementation of global decoding for non-***** projective ***** transition-based parsing, and demonstrate empirically that it is more effective than its ***** projective ***** counterpart in parsing a number of highly non-***** projective ***** languages. | ||
| L10-1269 We first describe the automatic conversion of the French Treebank (Abeillë and Barrier, 2004), a constituency treebank, into typed ***** projective ***** dependency trees. | ||
| 2021.eacl-main.254 The method assumes minimal resources and provides maximal flexibility by (a) accepting any pre-trained arc-factored dependency parser; (b) assuming no access to source language data; (c) supporting both ***** projective ***** and non-***** projective ***** parsing; and (d) supporting multi-source transfer. | ||
| 2020.coling-main.225 In this paper we present a parsing model for ***** projective ***** dependency trees which takes advantage of the existence of complementary dependency annotations which is the case in Arabic, with the availability of CATiB and UD treebanks | ||
| evaluate | 22 | |
| C16-1141 In Natural Language Generation (NLG), one important limitation is the lack of common benchmarks on which to train, ***** evaluate ***** and compare data-to-text generators. | ||
| W17-0911 Our experiments ***** evaluate ***** different methods for generating these negative examples, as well as different embedding-based representations of the stories. | ||
| D19-1179 Despite recent attempts on computational modeling of the variation, the lack of parallel corpora of style language makes it difficult to systematically control the stylistic change as well as ***** evaluate ***** such models. | ||
| 2021.dash-1.13 CrossCheck enables users to make informed decisions when choosing between multiple models, identify when the models are correct and for which examples, investigate whether the models are making the same mistakes as humans, ***** evaluate ***** models' generalizability and highlight models' limitations, strengths and weaknesses. | ||
| L08-1379 and Who does ***** evaluate ***** | ||
| persuasion | 22 | |
| 2020.emnlp-main.716 Existing work in Natural Language Processing (NLP) has shown that linguistic features extracted from the debate text and features encoding the characteristics of the audience are both critical in ***** persuasion ***** studies. | ||
| 2021.emnlp-main.725 Our analyses suggest that the effect of social pressure is comparable to the difference between persuasive and non-persuasive language strategies in driving ***** persuasion ***** and that social pressure might be a causal factor for ***** persuasion *****. | ||
| 2020.emnlp-main.605 Grounded in the politeness theory of Brown and Levinson (1978), we propose a generalized framework for modeling face acts in ***** persuasion ***** conversations, resulting in a reliable coding manual, an annotated corpus, and computational models. | ||
| 2021.eacl-main.110 Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as ***** persuasion *****. | ||
| P19-1566 We designed an online ***** persuasion ***** task where one participant was asked to persuade the other to donate to a specific charity | ||
| imbalance | 22 | |
| 2020.findings-emnlp.202 However, data ***** imbalance ***** makes it difficult to train models accurately. | ||
| 2021.mtsummit-research.9 In cases of data ***** imbalance ***** in terms of translation direction, we find that tagging of translation direction can close the performance gap. | ||
| W18-2305 We propose the use of two semi-supervised machine learning approaches: To mitigate difficulties arising from heterogeneous data sources, overcome data ***** imbalance ***** and create reliable training data we propose using transductive learning from positive and unlabelled data (PU Learning). | ||
| 2020.bucc-1.7 Initially, when searching for parallel sentences between two comparable documents, all the possible sentence pairs between the documents have to be considered, which introduces a great degree of ***** imbalance ***** between parallel pairs and non-parallel pairs. | ||
| 2020.findings-emnlp.6 In this paper, we focus on the *****imbalance***** issue, which is rarely studied in aspect term extraction and aspect sentiment classification when regarding them as sequence labeling tasks. | ||
| filtering | 22 | |
| S19-2158 We also highlight the effectiveness of our ***** filtering ***** strategy for training the neural network on a large but noisy training set. | ||
| 2021.emnlp-main.455 The efficiency and efficacy of GradTS in these case studies illustrate its general applicability in MTL research without requiring manual task ***** filtering ***** or costly parameter tuning. | ||
| C18-1208 The method used for extracting the synonym classes is a semi-automatic process with a substantial amount of manual work during ***** filtering *****, role assignment to classes and individual Class members' arguments, and linking to the external lexical resources. | ||
| 2020.wmt-1.111 The final result shows that, in both ***** filtering ***** and alignment tasks, our system significantly outperforms the LASER-based system. | ||
| W18-6421 Through our experiments, we identified two keys for improving accuracy: ***** filtering ***** noisy training sentences and right-to-left re-ranking | ||
| indicative | 22 | |
| D18-1126 Our starting point is a state-of-the-art attention-based model from prior work; while this model's attention typically identifies context that is topically relevant, it fails to identify some of the most ***** indicative ***** surface strings, especially those exhibiting lexical overlap with the true title. | ||
| P19-1096 However, it suffers from two shortcomings: 1) the emotion must be annotated before cause extraction in ECE, which greatly limits its applications in real-world scenarios; 2) the way to first annotate emotion and then extract the cause ignores the fact that they are mutually ***** indicative *****. | ||
| W19-2710 Thus, this paper adapts the signal identification and anchoring scheme (Liu and Zeldes, 2019) to three more genres, examines the distribution of signaling devices across relations and genres, and provides a taxonomy of ***** indicative ***** signals found in this dataset. | ||
| L08-1067 Since this is the first work of its kind, the method described in this paper should be seen as only a preliminary method, ***** indicative ***** of how better methods can be developed. | ||
| I17-1075 Our model is language independent, and despite minimal feature engineering, it is interpretable and capable of learning location ***** indicative ***** words and timing patterns | ||
| generator | 22 | |
| 2020.acl-main.227 Specifically, we propose a novel guider network to focus on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for ***** generator ***** optimization. | ||
| 2020.aacl-main.29 To resolve the cold start problem in training, we propose a method using a pseudo data ***** generator ***** which generates pseudo texts and KB triples for learning an initial model. | ||
| D18-1077 To improve basic GANs, we apply feature matching loss in the discriminator, use domain-category analysis as an additional task in the discriminator, and remove the biases in the ***** generator *****. | ||
| D18-1432 We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response ***** generator ***** using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity. | ||
| D19-1554 Due to the discrete generation step in the ***** generator *****, we use policy gradient, a reinforcement learning approach, to train the two modules | ||
| imageability | 22 | |
| L16-1413 We applied this algorithm to abstractness, arousal, ***** imageability ***** and valence. | ||
| L14-1031 These two methods (along with a naive hybrid approach combining the two) have been shown to significantly outperform a state-of-the-art resource expansion system at our pilot task of ***** imageability ***** expansion. | ||
| S18-2004 Moreover, our NWS scores positively correlate with psycholinguistic measures such as concreteness, and ***** imageability ***** implying a close connection to the salience as perceived by humans. | ||
| L16-1571 This study primarily aims to build a Turkish psycholinguistic database including three variables: word frequency, age of acquisition (AoA), and ***** imageability *****, where AoA and ***** imageability ***** information are limited to nouns. | ||
| 2021.cmcl-1.1 While word-embeddings do not explicitly incorporate the concreteness of words into their computations, they have been shown to accurately predict human judgments of concreteness and ***** imageability ***** | ||
| unanswerable | 22 | |
| L14-1397 The question is, in fact, remarkably hard to answer, and many linguists consider it ***** unanswerable *****. | ||
| P19-1415 In this work, we propose a data augmentation technique by automatically generating relevant ***** unanswerable ***** questions according to an answerable question paired with its corresponding paragraph that contains the answer. | ||
| 2020.aacl-srw.21 Based on classifying examples as answerable or ***** unanswerable ***** by BERT without the given question, we propose a method based on BERT that splits the training examples from the MRC dataset SQuAD1.1 into those that are “easy to answer” or “hard to answer”. | ||
| D19-1204 Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing of ***** unanswerable ***** questions | ||
| 2021.acl-long.304 Many Question-Answering (QA) datasets contain *****unanswerable***** questions, but their treatment in QA systems remains primitive. | ||
| Quaero | 22 | |
| L10-1046 The ***** Quaero ***** project organized a set of evaluations of Named Entity recognition systems in 2009. | ||
| 2013.iwslt-papers.6 Through progressive advances and system combination we reach a word error rate (WER) of 16.5% on the 2012 ***** Quaero ***** evaluation data. | ||
| 2011.iwslt-papers.2 This paper describes our current Spanish speech-to-text (STT) system with which we participated in the 2011 ***** Quaero ***** STT evaluation that is being developed within the ***** Quaero ***** program. | ||
| L12-1479 The ***** Quaero ***** program has organized a set of evaluations for terminology extraction systems in 2010 and 2011 | ||
| 2011.iwslt-evaluation.15 The *****Quaero***** program is an international project promoting research and industrial innovation on technologies for automatic analysis and classification of multimedia and multilingual documents. | ||
| adverbial | 22 | |
| 1963.earlymt-1.21 A floating structure such as a prepositional phrase or ***** adverbial ***** phrase or clause, whose dependency is not determined in the analyzer, is represented as depending upon the nearest preceding structure modifiable by such a floating structure. | ||
| L12-1564 We present an extension of the ***** adverbial ***** entries of the French morphological lexicon DELA (Dictionnaires Electroniques du LADL / LADL electronic dictionaries). | ||
| 2021.latechclfl-1.10 This paper addresses this problem with an unsupervised, rule-based approach for ***** adverbial ***** identification that utilizes dependency tree patterns | ||
| C18-1249 We extend the coverage of an existing grammar customization system to clausal modifiers, also referred to as *****adverbial***** clauses. | ||
| 2020.coling-main.156 In formal semantics, there are two well-developed semantic frameworks: event semantics, which treats verbs and *****adverbial***** modifiers using the notion of event, and degree semantics, which analyzes adjectives and comparatives using the notion of degree. | ||
| gaze | 22 | |
| L10-1407 characteristic features of how it is produced, among which main direction, amplitude, velocity and number of repetitions; 2. cues in other modalities, like direction and duration of ***** gaze *****; 3. | ||
| D19-6408 We show that using globally-aggregated measures that capture the central tendency or variability of ***** gaze ***** data is more beneficial than proposed local views which retain individual participant information. | ||
| 2020.lrec-1.95 Detecting and interpreting the temporal patterns of ***** gaze ***** behavior cues is natural for humans and also mostly an unconscious process. | ||
| R17-1078 We report comparisons between a part-of-speech (POS) and frequency baseline to: i) a prediction model based solely on ***** gaze ***** data and ii) a combined model of ***** gaze ***** data, POS and frequency | ||
| 2020.aacl-main.86 The *****gaze***** behaviour of a reader is helpful in solving several NLP tasks such as automatic essay grading. | ||
| Systematic | 22 | |
| L08-1057 ***** Systematic ***** comparison of verbal systems is conducted by analyzing morpho-syntactic encodings. | ||
| P18-1222 ***** Systematic ***** comparisons are conducted between hyperdoc2vec and several competitors on two tasks, i.e., paper classification and citation recommendation, in the academic paper domain | ||
| 2021.emnlp-main.505 *****Systematic***** compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. | ||
| W19-5012 *****Systematic***** reviews are important in evidence based medicine, but are expensive to produce. | ||
| 2020.aacl-main.49 *****Systematic***** Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations that are distinct but semantically similar to its training data. | ||
| Basque | 22 | |
| W17-5209 We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-based Iberian system with rules shared across five official languages: ***** Basque *****, Catalan, Galician, Portuguese and Spanish. | ||
| 2020.wmt-1.96 Regarding the techniques used, we base on the findings from our previous works for translating clinical texts into ***** Basque *****, making use of clinical terminology for adapting the MT systems to the clinical domain. | ||
| L16-1001 The corpus is available in eight different languages: ***** Basque *****, Bulgarian, Czech, Dutch, English, German, Portuguese and Spanish. | ||
| L12-1264 The database features 6 target languages: ***** Basque *****, Catalan, English, Galician, Portuguese and Spanish, and includes segments in other (Out-Of-Set) languages, which allow to perform open-set verification tests. | ||
| L10-1271 This evaluation, designed according to the criteria and methodology applied in the NIST Language Recognition Evaluations, involved four target languages: ***** Basque *****, Catalan, Galician and Spanish (official languages in Spain), and included speech signals in other (unknown) languages to allow open-set verification trials | ||
| consistency | 22 | |
| 2020.emnlp-main.10 The second dataset eQASC-perturbed is constructed by crowd-sourcing perturbations (while preserving their validity) of a subset of explanations in QASC, to test ***** consistency ***** and generalization of explanation prediction models. | ||
| 2020.acl-main.429 We find that after fine-tuning BERT and RoBERTa on a negation scope task, the average attention head improves its sensitivity to negation and its attention ***** consistency ***** across negation datasets compared to the pre-trained models. | ||
| 2020.emnlp-main.539 Further evaluations on downstream tasks demonstrate that the profile ***** consistency ***** identification model is conducive for improving dialogue ***** consistency *****. | ||
| 2021.acl-long.374 In addition, ***** consistency ***** constraints between golden and predicted clusters of event mentions have not been considered to improve representation learning in prior deep learning models for ECR. | ||
| 2021.newsum-1.7 We obtain promising results regarding the fluency, ***** consistency ***** and relevance of the summaries produced | ||
| explanations | 22 | |
| 2020.findings-emnlp.390 Our contributions are as follows: (1) We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL ***** explanations *****, which measures how well ***** explanations ***** help an observer predict a model's output, while controlling for how ***** explanations ***** can directly leak the output. | ||
| W19-4807 Specifically, we present experiments showing that, for CNN-based text classification, ***** explanations ***** generated using “supervised attention” are judged superior to ***** explanations ***** generated using normal unsupervised attention. | ||
| 2021.eacl-main.202 Specifically, we investigate which of the NLG evaluation measures map well to ***** explanations *****. | ||
| 2021.emnlp-main.28 We unpack this apparent discrepancy using machine ***** explanations ***** and find that CAD reduces model reliance on spurious features | ||
| 2021.emnlp-main.373 These results provide new intuitive ***** explanations ***** of existing reports; for example, discarding the learned attention patterns tends not to adversely affect the performance. | ||
| scholarly | 22 | |
| L12-1474 Navigation in large ***** scholarly ***** paper collections is tedious and not well supported in most scientific digital libraries. | ||
| P17-1102 The large and growing amounts of online ***** scholarly ***** data present both challenges and opportunities to enhance knowledge discovery. | ||
| S17-2171 Over 50 million ***** scholarly ***** articles have been published: they constitute a unique repository of knowledge. | ||
| 2021.gwc-1.18 Dictionary-based methods in sentiment analysis have received ***** scholarly ***** attention recently, the most comprehensive examples of which can be found in English. | ||
| C18-2006 First, we filter ***** scholarly ***** tweets from tracking a tweet stream | ||
| reddit | 22 | |
| W18-6103 In this paper, we introduce the first geolocation inference approach for ***** reddit *****, a social media platform where user pseudonymity has thus far made supervised demographic inference difficult to implement and validate. | ||
| 2020.figlang-1.10 We compare the performance based on the approaches and obtained the best F1 scores as 0.722, 0.679 for the twitter forums and ***** reddit ***** forums respectively. | ||
| W19-3403 We develop an unsupervised pipeline to extract schemas and apply our method to Reddit posts to detect schematic structures that are characteristic of different sub***** reddit *****s. | ||
| 2021.acl-short.133 Among social media platforms, Reddit has emerged as the most promising one due to its anonymity and its focus on topic-based communities (sub***** reddit *****s) that can be indicative of someone's state of mind or interest regarding mental health disorders such as r/SuicideWatch, r/Anxiety, r/depression. | ||
| W19-3507 We introduce a novel partitioning approach for characterizing user polarization based on their distribution of participation across interest sub***** reddit *****s. | ||
| dependency graphs | 22 | |
| K18-2004 The novelty of our approach is the use of an additional loss function, which reduces the number of cycles in the predicted ***** dependency graphs *****, and the use of self-training to increase the system performance. | ||
| 2020.lincr-1.6 The planned release of LPPC dataset will include raw text annotated with ***** dependency graphs ***** in the Universal Dependencies standard, a near-natural-sounding synthetic spoken subset as well as EEG recordings. | ||
| 2020.emnlp-main.451 We propose gating mechanisms to dynamically combine information from word ***** dependency graphs ***** and latent graphs which are learned by self-attention networks. | ||
| Q15-1040 We also explore a generalization of our parsing framework to ***** dependency graphs ***** with pagenumber at most k and show that the resulting optimization problem is NP-hard for k ≥ 2. | ||
| P17-1193 We consider two restrictions to deep ***** dependency graphs *****: (a) 1-endpoint-crossing and (b) pagenumber-2 | ||
| definition | 22 | |
| 2020.semeval-1.41 Research on ***** definition ***** extraction has been conducted for well over a decade, largely with significant constraints on the type of ***** definition *****s considered. | ||
| 2020.semeval-1.58 Our systems respectively achieve 0.830 and 0.994 F1-scores on the official test set, and we believe that the insights derived from our study are potentially relevant to help advance the research on ***** definition ***** extraction. | ||
| 2020.sdp-1.22 Despite prior work on ***** definition ***** detection, current approaches are far from being accurate enough to use in realworld applications. | ||
| 2020.semeval-1.93 We explore the performance of Bidirectional Encoder Representations from Transformers (BERT) at ***** definition ***** extraction. | ||
| P18-2043 In this work, we study the problem of word ambiguities in ***** definition ***** modeling and propose a possible solution by employing latent variable modeling and soft attention mechanisms | ||
| compound | 22 | |
| 2021.starsem-1.24 Our prediction experiments complement insights from classification using (a) manually designed features to characterise termhood and ***** compound ***** formation and (b) ***** compound ***** and constituent word embeddings. | ||
| 2020.lrec-1.228 Some words can contain hyphens, because they were split at the end of a line or are ***** compound ***** words with a mandatory hyphen. | ||
| L16-1365 Focusing on ***** compound ***** nouns (CN), we then verify in a longitudinal study if there are differences in the distribution and compositionality of CNs in child-directed and child-produced sentences across ages. | ||
| L08-1535 Founded on the fact that words of some goshu classes are more likely to combine into ***** compound ***** words than words of other classes, we employ a statistical model based on CRFs using goshu information. | ||
| 2020.findings-emnlp.191 Using MedICaT, we introduce the task of subfigure to subcaption alignment in ***** compound ***** figures and demonstrate the utility of inline references in image-text matching | ||
| headlines | 22 | |
| S17-2148 This paper describes our system for fine-grained sentiment scoring of news ***** headlines ***** submitted to SemEval 2017 task 5–subtask 2. | ||
| W18-4305 The evaluation result shows that 26% ***** headlines ***** do not include health claims, and all extractors face difficulty separating them from the rest. | ||
| 2020.aespen-1.7 The multi-task convolutional neural network is shown to be capable of recognizing events and event coreferences given the ***** headlines *****' texts and publication dates. | ||
| S17-2140 For news ***** headlines ***** track, an ensemble of regressors was used to predict sentiment score. | ||
| N19-1012 Finally, we develop baseline classifiers that can predict whether or not an edited headline is funny, which is a first step toward automatically generating humorous ***** headlines ***** as an approach to creating topical humor. | ||
| arguments | 22 | |
| 2020.udw-1.4 We use Universal Dependencies treebanks to test whether a well-known typological trade-off between word order freedom and richness of morphological marking of core ***** arguments ***** holds within individual languages. | ||
| L10-1117 Conversely, we show that increasing the size of the input corpus and modifying the extraction procedure to better differentiate prepositional ***** arguments ***** from prepositional modifiers improves performance. | ||
| D17-1142 We observe that the evidence-conclusion discourse relations, also known as ***** arguments *****, often appear in product reviews, and we hypothesise that some argument-based features, e.g. | ||
| L10-1634 Of the two ***** arguments ***** of connectives, called Arg1 and Arg2, we focus on Arg1, which has proven more challenging to identify. | ||
| C16-1258 When processing ***** arguments ***** in online user interactive discourse, it is often necessary to determine their bases of support. | ||
| spelling errors | 22 | |
| P17-2086 In this paper, we explore ***** spelling errors ***** as a source of information for detecting the native language of a writer, a previously under-explored area. | ||
| W19-4407 To address this, we present and release an annotated data set of 6,121 ***** spelling errors ***** in context, based on a corpus of essays written by English language learners. | ||
| 2020.lrec-1.857 As a byproduct of our study, we create two new datasets comprised of ***** spelling errors ***** generated by children from hand-written essays and web search inquiries, which we make available to the research community. | ||
| 2019.icon-1.1 We further attempt to establish that using such a word representation as input makes the model robust to unseen words, particularly arising due to tokenization and ***** spelling errors *****, which is a common problem in systems where a typing interface is one of the input modalities. | ||
| L16-1060 Experimental results showed that we can regard typing-game logs as a source of ***** spelling errors *****. | ||
| historical linguistics | 22 | |
| 2021.eval4nlp-1.11 in ***** historical linguistics ***** and digital humanities, is challenging due to a lack of statistical power. | ||
| W19-4713 Traditional ***** historical linguistics ***** lacks the possibility to empirically assess its assumptions regarding the phonetic systems of past languages and language stages since most current methods rely on comparative tools to gain insights into phonetic features of sounds in proto- or ancestor languages. | ||
| R19-1035 We thus also present the first very Tibetan Treebank in a variety of formats to facilitate research in the fields of NLP, ***** historical linguistics ***** and Tibetan Studies. | ||
| 2020.lrec-1.859 Modelling language change is an increasingly important area of interest within the fields of sociolinguistics and ***** historical linguistics *****. | ||
| 2021.lchange-1.9 Semantic divergence in related languages is a key concern of ***** historical linguistics *****. | ||
| shared task system | 22 | |
| W19-4414 Combining the output of top BEA 2019 ***** shared task system *****s using our approach, currently holds the highest reported score in the open phase of the BEA 2019 shared task, improving F-0.5 score by 3.7 points over the best result reported. | ||
| D18-1515 This baseline model, which is fast to train and uses only language-independent features, outperforms the best ***** shared task system *****s on the task of retrieving relevant previously asked questions. | ||
| W19-6109 Although the performance is rather good (better than both the best ***** shared task system ***** and the average of the best per-language results), further work is needed to improve the generalization power, especially on unseen MWEs. | ||
| 2021.smm4h-1.11 This ***** shared task system ***** description depicts two neural network architectures submitted to the ProfNER track, among them the winning system that scored highest in the two sub-tasks 7a and 7b. | ||
| I17-1013 We report results on data sets provided during the WMT-2016 shared task on automatic post-editing and can demonstrate that dual-attention models that incorporate all available data in the APE scenario in a single model improve on the best ***** shared task system ***** and on all other published results after the shared task. | ||
| service | 22 | |
| L16-1569 To meet these requirements, we have adopted a highly modular micro***** service *****-based architecture. | ||
| L14-1706 It is based on the ***** service *****-oriented architecture (SOA), a more recent, web-oriented version of the pipeline architecture that has long been used in NLP for sequencing loosely-coupled linguistic analyses. | ||
| C18-2020 Therefore, we offer an automatic simultaneous interpretation ***** service ***** for students. | ||
| L10-1039 We propose a language resource management system, called WordNet Management System (WNMS), as a distributed management system that allows the server to perform the cross language WordNet retrieval, including the fundamental web ***** service ***** applications for editing, visualizing and language processing. | ||
| L12-1581 Another outcome of the experiment was the preliminary evaluation of the pronunciation learning ***** service ***** in terms of user satisfaction, which would be difficult to conduct before integrating the HCI part. | ||
| describes | 22 | |
| S18-1073 This paper ***** describes ***** our approach to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. | ||
| W19-8664 This paper ***** describes ***** our submission to the TL;DR challenge. | ||
| W18-6230 This paper ***** describes ***** an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural networks. | ||
| W19-5042 This paper ***** describes ***** our competing system to enter the MEDIQA-2019 competition. | ||
| 2020.semeval-1.107 This paper ***** describes ***** our contribution to SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. | ||
| conversational agent | 22 | |
| 2020.challengehml-1.7 To this end, understanding passenger intents from spoken interactions and vehicle vision systems is an important building block for developing contextual and visually grounded ***** conversational agent *****s for AV. | ||
| 2021.wanlp-1.17 The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as ***** conversational agent *****s. | ||
| 2020.acl-tutorials.3 We expect that the tutorial will be of interest to researchers in dialogue systems, computational semantics and cognitive modeling, and hope that it will catalyze research and system building that more directly explores the creative, strategic ways ***** conversational agent *****s might be able to seek and offer evidence about their understanding of their interlocutors. | ||
| L08-1159 The modelling of realistic emotional behaviour is needed for various applications in multimodal human-machine interaction such as the design of emotional ***** conversational agent *****s (Martin et al., 2005) or of emotional detection systems (Devillers and Vidrascu, 2007). | ||
| W19-5915 We present Graph2Bots, a tool for assisting *****conversational agent***** designers. | ||
| contextual emotion | 22 | |
| S19-2031 We have developed a Snapshot Ensemble of 1D Hierarchical Convolutional Neural Networks to extract features from 3-turn conversations in order to perform ***** contextual emotion ***** detection in text. | ||
| S19-2042 Our model was evaluated on the data provided by the SemEval-2019 shared task on ***** contextual emotion ***** detection in text. | ||
| S19-2036 This paper describes our transfer learning-based approach to ***** contextual emotion ***** detection as part of SemEval-2019 Task 3. | ||
| S19-2061 This paper presents our ***** contextual emotion ***** detection system in approaching the SemEval2019 shared task 3: EmoContext: Contextual Emotion Detection in Text. | ||
| R19-1091 This paper describes a new approach for the task of ***** contextual emotion ***** detection. | ||
| meeting | 22 | |
| P19-1038 Nowadays, firm CEOs communicate information not only verbally through press releases and financial reports, but also nonverbally through investor ***** meeting *****s and earnings conference calls. | ||
| L08-1187 We are working with large quantities of dialogue speech including business ***** meeting *****s, friendly discourse, and telephone conversations, and have produced web-based tools for the visualisation of non-verbal and paralinguistic features of the speech data. | ||
| L06-1127 This article describes an interface for searching and browsing multimodal recordings of group ***** meeting *****s. | ||
| L06-1309 A set of tools has been developed specifically for these purposes which can be used as a data collection platform for the development of ***** meeting ***** browsers. | ||
| D18-2017 We use multi-party ***** meeting ***** opinion mining based on bipartite graphs to extract opinions and calculate mutual influential factors, using the Lunar Survival Task as a study case. | ||
| open domain question | 22 | |
| W16-4404 There are some ***** open domain question ***** answering systems, such as IBM Waston, which take the unstructured text data as input, in some ways of humanlike thinking process and a mode of artificial intelligence. | ||
| 2021.emnlp-main.421 Many NLG tasks such as summarization, dialogue response, or ***** open domain question ***** answering, focus primarily on a source text in order to generate a target response. | ||
| W18-2608 We describe our experiences in using an ***** open domain question ***** answering model (Chen et al., 2017) to evaluate an out-of-domain QA task of assisting in analyzing privacy policies of companies. | ||
| P19-1612 Recent work on ***** open domain question ***** answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. | ||
| 2021.acl-long.477 Dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as ***** open domain question ***** answering. | ||
| architecture | 22 | |
| 2021.ecnlp-1.18 Through various experiments, we show that this ***** architecture ***** outperforms a typical slot detector approach, with a gain of +81% in accuracy and +41% in F1 score. | ||
| 2020.emnlp-main.748 We hope that these ***** architecture *****s and experiments may serve as strong points of comparison for future work. | ||
| 2020.clinicalnlp-1.15 We pre-trained several models of common ***** architecture *****s on this dataset and empirically showed that such pre-training leads to improved performance and convergence speed when fine-tuning on downstream medical tasks. | ||
| 2020.wanlp-1.4 We propose a novel ***** architecture ***** for labelling character sequences that achieves state-of-the-art results on the Tashkeela Arabic diacritization benchmark. | ||
| P17-1190 We also introduce a new recurrent ***** architecture *****, the Gated Recurrent Averaging Network, that is inspired by averaging and LSTMs while outperforming them both. | ||
| answers | 22 | |
| W17-5001 We also find that difficulty is mirrored in the amount of variation in student ***** answers *****, which can be computed before grading. | ||
| N19-2018 A capable, automatic Question Answering (QA) system can provide more complete and accurate ***** answers ***** using a comprehensive knowledge base (KB). | ||
| 2021.acl-short.33 A recent study showed that manual summarization of consumer health questions brings significant improvement in retrieving relevant ***** answers *****. | ||
| 2021.semeval-1.105 Furthermore, to incorporate our known knowledge about abstract concepts, we retrieve the definitions of candidate ***** answers ***** from WordNet and feed them to the model as extra inputs. | ||
| 2020.acl-main.74 A given user query can be matched against the questions and/or the ***** answers ***** in the FAQ. | ||
| identifying | 22 | |
| 2020.nlpcss-1.9 While this task has been closely associated with emotion prediction, we argue and show that ***** identifying ***** worry needs to be addressed as a separate task given the unique challenges associated with it. | ||
| N18-4018 While some work has been done on code-mixed social media text and in emotion prediction separately, our work is the first attempt which aims at ***** identifying ***** the emotion associated with Hindi-English code-mixed social media text. | ||
| 2020.findings-emnlp.121 Temporal relation classification is the pair-wise task for ***** identifying ***** the relation of a temporal link (TLINKs) between two mentions, i.e. | ||
| 2020.lrec-1.550 Our focus is directed at the de-identification of emails where personally ***** identifying ***** information does not only refer to the sender but also to those people, locations, dates, and other identifiers mentioned in greetings, boilerplates and the content-carrying body of emails. | ||
| L14-1334 In this paper, we consider the importance of ***** identifying ***** the change of state for events - in particular, clinical events that measure and compare the multiple states of a patients health across time. | ||
| neural sequence labeling | 22 | |
| D19-1422 For ***** neural sequence labeling *****, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. | ||
| P18-4013 NCRF++ is designed for quick implementation of different ***** neural sequence labeling ***** models with a CRF inference layer. | ||
| 2020.lrec-1.559 We here present details on the annotation effort, guidelines, inter-annotator agreement and an experimental analysis of the corpus using a ***** neural sequence labeling ***** architecture. | ||
| W18-6112 We report an empirical evaluation of ***** neural sequence labeling ***** models with character embedding to tackle NER task in Indonesian conversational texts. | ||
| W17-5004 We investigate the utility of different auxiliary objectives and training strategies within a *****neural sequence labeling***** approach to error detection in learner writing. | ||
| proposed | 22 | |
| 2021.naacl-main.15 Over the years, many different filtering approaches have been ***** proposed *****. | ||
| 2020.coling-main.278 Our ***** proposed ***** LaAP-Net outperforms existing approaches on three benchmark datasets for the text VQA task by a noticeable margin. | ||
| D19-1566 Experimental results suggest the efficacy of the ***** proposed ***** model for both sentiment and emotion analysis over various existing state-of-the-art systems. | ||
| D19-5809 To properly generate a question coherent to the grounding text and the current conversation history, the ***** proposed ***** framework first locates the focus of a question in the text passage, and then identifies the question pattern that leads the sequential generation of the words in a question. | ||
| 2020.sltu-1.7 Overall, we show that the ***** proposed ***** multilingual graphemic hybrid ASR with various data augmentation can not only recognize any within training set languages, but also provide large ASR performance improvements. | ||
| supervised machine | 22 | |
| C18-1144 Annotated corpora enable ***** supervised machine ***** learning and data analysis. | ||
| W19-2307 Latent space based GAN methods and attention based sequence to sequence models have achieved impressive results in text generation and un***** supervised machine ***** translation respectively. | ||
| 2021.emnlp-main.2 Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or improving the performance on ***** supervised machine ***** translation with BERT. | ||
| 2020.wmt-1.127 We present our submission to the very low resource ***** supervised machine ***** translation task at WMT20. | ||
| W19-5028 However, recent work has demonstrated the potential of ***** supervised machine ***** learning to extract document-level codes directly from the raw text of clinical notes. | ||
| user generated | 22 | |
| 2021.eacl-main.141 We address the problem of unsupervised abstractive summarization of collections of ***** user generated ***** reviews through self-supervision and control. | ||
| W17-4417 In this paper, we describe the Lithium Natural Language Processing (NLP) system - a resource-constrained, high-throughput and language-agnostic system for information extraction from noisy ***** user generated ***** text on social media. | ||
| E17-3007 In the recent years, the amount of ***** user generated ***** contents shared on the Web has significantly increased, especially in social media environment, e.g. | ||
| L14-1574 In order for this ***** user generated ***** content data to be released publicly to the research community some issues first need to be resolved. | ||
| 2012.amta-papers.24 This paper investigates the usefulness of automatic machine translation metrics when analyzing the impact of source reformulations on the quality of machine-translated ***** user generated ***** content. | ||
| related | 22 | |
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the properties ***** related ***** to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| C16-1095 In the absence of large annotated corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely ***** related ***** languages, and utilizing human linguist judgments. | ||
| N18-1086 We find that our model formulation of latent dependencies with exact marginalization do not lead to better intrinsic language modeling performance than vanilla RNNs, and that parsing accuracy is not cor***** related ***** with language modeling perplexity in stack-based models. | ||
| 2020.coling-main.16 Experiments show that our framework using sentiment-***** related ***** discourse augmentations for sentiment prediction enhances the overall performance for long documents, even beyond previous approaches using well-established discourse parsers trained on human annotated data. | ||
| 2019.iwslt-1.26 We study here a ***** related ***** setting, multi-domain adaptation, where the number of domains is potentially large and adapting separately to each domain would waste training resources. | ||
| active | 22 | |
| 2020.sigdial-1.29 A total of 20 papers from the last two years are surveyed to analyze three types of evaluation protocols: automated, static, and inter***** active *****. | ||
| 2021.eacl-main.229 The framework enables the use of all input reviews by first condensing them into multiple dense vectors which serve as input to an abstr***** active ***** model. | ||
| P19-1514 In this work, we propose a novel approach to support value extraction scaling up to thousands of attributes without losing performance: (1) We propose to regard attribute as a query and adopt only one global set of BIO tags for any attributes to reduce the burden of attribute tag or model explosion; (2) We explicitly model the semantic representations for attribute and title, and develop an attention mechanism to capture the inter***** active ***** semantic relations in-between to enforce our framework to be attribute comprehensive. | ||
| 2020.coling-main.13 Besides, to inter***** active *****ly extract the inter-aspect relations for the specific aspect, an inter-aspect GCN is adopted to model the representations learned by aspect-focused GCN based on the inter-aspect graph which is constructed by the relative dependencies between the aspect words and other aspects. | ||
| W16-3808 As regards the position of an argument in the dependency structure with respect to its predicate, there exist three types of valency filling: *****active***** (canonical), passive, and discontinuous. | ||
| native | 22 | |
| 2020.emnlp-main.162 Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alter***** native *****s on multiple pure-language tasks such as GLUE, SQuAD, and SWAG. | ||
| 2006.amta-papers.28 Discrimi***** native ***** training methods have recently led to significant advances in the state of the art of machine translation (MT). | ||
| 2020.emnlp-main.352 To address this issue, we propose an alter***** native ***** to the end-to-end classification on vocabulary. | ||
| 2013.iwslt-evaluation.24 Furthermore, we investigated different reordering models as well as an extended discrimi***** native ***** word lexicon. | ||
| D17-1286 We present experiments that show the influence of *****native***** language on lexical choice when producing text in another language, in this particular case English. | ||
| hierarchical reinforcement learning | 22 | |
| D18-1253 We then use these subgoals to learn a multi-level policy by ***** hierarchical reinforcement learning *****. | ||
| 2020.aacl-main.77 In this paper, we present ExpanRL, an end-to-end ***** hierarchical reinforcement learning ***** (HRL) model for concept expansion in MOOCs. | ||
| W18-5015 We propose a multimodal ***** hierarchical reinforcement learning ***** framework that dynamically integrates vision and language for task-oriented visual dialog. | ||
| W17-2627 This mechanism is inspired by strategic attentive reader and writer (STRAW) model, a recent neural architecture for planning with ***** hierarchical reinforcement learning ***** that can also learn higher level temporal abstractions. | ||
| 2020.acl-main.27 We further propose a ***** hierarchical reinforcement learning ***** method to resolve the training difficulties of our proposed framework. | ||
| multi - turn response selection | 22 | |
| 2020.acl-main.127 We evaluate the method on single-turn and *****multi-turn response selection***** tasks for retrieval-based dialog systems. | ||
| 2021.naacl-main.122 During the *****multi-turn response selection*****, BERT focuses on training the relationship between the context with multiple utterances and the response. | ||
| P19-1006 Evaluation on two large-scale *****multi-turn response selection***** tasks has demonstrated that our proposed model significantly outperforms the state-of-the-art model. | ||
| 2020.coling-main.437 *****Multi-turn response selection***** has been extensively studied and applied to many real-world applications in recent years. | ||
| P19-1001 In particular, people study the problem by investigating context-response matching for *****multi-turn response selection***** based on publicly recognized benchmark data sets. | ||
| punctuation prediction | 22 | |
| 2012.iwslt-papers.15 We build a monolingual translation system from German to German implementing segmentation and *****punctuation prediction***** as a machine translation task. | ||
| 2007.iwslt-1.28 Our focus was threefold: using hierarchical phrase-based models in spoken language translation, the incorporation of sub-lexical information in model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and the use of n-gram sequence models for source-side *****punctuation prediction*****. | ||
| 2014.iwslt-papers.17 *****Punctuation prediction***** is an important task in spoken language translation and can be performed by using a monolingual phrase-based translation system to translate from unpunctuated to text with punctuation. | ||
| 2018.iwslt-1.15 We show how these components can be tightly coupled by encoding ASR confusion networks, as well as ASR-like noise adaptation, vocabulary normalization, and implicit *****punctuation prediction***** during translation. | ||
| 2011.iwslt-papers.7 *****Punctuation prediction***** is an important task in Spoken Language Translation. | ||
| argument search | 22 | |
| 2020.lrec-1.65 We present an approach to evaluate *****argument search***** techniques in view of their use in argumentative dialogue systems by assessing quality aspects of the retrieved arguments. | ||
| W19-4516 The model can be used to extract comparative sentences for pro/con argumentation in comparative / *****argument search***** engines or debating technologies. | ||
| P19-1054 We experiment with two recent contextualized word embedding methods (ELMo and BERT) in the context of open-domain *****argument search*****. | ||
| N18-5005 Argument mining is a core technology for enabling *****argument search***** in large corpora. | ||
| 2021.sigdial-1.39 To address this issue, we propose a combination of argumentative dialogue systems with *****argument search***** technology that enables a system to discuss any topic on which the search engine is able to find suitable arguments. | ||
| corpus - based | 22 | |
| W03-3019 We investigate an aspect of the relationship between parsing and *****corpus - based***** methods in NLP that has received relatively little attention: coverage augmentation in rule-based parsers. | ||
| L16-1698 This paper presents some work on direct and indirect speech in Portuguese using *****corpus - based***** methods: we report on a study whose aim was to identify (i) Portuguese verbs used to introduce reported speech and (ii) syntactic patterns used to convey reported speech, in order to enhance the performance of a quotation extraction system, dubbed QUEMDISSE?. | ||
| L14-1433 Multiword expressions (MWEs) are quite frequent in languages such as English, but their diversity, the scarcity of individual MWE types, and contextual ambiguity have presented obstacles to *****corpus - based***** studies and NLP systems addressing them as a class. | ||
| L10-1281 In order to utilize the *****corpus - based***** techniques that have proven effective in natural language processing in recent years, costly and time-consuming manual creation of linguistic resources is often necessary. | ||
| 2016.gwc-1.58 Le and Fokkens (2015) recently showed that taxonomy-based approaches are more reliable than *****corpus - based***** approaches in estimating human similarity ratings. | ||
| general | 22 | |
| C16-1071 The method makes it possible to abstract away from the individual feature occurrences by grouping features together that behave alike with respect to the target class, thus providing a new, more *****general***** perspective on the data. | ||
| 2020.emnlp-main.253 Whilst there has been growing progress in Entity Linking (EL) for *****general***** language, existing datasets fail to address the complex nature of health terminology in layman's language. | ||
| 2020.emnlp-main.636 Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning *****general***** representations that can then be further fine-tuned for downstream tasks to much success. | ||
| 2021.emnlp-main.160 In this paper, we propose efficient algorithms for the WordPiece tokenization used in BERT, from single-word tokenization to *****general***** text (e.g., sentence) tokenization. | ||
| 2021.ranlp-1.98 We present GeSERA, an open-source improved version of SERA for evaluating automatic extractive and abstractive summaries from the *****general***** domain. | ||
| named entity recognition ( NER ) | 22 | |
| 2020.acl-main.581 To better tackle the *****named entity recognition ( NER )***** problem on languages with little/no labeled data, cross-lingual NER must effectively leverage knowledge learned from source languages with rich labeled data. | ||
| D19-1672 Building *****named entity recognition ( NER )***** models for languages that do not have much training data is a challenging task. | ||
| L06-1218 This paper describes the development of CiceroArabic, the first wide-coverage *****named entity recognition ( NER )***** system for Modern Standard Arabic. | ||
| D18-1230 Recent advances in deep neural models allow us to build reliable *****named entity recognition ( NER )***** systems without handcrafting features. | ||
| N18-5012 Our VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, *****named entity recognition ( NER )***** and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. | ||
| Web | 22 | |
| 2020.lrec-1.692 In this paper, we present KGvec2go, a *****Web***** API for accessing and consuming graph embeddings in a light-weight fashion in downstream applications. | ||
| 2020.coling-main.36 HTML tags are typically discarded in free text Named Entity Recognition from *****Web***** pages . | ||
| L16-1716 In this paper we describe the new developments brought to LRE Map, especially in terms of the user interface of the *****Web***** application, of the searching of the information therein, and of the data model updates. | ||
| L14-1354 *****Web***** 2.0 has allowed a never imagined communication boom. | ||
| 2021.naacl-main.473 Multimodal summarization becomes increasingly significant as it is the basis for question answering , *****Web***** search , and many other downstream tasks . | ||
| automatic speech recognition ( ASR | 22 | |
| 2020.nlp4convai-1.8 Speech-based virtual assistants, such as Amazon Alexa, Google assistant, and Apple Siri, typically convert users' audio signals to text data through *****automatic speech recognition ( ASR*****) and feed the text to downstream dialog models for natural language understanding and response generation. | ||
| 2014.iwslt-papers.3 Word Confidence Estimation (WCE) for machine translation (MT) or *****automatic speech recognition ( ASR*****) consists in judging each word in the (MT or ASR) hypothesis as correct or incorrect by tagging it with an appropriate label. | ||
| 2018.iwslt-1.15 This work describes AppTek's speech translation pipeline that includes strong state-of-the-art *****automatic speech recognition ( ASR*****) and neural machine translation (NMT) components. | ||
| 2020.lrec-1.411 In this paper we present a first version of a transcription portal for audio files based on *****automatic speech recognition ( ASR*****) in various languages. | ||
| L14-1443 We present a dataset of telephone conversations in English and Czech, developed for training acoustic models for *****automatic speech recognition ( ASR*****) in spoken dialogue systems (SDSs). | ||
| fine - | 22 | |
| 2021.eacl-demos.22 Transfer learning, particularly approaches that combine multi-task learning with pre-trained contextualized embeddings and *****fine -***** tuning, have advanced the field of Natural Language Processing tremendously in recent years. | ||
| 2021.eacl-main.14 Language modeling with BERT consists of two phases of (i) unsupervised pre-training on unlabeled text, and (ii) *****fine -***** tuning for a specific supervised task. | ||
| 2021.insights-1.9 Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before *****fine -***** tuning has been shown to improve downstream tasks' performances. | ||
| 2021.acl-long.491 Event extraction (EE) has considerably benefited from pre-trained language models (PLMs) by *****fine -***** tuning. | ||
| 2021.emnlp-main.467 Many recent successes in sentence representation learning have been achieved by simply *****fine -***** tuning on the Natural Language Inference (NLI) datasets with triplet loss or siamese loss. | ||
| question answering ( QA ) | 22 | |
| C16-1226 Recent trend in *****question answering ( QA )***** systems focuses on using structured knowledge bases (KBs) to find answers. | ||
| P19-1610 Despite the advancement of *****question answering ( QA )***** systems and rapid improvements on held-out test sets, their generalizability is a topic of concern. | ||
| 2021.emnlp-main.754 Question generation has recently shown impressive results in customizing *****question answering ( QA )***** systems to new domains. | ||
| 2020.acl-main.662 In addition to the traditional task of machines answering questions, *****question answering ( QA )***** research creates interesting, challenging questions that help systems how to answer questions and reveal the best systems. | ||
| R17-1018 We propose to use *****question answering ( QA )***** data from Web forums to train chat-bots from scratch, i.e., without dialog data. | ||
| recognizers | 21 | |
| 2009.jeptalnrecital-recital.2 We present a performance comparison of two different types of Japanese and English grammar-based speech ***** recognizers *****. | ||
| D19-5907 We present a speech annotation interface CoSSAT, which helps annotators transcribe code-switched speech faster, more easily and more accurately than a traditional interface, by displaying candidate words from monolingual speech ***** recognizers *****. | ||
| L10-1463 Our evaluation is aimed at speech recognition consumers and potential consumers with limited experience with readily available ***** recognizers *****. | ||
| L10-1207 Current state-of-the-art systems for automatic phonetic transcription (APT) are mostly phone ***** recognizers ***** based on Hidden Markov models (HMMs). | ||
| W16-4712 However, due to the variety of human expressiveness, current state-of-the-art phenotype concept ***** recognizers ***** and automatic annotators struggle with specific domain issues and challenges. | ||
| Subtasks | 21 | |
| 2021.semeval-1.140 Our architecture, called DVTT (Double Visual Textual Transformer), approaches ***** Subtasks ***** 1 and 3 of Task 6 as multi-label classification problems, where the text and/or images of the meme are processed, and the probabilities of the presence of each possible persuasion technique are returned as a result. | ||
| 2021.semeval-1.108 This paper presents our systems for the three ***** Subtasks ***** of SemEval Task4: Reading Comprehension of Abstract Meaning (ReCAM). | ||
| 2020.semeval-1.65 Our systems rely on pre-trained language models, i.e., BERT (including its variants) and UniLM, and rank 10th and 7th among 27 and 17 systems on ***** Subtasks ***** B and C, respectively. | ||
| S19-2099 More work needs to be done on the unbalanced data problem in ***** Subtasks ***** B and C. Some future work is also discussed. | ||
| S17-2106 This paper describes the system we have used for participating in ***** Subtasks ***** A (Message Polarity Classification) and B (Topic-Based Message Polarity Classification according to a two-point scale) of SemEval-2017 Task 4 Sentiment Analysis in Twitter | ||
| adaption | 21 | |
| L12-1005 We show that by standard ***** adaption ***** techniques the recognition rate already rises from virtually zero to up to 51.7% and can be further improved by domain-specific rules to 47.9%. | ||
| 2020.lrec-1.780 To address this issue, we propose and investigate an approach that performs a robust acoustic model ***** adaption ***** to a target domain in a cross-lingual, multi-staged manner. | ||
| 2021.wmt-1.12 According to the final evaluation results, a deeper, wider, and stronger network can improve translation performance in general, yet our data domain ***** adaption ***** method can improve performance even more. | ||
| N19-1379 In this paper, we propose CoNDA, a neural-based approach for continuous domain ***** adaption ***** with normalization and regularization. | ||
| P18-3005 Despite a large body of research on the intersection of vision-language technology, its ***** adaption ***** to the medical domain is not fully explored. | ||
| adapting | 21 | |
| W17-2620 We conduct several systematic experiments ***** adapting ***** a Wav2Letter convolutional neural network originally trained for English ASR to the German language. | ||
| 2021.wmt-1.53 We find that forward/back-translation significantly improves the translation results, data selection and gradual fine-tuning are particularly effective during ***** adapting ***** domain, while knowledge distillation brings slight performance improvement. | ||
| 2021.emnlp-main.730 We first propose a base NAR model by directly ***** adapting ***** the common training scheme from its AutoRegressive (AR) counterpart. | ||
| P18-2050 In this paper, we propose a simple and parameter-efficient adaptation technique that only requires ***** adapting ***** the bias of the output softmax to each particular user of the MT system, either directly or through a factored approximation. | ||
| 2020.lt4hala-1.13 We address the problem of creating and evaluating quality Neo-Latin word embeddings for the purpose of philosophical research, ***** adapting ***** the Nonce2Vec tool to learn embeddings from Neo-Latin sentences | ||
| replicability | 21 | |
| 2021.sigtyp-1.8 In this paper we explore two machine-driven approaches for prefix and suffix statistics which are crude approximations, but have advantages in terms of time and ***** replicability *****. | ||
| 2020.inlg-1.29 This paper outlines our ideas for a shared task on reproducibility of human evaluations in NLG which aims (i) to shed light on the extent to which past NLG evaluations are replicable and reproducible, and (ii) to draw conclusions regarding how evaluations can be designed and reported to increase ***** replicability ***** and reproducibility. | ||
| R19-1089 With recent efforts in drawing attention to the task of replicating and/or reproducing results, for example in the context of COLING 2018 and various LREC workshops, the question arises how the NLP community views the topic of ***** replicability ***** in general. | ||
| L10-1285 Furthermore, we describe in more detail how DeReKo deals with the fact that all its texts are subject to third parties' intellectual property rights, and how it deals with the issue of ***** replicability *****, which is particularly challenging given DeReKo's dynamic growth and the possibility to construct from it an open number of virtual corpora. | ||
| 2020.insights-1.15 We propose best practices to increase the ***** replicability ***** of NER evaluations by increasing transparency regarding the handling of improper label sequences | ||
| supertagging | 21 | |
| 2000.iwpt-1.9 We show that certain strategies yield an improved extracted LTAG in terms of compactness, broad coverage, and ***** supertagging ***** accuracy. | ||
| W18-6528 Hypertagging, or ***** supertagging ***** for surface realization, is the process of assigning lexical categories to nodes in an input semantic graph. | ||
| P18-2099 Our constraints accelerate both PCFG and TAG parsing, and combine effectively with other pruning techniques (coarse-to-fine and ***** supertagging *****) for an overall speedup of two orders of magnitude, while improving accuracy. | ||
| 2020.findings-emnlp.406 In this paper, we begin an empirical investigation: we train the ***** supertagging ***** model of Vaswani et al. | ||
| 1997.iwpt-1.22 In previous work we introduced the idea of ***** supertagging ***** as a means of improving the efficiency of a lexicalized grammar parser | ||
| specification | 21 | |
| L12-1512 It is often argued that a set of standard linguistic processing functionalities should be identified, with each of them given a formal ***** specification *****. | ||
| 2015.jeptalnrecital-court.9 The difficulty may stem from a poor ***** specification ***** of the keyword assignment task in view of the rank-based approach. | ||
| L14-1104 The focus of this paper lies on the construction of an integrated ontology, TMO, the TrendMiner Ontology, that has been assembled from several independent multilingual taxonomies and ontologies which are brought together by an interface ***** specification *****, expressed in OWL. | ||
| L12-1097 As a further consequence, the experiment has led to a richer ***** specification ***** of the editor guidelines to be used in the last compilation phase of the thesaurus. | ||
| 1991.mtsummit-papers.2 The paper focuses on seven selected topics: recent enhancements made in the Slot Grammar formalism and the specific analysis components; ***** specification ***** of a semantic type hierarchy and its use for verb sense disambiguation; incorporation of statistical techniques in the translation process; anaphora resolution; linkage of target morphology modules; methods for the construction of large MT lexicons; and interactive disambiguation | ||
| inconsistency | 21 | |
| D19-5517 Although consumer-generated texts are valuable since they contain a great number and wide variety of user evaluations, spelling ***** inconsistency ***** and the variety of expressions make analysis difficult. | ||
| L14-1651 An investigation of issues arising from a natural ***** inconsistency ***** within social media data found that machine learning algorithms tend to over fit to the data because Twitter contains a lot of repetition in the form of retweets. | ||
| L16-1629 Possible applications of methods for ***** inconsistency ***** detection are improving the annotation procedure as well as the guidelines and correcting errors in completed annotations. | ||
| 1963.earlymt-1.29 The original scheme of following each path until it terminates either in an analysis or in a grammatical ***** inconsistency ***** has been considerably improved through the incorporation of two path-testing techniques. | ||
| 2020.emnlp-main.749 However, system-generated abstractive summaries often face the pitfall of factual ***** inconsistency *****: generating incorrect facts with respect to the source text | ||
| macro | 21 | |
| 2020.semeval-1.204 On the test set, our BERT classifier obtained ***** macro ***** F1 score of 0.90707 for subtask A, and 0.65279 for subtask B. The BiLSTM classifier obtained ***** macro ***** F1 score of 0.57565 for subtask C. | ||
| 2020.semeval-1.158 The experiment results show SVM achieved 35% for its F1 ***** macro *****, which is 0.132 points or 13.2% above the baseline model. | ||
| W19-4626 Our top model is able to achieve an F1 ***** macro ***** averaged score of 65.66 on MADAR's small-scale parallel corpus of 25 dialects and Modern Standard Arabic (MSA). | ||
| 2021.case-1.15 We achieved an average ***** macro ***** F1 of 0.65 in subtask 1 (i.e., document level classification), and a ***** macro ***** F1 of 0.70 in subtask 2 (i.e., sentence level classification). | ||
| L16-1520 By combining these two online applications, ***** macro *****- and micro-analyses of dialectal data (respectively offered by Gabmap and ALT-Web) are effectively and dynamically combined | ||
| distinctions | 21 | |
| L12-1443 Our extensible scheme is designed to account for ***** distinctions ***** between claims, performatives, atypical uses of factivity, and the authority of the one making the utterance. | ||
| D19-1359 By capturing sense ***** distinctions ***** evoked by syntagmatic relations, SyntagNet enables knowledge-based WSD systems to establish a new state of the art which challenges the hitherto unrivaled performances attained by supervised approaches. | ||
| L06-1374 This joint development allows for better motivated sense ***** distinctions *****, and a tighter coupling between both resources. | ||
| W19-5950 More precisely, we translate the discourse relations into a set of values for attributes based on ***** distinctions ***** used in the mappings between discourse frameworks proposed by Sanders et al. | ||
| L08-1368 Examples of translation outputs illustrate how considering gender and number ***** distinctions ***** in the POS tagset can be relevant | ||
| theoretically | 21 | |
| 2020.coling-main.546 The effectiveness of the representations is analyzed ***** theoretically ***** by a proposed framework. | ||
| W19-3520 It is innovative as it combines individual, situational, and social-structural determinants of online aggression and tries to ***** theoretically ***** derive their interplay. | ||
| 2020.figlang-1.22 Word-level annotation is poorly grounded ***** theoretically ***** and is harder to use in downstream tasks such as metaphor interpretation. | ||
| 2021.iwcs-1.15 This paper explores the `chicken-or-egg' problem of interdependencies between these components ***** theoretically ***** and practically. | ||
| 2020.emnlp-main.733 We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task ***** theoretically *****, and then analyze the BERT sentence embeddings empirically | ||
| moreover | 21 | |
| 2021.inlg-1.10 In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than with ordinary supervised baseline; ***** moreover *****, by leveraging pretrained models, data efficiency can be increased further to fifty times. | ||
| 2020.lrec-1.41 We focus on a specific error type, namely linking adverbial (e.g. however, ***** moreover *****) errors. | ||
| P18-1235 Deep convolutional neural networks excel at sentiment polarity classification, but tend to require substantial amounts of training data, which ***** moreover ***** differs quite significantly between domains. | ||
| 2020.findings-emnlp.202 Medical datasets are commonly imbalanced in their finding labels because incidence rates differ among diseases; ***** moreover *****, the ratios of abnormalities to normalities are significantly imbalanced. | ||
| P18-2075 In addition to directly optimizing for a tree-level metric such as F1, policy gradient has the potential to reduce exposure bias by allowing exploration during training; ***** moreover *****, it does not require a dynamic oracle for supervision | ||
| duration | 21 | |
| 2020.findings-emnlp.302 We introduce two effective models for ***** duration ***** prediction, which incorporate external knowledge by reading temporal-related news sentences (time-aware pre-training). | ||
| W16-4111 It is an elastic measure that takes into account idiosyncratic pause ***** duration ***** of translators as well as further confounds such as bi-gram frequency, letter frequency and some motor tasks involved in writing. | ||
| 2020.lrec-1.802 The resulting datasets after data extraction with Praat scripts (Boersma and Weenink, 2019) are analysed with R (R Core Team, 2017), focusing on ***** duration *****. | ||
| 2020.acl-main.739 Regarding possession ***** duration *****, we derive the time spans we work with empirically from annotations indicating lower and upper bounds. | ||
| P19-1588 First, human attention is represented by the reading ***** duration ***** estimated from eye-tracking corpus | ||
| Crucially | 21 | |
| 2020.lrec-1.221 ***** Crucially *****, downsizing the linguistic sample to about 30% of the original dataset does not diminish the discriminatory performance of the classifier. | ||
| N18-1197 ***** Crucially *****, our models do not require language data to learn these concepts: language is used only in pretraining to impose structure on subsequent learning. | ||
| 2021.emnlp-main.506 ***** Crucially *****, we develop a largely automated pipeline for constructing suitable training examples from Wikipedia. | ||
| L12-1222 ***** Crucially ***** it makes use of the universal language of images to identify action types, avoiding the underdeterminacy of semantic definitions. | ||
| D18-1199 ***** Crucially *****, we show that distributional semantics is a helpful heuristic for distinguishing the literal usage of idioms, giving us a way to formulate a literal usage metric to estimate the likelihood that the idiom is intended literally | ||
| rephrasing | 21 | |
| P17-1006 These effects are detrimental for language understanding systems, which may infer that `inexpensive' is a ***** rephrasing ***** for `expensive' or may not associate `acquire' with `acquires'. | ||
| 2021.emnlp-main.500 An important task in NLP applications such as sentence simplification is the ability to take a long, complex sentence and split it into shorter sentences, ***** rephrasing ***** as necessary. | ||
| 2021.mtsummit-research.14 Our findings indicate that, despite the fact that some of the identified phenomena depend on domain and/or language, the following set of phenomena can be considered as generally challenging for modern MT systems: ***** rephrasing ***** groups of words, translation of ambiguous source words, translating noun phrases, and mistranslations. | ||
| P19-1287 In contrast, human translators often refer to reference data, either ***** rephrasing ***** the intricate sentence fragments with common terms in source language, or just accessing to the golden translation directly. | ||
| P19-1333 We present an approach for recursively splitting and ***** rephrasing ***** complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE) | ||
| personalization | 21 | |
| P18-1206 We propose a scalable neural model architecture with a shared encoder, a novel attention mechanism that incorporates ***** personalization ***** information and domain-specific classifiers that solves the problem efficiently. | ||
| 2020.acl-main.313 Knowledge graph embedding methods often suffer from a limitation of memorizing valid triples to predict new ones for triple classification and search ***** personalization ***** problems. | ||
| 2021.emnlp-main.421 We further discuss possible harms and hazards around such ***** personalization *****, and argue that value-sensitive design represents a crucial path forward through these challenges. | ||
| R19-1024 In Natural Language Generation systems, ***** personalization ***** strategies - i.e, the use of information about a target author to generate text that (more) closely resembles human-produced language - have long been applied to improve results. | ||
| 2021.emnlp-main.541 It has previously been shown that ***** personalization ***** through model fine-tuning substantially improves performance | ||
| characterizing | 21 | |
| 2020.acl-main.191 In many real scenarios, obtaining high-quality annotated data is expensive and time consuming; in contrast, unlabeled examples ***** characterizing ***** the target task can be, in general, easily collected. | ||
| 2021.reinact-1.8 We applied this system to a particular task of ***** characterizing ***** spatial configurations of blocks in a simple physical Blocks World (BW) domain using natural locative expressions, as well as generating justifications for the proposed spatial descriptions by indicating the factors that the system used to arrive at a particular conclusion. | ||
| 2021.eacl-main.169 Therefore, to bootstrap further their domain adaptation, we propose a simple yet unexplored approach, which we call biomedical entity-aware masking (BEM) strategy, encouraging masked language models to learn entity-centric knowledge based on the pivotal entities ***** characterizing ***** the domain at hand, and employ those entities to drive the LM fine-tuning. | ||
| 2020.coling-main.35 Predicting stance changes require ***** characterizing ***** both aspects and the interaction between them, especially in realistic settings in which stance changes are very rare. | ||
| W16-5414 Adding MWE variants/tokens to a dictionary resource requires ***** characterizing ***** the flexibility among other morphosyntactic features | ||
| attentive | 21 | |
| N19-1371 Results using a suite of baseline models — ranging from heuristic (rule-based) approaches to ***** attentive ***** neural architectures — demonstrate the difficulty of the task, which we believe largely owes to the lengthy, technical input texts. | ||
| N19-1122 In addition to the standard recurrent neural network, we introduce a novel ***** attentive ***** recurrent network to leverage the strengths of both attention models and recurrent networks. | ||
| 2020.findings-emnlp.378 In this paper, we improve NER by leveraging different types of syntactic information through ***** attentive ***** ensemble, which functionalizes by the proposed key-value memory networks, syntax attention, and the gate mechanism for encoding, weighting and aggregating such syntactic information, respectively. | ||
| E17-1119 Despite being a natural comparison and addition, previous work on ***** attentive ***** neural architectures have not considered hand-crafted features and we combine these with learnt features and establish that they complement each other. | ||
| N19-1048 In an evaluation on four tasks, we show that ***** attentive ***** mimicking outperforms previous work for both rare and medium-frequency words | ||
| complementarity | 21 | |
| 2020.semeval-1.19 To this end, ***** complementarity ***** between 768- and 1024-dimensional BERT embeddings, and average word sense vectors were used. | ||
| P17-1004 To address this issue, we introduce a multi-lingual neural relation extraction framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and ***** complementarity ***** among cross-lingual texts. | ||
| 2021.naacl-main.113 Although some recent works show potential ***** complementarity ***** among different state-of-the-art systems, few works try to investigate this problem in text summarization. | ||
| 2021.spnlp-1.6 Our experiments with different claim classifiers on a German immigration newspaper corpus show consistent performance increases for joint prediction, in particular for infrequent categories and discuss the ***** complementarity ***** of the two approaches. | ||
| L06-1299 Thus, the commonalities and the ***** complementarity ***** of the lexical databases are more readily apparent | ||
| naturalistic | 21 | |
| L08-1075 Rather, people appear to rely on a large body of folk knowledge in the form of stereotypical associations, clichés and other kinds of ***** naturalistic ***** descriptions, many of which express views of the world that are second-hand, overly-simplified and, in some cases, non-literal to the point of being poetic. | ||
| 2020.findings-emnlp.358 We also propose new and more effective testbeds for both datasets, by introducing ***** naturalistic ***** variation by the user. | ||
| W19-3647 Using over 30 hours of ***** naturalistic ***** data (from 28 speakers in 5 Nigerian cities), the procedures for segmenting audio files into phonemic units via the Munich Automatic Segmentation System (MAUS), and the extraction of their spectral values in Praat are explained. | ||
| 2020.acl-main.180 Applying this measure to ***** naturalistic ***** speech corpora, we find evidence suggesting that speakers alter their productions to make contextually more confusable words easier to understand. | ||
| L14-1285 The LAST MINUTE corpus comprises records and transcripts of ***** naturalistic ***** problem solving dialogs between N = 130 subjects and a companion system simulated in a Wizard of Oz experiment | ||
| LongSumm | 21 | |
| 2020.sdp-1.41 This paper presents our methods for the ***** LongSumm ***** 2020: | ||
| 2020.sdp-1.25 Our system participates in two shared tasks, CL-SciSumm 2020 and ***** LongSumm ***** 2020. | ||
| 2020.sdp-1.39 On blind test corpora, our system ranks first and third for the ***** LongSumm ***** and LaySumm tasks respectively. | ||
| 2021.sdp-1.12 The ***** LongSumm ***** task requires participants to generate a long summary for a scientific document | ||
| 2020.sdp-1.27 In this paper, we present the IIIT Bhagalpur and IIT Patna team's effort to solve the three shared tasks, namely CL-SciSumm 2020, CL-LaySumm 2020, and *****LongSumm***** 2020 at SDP 2020. | ||
| ADAPT | 21 | |
| I17-4010 We describe the work of a team from the ***** ADAPT ***** Centre in Ireland in addressing automatic answer selection for the Multi-choice Question Answering in Examinations shared task. | ||
| 2020.msr-1.3 In this paper, we describe the ***** ADAPT ***** submission to the Surface Realization Shared Task 2020. | ||
| 2020.ngt-1.17 This paper describes the ***** ADAPT ***** Centre's submission to STAPLE (Simultaneous Translation and Paraphrase for Language Education) 2020, a shared task of the 4th Workshop on Neural Generation and Translation (WNGT), for the English-to-Portuguese translation task. | ||
| 2020.wmt-1.91 This paper describes the ***** ADAPT ***** Centre's submissions to the WMT20 Biomedical Translation Shared Task for English-to-Basque | ||
| 2018.iwslt-1.11 In this paper we present the *****ADAPT***** system built for the Basque to English Low Resource MT Evaluation Campaign. | ||
| HTML | 21 | |
| 2020.wac-1.3 However, two critical steps in the development of web corpora remain challenging: the identification of clean text from source ***** HTML ***** and the assignment of genre or register information to the documents. | ||
| L12-1250 Most of them are written in ***** HTML ***** and have to be rendered by an ***** HTML ***** engine in order to display the data they contain on a screen. | ||
| L06-1047 If you want to read the ***** HTML ***** page, for instance http://www.memodata.com, double click on any word at random | ||
| D18-1099 However, this organization is generally discarded during text collection, and collecting it is not straightforward: the same visual organization can be implemented in a myriad of different ways in the underlying ***** HTML *****. | ||
| L12-1021 The ***** HTML ***** rendering is fully preserved and all annotations consist in new ***** HTML ***** spans with specific styles | ||
| crawled | 21 | |
| 2020.lrec-1.155 The corpus was ***** crawled ***** and scraped from the public domain (SEC filings) and is, to the best of our knowledge, the first freely available corpus of its kind. | ||
| 2020.acl-demos.20 Applying our tool leads to improved translation quality while significantly reducing the size of the training data, also clearly outperforming an alternative ranking given in the ***** crawled ***** data set. | ||
| 2021.acl-long.169 We use Jupyter notebooks containing visualization programs ***** crawled ***** from GitHub to train PlotCoder. | ||
| 2013.iwslt-evaluation.11 Due to the lack of expertly transcribed acoustic speech data for German, acoustic model training had to be performed on publicly available data ***** crawled ***** from the internet. | ||
| C16-1187 After a study of social conversation data ***** crawled ***** from the web, we observed that some characteristics estimated from the responses of messages are discriminative for identifying context dependent messages | ||
| conditional | 21 | |
| 2021.acl-long.91 In Neural Machine Translation (and, more generally, ***** conditional ***** language modeling), the generation of a target token is influenced by two types of context: the source and the prefix of the target sequence. | ||
| 2021.eacl-main.18 Additionally, we devise two mechanisms to alleviate the two common problems of vanilla NAG models: the inflexibility of prefixed output length and the ***** conditional ***** independence of individual token predictions. | ||
| 2020.acl-main.676 We propose a simple data augmentation protocol aimed at providing a compositional inductive bias in ***** conditional ***** and un***** conditional ***** sequence models. | ||
| 2020.acl-main.316 We also generalize our method to ***** conditional ***** language modeling and propose Coupled-CVAE, which largely improves the diversity of dialogue generation on the Switchboard dataset. | ||
| D18-1527 We address these concerns with a model that incorporates document covariates to estimate ***** conditional ***** word embedding distributions | ||
| phenomena | 21 | |
| 2021.blackboxnlp-1.41 This top-down approach is, however, costly when we have no probable hypothesis on the association between the target model component and ***** phenomena *****. | ||
| L16-1692 The paper shows that by including TEI-XML structuring in corpus-based analyses significances can be observed for different linguistic ***** phenomena *****, as e.g. the development of conceptual text structures themselves, the syntactic embedding of terms in certain conceptual text structures, and ***** phenomena ***** of language change which become obvious via the layout of a text. | ||
| L16-1237 It comes with guidelines for the manual POS annotation of transcripts of German spoken data and an extended version of the STTS (Stuttgart Tübingen Tagset) which accounts for ***** phenomena ***** typically found in spontaneous spoken German. | ||
| W19-4819 We study two natural language ***** phenomena *****: center embedding sentences and syntactic island constraints on the filler–gap dependency. | ||
| L10-1597 At last, we conduct an in-depth analysis of the effective errors of the CasEN system, providing us with some useful indications about ***** phenomena ***** that gave rise to errors (e.g. metonymy, encapsulation, detection of right boundaries) and are as many challenges for named entity recognition systems | ||
| explanatory | 21 | |
| L08-1551 As a result of the task, Spanish WN has been shown to exhibit 1) lack of ***** explanatory ***** clarity (it does not define word meanings, but glosses and exemplifies them instead; it does not systematically encode metaphoric meanings, either); 2) structural inadequacy (some words appear as hyponyms of another sense of the same word; sometimes there even coexist in Spanish WN a general sense and a specific one related to the same concept, but with no structural link in between; hyperonymy relationships have been detected that are likely to raise doubts to human annotators; there can even be found cases of auto-hyponymy); 3) cross-linguistic inconsistency (there exist in English EWN concepts whose lexical equivalent is missing in Spanish WN; glosses in one language more often than not contradict or diverge from glosses in another language). | ||
| L16-1408 It comprises a large ***** explanatory ***** dictionary of more than 250,000 entries that are derived from more than 280 external sources. | ||
| L06-1404 The work is based on a Maximum Entropy model of stochastic resolution of grammatical conflicting constraints, and is demonstrably capable of putting ***** explanatory ***** theoretical accounts to the challenging test of an extensive, usage-based empirical verification. | ||
| 2020.semeval-1.73 For sentence generation, we used Neural Machine Translation (NMT) model to generate ***** explanatory ***** sentences. | ||
| 1998.amta-papers.19 I survey the research and practice of SI, and note that ***** explanatory ***** analyses of SI do not yet exist | ||
| transductive | 21 | |
| 2021.naacl-main.333 They are also ***** transductive ***** in nature, thus cannot handle out-of-graph documents. | ||
| D19-1379 In ***** transductive ***** learning, an unlabeled test set is used for model training. | ||
| 2021.emnlp-main.195 The proposed ***** transductive ***** learning approach is general and effective to the task of unsupervised style transfer, and we will apply it to the other two typical methods in the future. | ||
| 2020.acl-main.746 We introduce a ***** transductive ***** model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores. | ||
| C18-1125 In order to deal with this, we present the character-based decoder part of a multilingual approach based on ***** transductive ***** transfer learning for a historical handwriting recognition task on Italian Comedy Registers | ||
| collocational | 21 | |
| W17-1706 In our approach, priority is given to parsing alternatives involving collocations, and hence ***** collocational ***** information helps the parser through the maze of alternatives, with the aim to lead to substantial improvements in the performance of both tasks (collocation identification and parsing), and in that of a subsequent task (machine translation). | ||
| L12-1332 Word sketches are one-page, automatic, corpus-based summaries of a word's grammatical and ***** collocational ***** behaviour. | ||
| W89-0240 There are a number of ***** collocational ***** constraints in natural languages that ought to play a more important role in natural language parsers | ||
| 2020.mwe-1.13 This research investigates the *****collocational***** errors made by English learners in a learner corpus. | ||
| W19-4709 This paper introduces a novel method to track *****collocational***** variations in diachronic corpora that can identify several changes undergone by these phraseological combinations and to propose alternative solutions found in later periods. | ||
| coreferential | 21 | |
| N18-2055 We observed that the central event of a document usually has many ***** coreferential ***** event mentions that are scattered throughout the document for enabling a smooth transition of subtopics. | ||
| 2020.emnlp-main.209 In these languages, the direct translation of `the doctor removed his mask' is not ambiguous between a ***** coreferential ***** reading and a disjoint reading. | ||
| 2020.crac-1.15 We present a study focusing on variation of ***** coreferential ***** devices in English original TED talks and news texts and their German translations. | ||
| 2020.crac-1.2 By training a classifier on a combination of lexical and semantic features, we show that resolving the ***** coreferential ***** relations prior to classification is beneficial in a joint optimization setup. | ||
| L14-1701 This paper presents three corpora with *****coreferential***** annotation of person entities for Portuguese, Galician and Spanish. | ||
| connectionist | 21 | |
| W89-0224 Application of this type of ***** connectionist ***** model to the area of spoken language processing is discussed. | ||
| D18-1336 We present a novel non-autoregressive architecture based on ***** connectionist ***** temporal classification and evaluate it on the task of neural machine translation. | ||
| 2020.wanlp-1.8 We employ a recurrent neural network (RNN), combined with the ***** connectionist ***** temporal classification (CTC) loss to deal with unequal input/output lengths. | ||
| 1991.iwpt-1.14 A *****connectionist***** network is defined that parses a grammar in Chomsky Normal Form in logarithmic time, based on a modification of Rytter's recognition algorithm. | ||
| D18-1073 The use of *****connectionist***** approaches in conversational agents has been progressing rapidly due to the availability of large corpora. | ||
| hyperbolic | 21 | |
| 2021.emnlp-main.657 In contrast, ***** hyperbolic ***** models are effective at modeling hierarchical relations, but do not perform as well on patterns on which circular rotation excels. | ||
| 2021.naacl-main.36 Extensive experiments in language modeling, unaligned style transfer, and dialog-response generation demonstrate the effectiveness of the proposed APo-VAE model over VAEs in Euclidean latent space, thanks to its superb capabilities in capturing latent language hierarchies in ***** hyperbolic ***** space. | ||
| W17-4904 We learn ***** hyperbolic ***** adjective patterns that are representative of the strongly-valenced expressive language commonly used in either positive or negative reviews. | ||
| 2020.acl-main.283 Second, Hyperbolic Dynamic Routing (HDR) is introduced to aggregate ***** hyperbolic ***** capsules in a label-aware manner, so that the label-level discriminative information can be preserved along the depth of neural networks. | ||
| 2021.conll-1.37 Distributional representation has become a fundamental approach for encoding word relationships, particularly embeddings in ***** hyperbolic ***** space showed great performance in representing hierarchies by taking advantage of their spatial properties | ||
| Argument Reasoning Comprehension | 21 | |
| S18-1184 This paper presents our submissions to SemEval 2018 Task 12: the ***** Argument Reasoning Comprehension ***** Task. | ||
| S18-1194 This paper describes our system in SemEval-2018 task 12: ***** Argument Reasoning Comprehension *****. | ||
| 2020.lrec-1.622 This paper reports on the scientific reproduction of several systems addressing the ***** Argument Reasoning Comprehension ***** Task of SemEval2018 | ||
| S18-1189 This paper describes the system submitted to SemEval-2018 Task 12 (The *****Argument Reasoning Comprehension***** Task). | ||
| S18-1185 The *****Argument Reasoning Comprehension***** Task is a difficult challenge requiring significant language understanding and complex reasoning over world knowledge. | ||
| unified | 21 | |
| L12-1206 In contrast to other national corpora, it is conceptualised as a linked collection of many existing and future language resources representing language use in Australia, ***** unified ***** through common technical standards. | ||
| 2020.mwe-1.4 Previous studies ***** unified ***** these two constructions under a single semantic analysis and adopted either a mereological or a scalar approach. | ||
| 2020.sigdial-1.8 ConvoKit provides a ***** unified ***** framework for representing and manipulating conversational data, as well as a large and diverse collection of conversational datasets. | ||
| 2020.acl-main.183 Our experiments show that multi-tasking over several tasks that focus on particular capabilities results in better blended conversation performance compared to models trained on a single skill, and that both ***** unified ***** or two-stage approaches perform well if they are constructed to avoid unwanted bias in skill selection or are fine-tuned on our new task. | ||
| 2021.acl-long.188 Based on the ***** unified ***** formulation, we exploit the pre-training sequence-to-sequence model BART to solve all ABSA subtasks in an end-to-end framework | ||
| correct | 21 | |
| D17-1131 Given a question and a set of answer candidates, answer triggering determines whether the candidate set contains any ***** correct ***** answers. | ||
| 2021.emnlp-main.287 Many models utilize a predefined confusion set to learn a mapping between ***** correct ***** characters and its visually similar or phonetically similar misuses but the mapping may be out-of-domain. | ||
| L06-1049 The machine trained sets of lemmatisation rules are very easy to produce without having linguistic knowledge given that one has ***** correct ***** training data. | ||
| 2020.acl-srw.31 Unlike other languages, Japanese poses unique challenges: (1) Japanese texts are unsegmented so that we cannot simply apply a spelling checker, and (2) the way people inputting kanji logographs results in typos with drastically different surface forms from ***** correct ***** ones. | ||
| P18-2053 However, most of the existing neural machine translation models only use one of the ***** correct ***** translations as the targets, and the other ***** correct ***** sentences are punished as the in***** correct ***** sentences in the training stage | ||
| semantic dependencies | 21 | |
| 2020.coling-main.170 To generate correct answers, the comprehension of the ***** semantic dependencies ***** among implicit visual and textual contents is critical. | ||
| L14-1495 In this paper, we present a data-driven approach to model ***** semantic dependencies ***** between medical concepts, qualified by the beliefs of physicians. | ||
| C16-1203 In this paper, we study ***** semantic dependencies ***** between verbs and their arguments by modeling selectional preferences in the context of machine translation. | ||
| 2020.acl-demos.35 Abstract Meaning Representation (AMR) (Banarescu et al., 2013) is a framework for ***** semantic dependencies ***** that encodes its rooted and directed acyclic graphs in a format called PENMAN notation. | ||
| 2020.acl-main.518 In this paper, we leverage the power of pre-trained language models for improving video-grounded dialogue, which is very challenging and involves complex features of different dynamics: (1) Video features which can extend across both spatial and temporal dimensions; and (2) Dialogue features which involve ***** semantic dependencies ***** over multiple dialogue turns | ||
| persona | 21 | |
| 2020.emnlp-main.739 Since such a choice is not observed in the data, we model it using a discrete latent random variable and use variational learning to sample from hundreds of ***** persona ***** expansions. | ||
| 2020.findings-emnlp.324 Open-domain dialogue systems frequently suffer from generic responses that do not characterize ***** persona *****l stories, so we look to infuse conversations with ***** persona ***** information by mimicking prototype conversations. | ||
| 2021.eacl-main.44 Our automatic and human evaluations on the PersonaChat corpus confirm that our approach increases the rate of responses that are factually consistent with ***** persona ***** facts over its supervised counterpart while retains the language quality of responses. | ||
| 2020.emnlp-main.531 Notably, our results show that ***** persona ***** improves empathetic responding more when CoBERT is trained on empathetic conversations than non-empathetic ones, establishing an empirical link between ***** persona ***** and empathy in human conversations. | ||
| D19-1193 Experimental results on PERSONA-CHAT dataset show that the DIM model outperforms its baseline model, i.e., IMN with ***** persona ***** fusion, by a margin of 14.5% and outperforms the present state-of-the-art model by a margin of 27.7% in terms of top-1 accuracy hits@1 | ||
| component | 21 | |
| W17-5536 However, the complexity of these ***** component *****s hinders developers from determining which ***** component ***** causes an error. | ||
| P19-3011 As a showcase, we extend the MultiWOZ dataset with user dialog act annotations to train all ***** component ***** models and demonstrate how ConvLab makes it easy and effortless to conduct complicated experiments in multi-domain end-to-end dialog settings. | ||
| 2020.ccl-1.96 A clause complex consists of clauses, which are connected by ***** component ***** sharing relations and logic-semantic relations. | ||
| 2020.iwdp-1.9 This paper clarifies the existence of ***** component ***** sharing mechanism in both English and Chinese clause complexes, illustrates the differences in ***** component ***** sharing between the two languages, and introduces a formal annotation scheme to represent clause-complex level structural transformations. | ||
| D19-1149 However, it is still unclear which ***** component ***** dominates the process of disambiguation | ||
| evaluation metric | 21 | |
| 2021.naacl-main.90 While traditional corpus-level ***** evaluation metric *****s for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy. | ||
| 2021.hackashop-1.19 In the 2021 Embeddia Hackathon, we implemented one novel, normative theory-based ***** evaluation metric *****, “activation”, and use it to compare two recommendation strategies of New York Times comments, one based on user likes and another on editor picks. | ||
| 2020.acl-main.450 QAGS has substantially higher correlations with these judgments than other automatic ***** evaluation metric *****s. | ||
| 2021.humeval-1.9 Our contributions include the annotated dataset that we make publicly available and the proposal of Success Rate @k as an ***** evaluation metric ***** that is more appropriate than the traditional QA's and information retrieval's metrics. | ||
| L10-1211 The evaluation itself is processed in a UIMA component, users can create and plug their own ***** evaluation metric *****s in addition to the predefined metrics. | ||
| agglutinative languages | 21 | |
| 2021.ranlp-1.31 Character-based word-segmentation models have been extensively applied to ***** agglutinative languages *****, including Thai, due to their high performance. | ||
| P17-2105 Our approach improves on the language-dependent state of the art for two ***** agglutinative languages ***** (Turkish and Kazakh) and can be potentially applied to other morphologically complex languages. | ||
| L06-1442 In ***** agglutinative languages ***** each morpheme is concatenatively added on to form a complete morphological structure. | ||
| 2020.wmt-1.21 Our main system, and only submission, is based on a multilingual approach, jointly training a Transformer model on several ***** agglutinative languages *****. | ||
| L04-1259 Highly inflectional/***** agglutinative languages ***** like Hungarian typically feature possible word forms in such a magnitude that automatic methods that provide morphosyntactic annotation on the basis of some training corpus often face the problem of data sparseness. | ||
| patterns | 21 | |
| L14-1708 Workflow languages focus on expressive power of the languages to describe variety of workflow ***** patterns ***** to meet users' needs. | ||
| 2020.pam-1.2 We use this model to develop a socio-semantic theory of conventionalised reasoning ***** patterns *****, known as topoi. | ||
| 2020.nlpcss-1.14 This limits their use for understanding the dynamics, ***** patterns ***** and prevalence of online abuse. | ||
| S17-2166 Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance ***** patterns ***** based on labeled keyphrases, is proposed as an incremental feature set to enhance the conventional Named Entity Recognition feature sets. | ||
| 1998.amta-papers.33 The approach includes two main tasks: finding ***** patterns ***** and formulating rules to automate the translation of English terms into Spanish terms. | ||
| learn | 21 | |
| D17-2003 Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and ***** learn *****ing of linguistics. | ||
| D19-1212 Multi-view ***** learn *****ing algorithms are powerful representation ***** learn *****ing tools, often exploited in the context of multimodal problems. | ||
| 2020.aacl-main.29 To resolve the cold start problem in training, we propose a method using a pseudo data generator which generates pseudo texts and KB triples for ***** learn *****ing an initial model. | ||
| W19-4447 In view of the influence of the first language on ***** learn *****ers, we further propose an effective approach to improve the quality of the suggested sentences. | ||
| P19-1516 In this paper, we propose a neural network inspired multi- task ***** learn *****ing framework that can simultaneously extract ADRs from various sources. | ||
| recent | 21 | |
| W19-4324 The most ***** recent ***** successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. | ||
| L14-1017 The resultant data has also been ***** recent *****ly used in disfluency studies across domains. | ||
| L14-1031 The second method makes use of ***** recent ***** advances in distributional similarity representation to transfer existing norms to their closest neighbors in a high-dimensional vector space. | ||
| 2006.amta-papers.28 Discriminative training methods have ***** recent *****ly led to significant advances in the state of the art of machine translation (MT). | ||
| E17-1019 Moreover, we explore the utilization of the ***** recent *****ly proposed Word Mover's Distance (WMD) document metric for the purpose of image captioning. | ||
| collection | 21 | |
| L10-1362 Question answering (QA) systems aim at retrieving precise information from a large ***** collection ***** of documents. | ||
| 2020.lrec-1.218 A weak TLS algorithm can even match a stronger one by employing a stronger IR method in the data ***** collection ***** phase. | ||
| L14-1186 Information about imageability of words can be obtained from the MRC Psycholinguistic Database (MRCPD) for English words and Léxico Informatizado del Español Programa (LEXESP) for Spanish words, which is a ***** collection ***** of human ratings obtained in a series of controlled surveys. | ||
| L12-1299 What would be a good method to provide a large ***** collection ***** of semantically annotated texts with formal, deep semantics rather than shallow? | ||
| 2020.acl-demos.5 We present a large improvement over classic search engine baseline on several standard QA datasets and provide the community a collaborative data ***** collection ***** tool to curate the first natural language processing research QA dataset via a community effort. | ||
| automatic analysis | 21 | |
| W18-0908 In this era of web 2.0, ***** automatic analysis ***** of sarcasm and metaphors is important for their extensive usage. | ||
| W19-2206 We used Natural Language Processing (NLP) techniques and deep learning methods allowing us to scale the ***** automatic analysis ***** of millions of US federal court dockets. | ||
| W16-3708 The ***** automatic analysis ***** of emotions conveyed in social media content, e.g., tweets, has many beneficial applications. | ||
| 2020.semeval-1.153 Their multi-modal nature, caused by a mixture of text and image, makes them a very challenging research object for ***** automatic analysis *****. | ||
| L12-1200 We review recent advances in linguistics-based partial tagging and parsing, and regard the achieved analysis performance as sufficient for reconsidering a previously proposed method: combining nearly correct but partial ***** automatic analysis ***** with a minimal amount of human postediting (disambiguation) to achieve nearly correct corpus annotation accuracy at a competitive annotation speed. | ||
| techniques | 21 | |
| D18-1207 We propose two ***** techniques ***** to improve the level of abstraction of generated summaries. | ||
| L12-1283 This work is part of a project for MWE extraction and characterization using different ***** techniques ***** aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| L06-1015 That is, the retrieved documents from both systems are shown to the judges without any information about the search ***** techniques *****. | ||
| 2020.wanlp-1.32 In this paper, several ***** techniques ***** with multiple algorithms are applied for Arabic dialects identification starting from removing noise till classification task using all Arabic countries as 21 classes. | ||
| 2020.gebnlp-1.6 Furthermore, we analyze the effect of the debiasing ***** techniques ***** on downstream tasks which show a negligible impact on traditional embeddings and a 2% decrease in performance in contextualized embeddings. | ||
| complex word | 21 | |
| W18-0541 This paper investigates the use of character n-gram frequencies for identifying ***** complex word *****s in English, German and Spanish texts. | ||
| 2020.winlp-1.6 As a result, SIMPLEX-PB 2.0 features much more reliable and numerous candidate substitutions to ***** complex word *****s, as well as word complexity rankings produced by a group underprivileged children. | ||
| 2020.acl-main.424 replacing ***** complex word *****s or phrases by simpler synonyms), reorder components, and/or delete information deemed unnecessary. | ||
| R17-1081 Here, we evaluate the extent to which sentiment polarity of ***** complex word *****s can be predicted based on their morphological make-up. | ||
| R19-1135 The Quantum WSD algorithm requires concepts representations as vectors in the complex domain and thus we have developed a technique for computing ***** complex word ***** and sentence embeddings based on the Paragraph Vectors algorithm. | ||
| natural language parsing | 21 | |
| 1991.iwpt-1.22 This paper describes a ***** natural language parsing ***** algorithm for unrestricted text which uses a probability-based scoring function to select the “best” parse of a sentence. | ||
| W89-0230 When context-free parsers are used for ***** natural language parsing *****, pattern recognition, and so forth, there may be a great number of parses for a sentence. | ||
| 2003.mtsummit-systems.17 The conjunction of state-of-the-art ***** natural language parsing *****, multiword expression identification and large bilingual databases provides a powerful and effective tool for people who want to read on-line material in a foreign language which they are not completely fluent in. | ||
| 1997.iwpt-1.2 These figures are rather unusual for ordinary parsers or parser generators, because they are mostly used in the context of ***** natural language parsing *****, and thus do not have to face the same computation problems. | ||
| 1993.iwpt-1.7 Although his work with PARSIFAL pioneered the field of deterministic ***** natural language parsing *****, his method has several drawbacks: The rules and actions in the grammar / interpreter are so embedded that it is difficult to distinguish between them. | ||
| logical form | 21 | |
| P19-1010 Combined with a decoder copy mechanism, this approach provides a conceptually simple mechanism to generate ***** logical form *****s with entities. | ||
| K18-1035 Weakly-supervised semantic parsers are trained on utterance-denotation pairs, treating ***** logical form *****s as latent. | ||
| L12-1613 For each MWE its basic morpho***** logical form ***** and the base forms of its constituents are specified, and each MWE is also assigned to a class on the basis of its syntactic structure. | ||
| D19-1104 Moreover, we present a data generation strategy for constructing utterance-***** logical form ***** pairs from different domains. | ||
| 2020.acl-main.605 We propose variable-in-situ logico-semantic graphs to bridge the gap between semantic graph and *****logical form***** parsing. | ||
| lexical information | 21 | |
| W16-4307 We introduce any-gram kernels which model ***** lexical information ***** in a significantly faster way than the traditional n-gram features, while capturing all possible orders of n-grams n in a sequence without the need to explicitly present a pre-specified set of such orders. | ||
| L12-1521 The RELISH project promotes language-oriented research by addressing a two-pronged problem: (1) the lack of harmonization between digital standards for ***** lexical information ***** in Europe and America, and (2) the lack of interoperability among existing lexicons of endangered languages, in particular those created with the Shoebox/Toolbox lexicon building software. | ||
| L16-1498 Wiktionary is a large-scale resource for cross-lingual ***** lexical information ***** with great potential utility for machine translation (MT) and many other NLP tasks, especially automatic morphological analysis and generation. | ||
| 1997.iwpt-1.22 First, we have vastly improved our results; 92% accurate for supertag disambiguation using ***** lexical information *****, larger training corpus and smoothing techniques. | ||
| 2007.iwslt-1.28 Our focus was threefold: using hierarchical phrase-based models in spoken language translation, the incorporation of sub-***** lexical information ***** in model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and the use of n-gram sequence models for source-side punctuation prediction. | ||
| automatic annotation | 21 | |
| D19-3044 For languages with simple morphology such as English, ***** automatic annotation ***** pipelines such as spaCy or Stanford's CoreNLP successfully serve projects in academia and the industry. | ||
| 2016.lilt-13.3 Ultimately, this work aims at providing a method for the ***** automatic annotation ***** of data with boundedness information and at contributing to Machine Translation by taking into account linguistic data. | ||
| L12-1203 The program contains several functions which compute target points (or significant points) to model F0 contour, perform ***** automatic annotation ***** of different shapes and export all data in an xls file. | ||
| 2021.ranlp-1.81 In this paper, we introduce the Greek version of the ***** automatic annotation ***** tool ERRANT (Bryant et al., 2017), which we named ELERRANT. | ||
| L16-1576 In this paper, we present the ***** automatic annotation ***** of bibliographical references' zone in papers and articles of XML/TEI format. | ||
| visually grounded | 21 | |
| 2021.acl-srw.8 The impressive performances of pre-trained ***** visually grounded ***** language models have motivated a growing body of research investigating what has been learned during the pre-training. | ||
| 2021.eacl-tutorials.1 We will also discuss emerging research topics such as BERT-based approaches and ***** visually grounded ***** learning. | ||
| 2020.emnlp-main.60 In this paper, we propose learning representations from a set of implied, ***** visually grounded ***** expressions between image and text, automatically mined from those datasets. | ||
| W19-1808 Recent work on ***** visually grounded ***** language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation. | ||
| 2020.challengehml-1.7 To this end, understanding passenger intents from spoken interactions and vehicle vision systems is an important building block for developing contextual and ***** visually grounded ***** conversational agents for AV. | ||
| forms | 21 | |
| 2020.coling-main.278 Our proposed LaAP-Net outper***** forms ***** existing approaches on three benchmark datasets for the text VQA task by a noticeable margin. | ||
| P19-1513 Experiments show that our model outper***** forms ***** several baselines by a large margin. | ||
| S19-2217 Results show that the CNN model per***** forms ***** better than other models for both the subtasks. | ||
| 2020.emnlp-main.459 The experimental results on five datasets sampled from Freebase, NELL and Wikidata show that our method outper***** forms ***** state-of-the-art baselines. | ||
| 2021.ecnlp-1.18 Through various experiments, we show that this architecture outper***** forms ***** a typical slot detector approach, with a gain of +81% in accuracy and +41% in F1 score. | ||
| language independent | 21 | |
| W18-3609 The UniTO realizer is ***** language independent *****, and its simple architecture allowed it to be scored in the central part of the final ranking of the shared task. | ||
| L12-1164 It covers English open-class words, but the concept base is ***** language independent *****. | ||
| 2020.wat-1.8 In addition, we employed ***** language independent ***** adapter to further improve the system performances. | ||
| S17-2015 A sense-based ***** language independent ***** textual similarity approach is presented, in which a proposed alignment similarity method coupled with new usage of a semantic network (BabelNet) is used. | ||
| R17-1077 In this paper, we introduce a cross-lingual Semantic Role Labeling (SRL) system with *****language independent***** features based upon Universal Dependencies. | ||
| customer service | 21 | |
| 2020.emnlp-main.149 Take ***** customer service ***** and court debate dialogue as examples, compatible logics can be observed across different dialogue instances, and this information can provide vital evidence for utterance generation. | ||
| 2021.sigdial-1.48 Live chat in ***** customer service ***** platforms is critical for serving clients online. | ||
| Q19-1024 Neural end-to-end goal-oriented dialog systems showed promise to reduce the workload of human agents for ***** customer service *****, as well as reduce wait time for users. | ||
| 2021.naacl-main.239 To study ***** customer service ***** dialogue systems in more realistic settings, we introduce the Action-Based Conversations Dataset (ABCD), a fully-labeled dataset with over 10K human-to-human dialogues containing 55 distinct user intents requiring unique sequences of actions constrained by policies to achieve task success. | ||
| 2021.sigdial-1.54 We test our models on ***** customer service ***** dialogues and experimental results demonstrated that our models can reliably select informative sentences and words for automatic summarization. | ||
| space | 21 | |
| 2020.semeval-1.30 It consists of preparing a semantic vector ***** space ***** for each corpus, earlier and later; computing a linear transformation between earlier and later ***** space *****s, using Canonical Correlation Analysis and orthogonal transformation; and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus. | ||
| D18-1212 Through the joint exploitation of these constraints in an adversarial manner, the underlying cross-language semantics relevant to retrieval tasks are better preserved in the embedding ***** space *****. | ||
| L14-1031 The second method makes use of recent advances in distributional similarity representation to transfer existing norms to their closest neighbors in a high-dimensional vector ***** space *****. | ||
| 2020.vardial-1.6 However, these approaches require cross-lingual information such as seed dictionaries to train the model and find a linear transformation between the word embedding ***** space *****s. | ||
| P17-1029 In this paper we propose multi-***** space ***** variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. | ||
| components | 21 | |
| 2020.findings-emnlp.250 Word-embeddings are vital ***** components ***** of Natural Language Processing (NLP) models and have been extensively explored. | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual composition), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning ***** components ***** responsible for Levin's classification. | ||
| S17-2031 The first stage deals with constructing neural word embeddings, the ***** components ***** of sentence embeddings. | ||
| 2020.emnlp-main.52 Specifically, we devise two ***** components *****, prototype enhanced retrospection and hierarchical distillation, to mitigate the adverse effects of semantic ambiguity and class imbalance, respectively. | ||
| W16-5201 Kathaa exposes an intuitive web based Interface for the users to interact with and modify complex NLP Systems; and a precise Module definition API to allow easy integration of new state of the art NLP ***** components *****. | ||
| agreement | 21 | |
| L08-1428 We discuss evaluation results of the defined concepts for semantic role annotation concerning the redundancy and completeness of the tagset and the reliability of annotations in terms of inter-annotator ***** agreement *****. | ||
| N18-2036 We validate the integrity of the corpus with interannotator ***** agreement ***** analyses. | ||
| W18-5610 Manually created reference data show 0.76 inter-annotator ***** agreement *****. | ||
| L12-1081 end user licence ***** agreement *****s) required for the agency's operation. | ||
| L08-1551 We present the results of an *****agreement***** task carried out in the framework of the KNOW Project and consisting in manually annotating an agreement sample totaling 50 sentences extracted from the SenSem corpus. | ||
| morphological and syntactic | 21 | |
| L08-1593 We proceed to compare the two languages according to the diversity of available lexical items, ***** morphological and syntactic ***** properties, and then try to understand the translation of colour. | ||
| W16-5414 We propose an automated method that identifies the ***** morphological and syntactic ***** flexibility of Arabic Verbal Multiword Expressions (AVMWE). | ||
| 2021.bea-1.5 A broad range of quantifiable linguistic complexity features (lexical, ***** morphological and syntactic *****) are extracted and calculated. | ||
| L12-1184 With the CINTIL-International Corpus of Portuguese, an ongoing corpus annotated with fully fledged grammatical representation, sentences get not only a high level of lexical, ***** morphological and syntactic ***** annotation but also a semantic analysis that prepares the data to a manual specification step and thus opens the way for a number of tools and resources for which there is a great research focus at the present. | ||
| W17-1717 The system was meant to accommodate the variety of linguistic resources provided for each language, in terms of accompanying ***** morphological and syntactic ***** information. | ||
| action | 21 | |
| 2020.sltu-1.22 We also considered the inter***** action ***** of adjectives with other grammatical means, especially other parts of speech, e.g. | ||
| D18-1207 We propose two techniques to improve the level of abstr***** action ***** of generated summaries. | ||
| L12-1283 This work is part of a project for MWE extr***** action ***** and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| L14-1009 Our formalization is based on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for subjective information extr***** action *****. | ||
| L12-1249 This paper addresses the described challenge of phrase extr***** action ***** from documents in different domains and languages and proposes an approach, which does not use comprehensive lexica and therefore can be easily transferred to new domains and languages. | ||
| semantic and syntactic | 21 | |
| 2020.eval4nlp-1.12 The data set is expanded to contain ***** semantic and syntactic ***** tests and is multilingual (English, German, and Italian). | ||
| C18-1321 Apart from textual view cued by both the ***** semantic and syntactic ***** information, a complimentary view extracted from images contained in the web-snippets is also utilized in the current framework. | ||
| W18-6204 Word embeddings such as Word2Vec are efficient at incorporating ***** semantic and syntactic ***** properties of words, yielding good results for document classification. | ||
| 2021.naacl-main.108 However, contextual representations from pre-trained models contain entangled ***** semantic and syntactic ***** information, and therefore cannot be directly used to derive useful semantic sentence embeddings for some tasks. | ||
| C18-1216 Distributed representations of words play a major role in the field of natural language processing by encoding ***** semantic and syntactic ***** information of words. | ||
| syntactic dependency | 21 | |
| 2021.eacl-main.170 In this study, we design a directed ***** syntactic dependency ***** graph based on a dependency tree to establish a path from the target to candidate opinions. | ||
| 2020.acl-main.642 To simultaneously capture the relations between objects in an image and the ***** syntactic dependency ***** relations between words in a question, we propose a novel dual channel graph convolutional network (DC-GCN) for better combining visual and textual advantages. | ||
| W16-5406 Paratactic syntactic structures are difficult to represent in ***** syntactic dependency ***** tree structures. | ||
| 2020.acl-main.493 Motivated by these results, we present an unsupervised analysis method that provides evidence mBERT learns representations of ***** syntactic dependency ***** labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. | ||
| 2020.iwltp-1.13 It is meant both for demonstration of available services, from text-span annotations to ***** syntactic dependency ***** trees as well as playing or automatically synthesizing Romanian words, and for the development of new annotated corpora. | ||
| legal judgment prediction | 21 | |
| P19-1424 We release a new English ***** legal judgment prediction ***** dataset, containing cases from the European Court of Human Rights. | ||
| D18-1390 *****Legal Judgment Prediction***** (LJP) aims to predict the judgment result based on the facts of a case and becomes a promising application of artificial intelligence techniques in the legal field. | ||
| 2021.nllp-1.3 So far, *****Legal Judgment Prediction***** (LJP) datasets have been released in English, French, and Chinese. | ||
| 2021.ranlp-1.139 *****Legal judgment prediction***** (LJP) usually consists in a text classification task aimed at predicting the verdict on the basis of the fact description. | ||
| 2020.emnlp-main.540 Existing works have proved that using law articles as external knowledge can improve the performance of the *****Legal Judgment Prediction*****. | ||
| japanese morphological analysis | 21 | |
| R19-1093 We evaluate the method on a *****Japanese morphological analysis***** task. | ||
| L10-1111 To solve the unknown morpheme problem in *****Japanese morphological analysis*****, we previously proposed a novel framework of online unknown morpheme acquisition and its implementation. | ||
| I17-1094 We incorporated the acquired variant-normalization pairs into *****Japanese morphological analysis*****. | ||
| 2020.law-1.8 This paper also reports benchmark results on our corpus for *****Japanese morphological analysis*****, named entity recognition, and dependency parsing. | ||
| L08-1535 In this paper, we discuss lemma identification in *****Japanese morphological analysis*****, which is crucial for a proper formulation of morphological analysis that benefits not only NLP researchers but also corpus linguists. | ||
| rhetorical | 21 | |
| K18-1044 However, rarely do editorials change anyone's stance on an issue completely, nor do they tend to argue explicitly (but rather follow a subtle ***** rhetorical ***** strategy). | ||
| P19-2028 Fallacies like the personal attack—also known as the ad hominem attack—are introduced in debates as an easy win, even though they provide no ***** rhetorical ***** contribution. | ||
| N18-1009 This work examines the ***** rhetorical ***** techniques that speakers employ during political campaigns. | ||
| L08-1459 The motivation for the study is to gain a better understanding of the ***** rhetorical ***** properties of parentheticals in order to enable a natural language generation system to produce parentheticals as part of a ***** rhetorical *****ly well-formed output. | ||
| W19-8662 For this purpose, we turn input sentences into a two-layered semantic hierarchy in the form of core facts and accompanying contexts, while identifying the ***** rhetorical ***** relations that hold between them. | ||
| definition modeling | 21 | |
| 2020.mwe-1.9 Our model outperforms previous approaches in the generative task of *****Definition Modeling***** in many settings, but it also matches or surpasses the state of the art in discriminative tasks such as Word Sense Disambiguation and Word-in-Context. | ||
| P18-2043 We explore recently introduced *****definition modeling***** technique that provided the tool for evaluation of different distributed vector representations of words through modeling dictionary definitions of words. | ||
| 2020.emnlp-main.513 In this paper, we tackle the task of *****definition modeling*****, where the goal is to learn to generate definitions of words and phrases. | ||
| W19-6201 Building on the distributional hypothesis, we argue here that the most natural formalization of *****definition modeling***** is to treat it as a sequence-to-sequence task, rather than a word-to-sequence task: given an input sequence with a highlighted word, generate a contextually appropriate definition for it. | ||
| D19-1357 *****Definition modeling***** includes acquiring word embeddings from dictionary definitions and generating definitions of words. | ||
| Pre-trained language | 21 | |
| 2020.inlg-1.39 *****Pre-trained language***** models have recently contributed to significant advances in NLP tasks. | ||
| 2020.findings-emnlp.292 *****Pre-trained language***** models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. | ||
| 2020.findings-emnlp.338 *****Pre-trained language***** models that learn contextualized word representations from a large un-annotated corpus have become a standard component for many state-of-the-art NLP systems. | ||
| D19-1441 *****Pre-trained language***** models such as BERT have proven to be highly effective for natural language processing (NLP) tasks. | ||
| 2021.eacl-main.262 *****Pre-trained language***** models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. | ||
| interfaces | 20 | |
| 2021.acl-demo.33 It supports both statistical and neural translation models as well as provides quality estimation to inform users of reliability, two user feedback ***** interfaces ***** for experts and common users respectively, example inputs to collect human translations for monolingual data, word alignment visualization, and relevant terms from the Cherokee English dictionary. | ||
| 2020.lrec-1.370 The Oxford Online Database of Romance Verb Morphology provides this kind of information, however, it is not maintained anymore and is only available as a web service without ***** interfaces ***** for machine-readability. | ||
| L06-1194 To tackle these problems, Telefónica Móviles España has carried out several projects with the final aim to define a corporate methodology based on rapid prototyping of the user ***** interfaces *****, so that designers could integrate the process of design of voice ***** interfaces ***** with emulations of the navigation through the flow charts. | ||
| 2003.mtsummit-systems.13 The steps in each process are described, and screen images are provided to illustrate the system architecture and example tool ***** interfaces *****. | ||
| 2021.hcinlp-1.2 We conclude that it is imperative that those for whom the ***** interfaces ***** are designed have a voice in the design process. | ||
| summarization dataset | 20 | |
| 2021.eacl-main.265 Experimental results on a benchmark ***** summarization dataset ***** verify the effectiveness of our proposed method. | ||
| 2020.aacl-main.61 To deeply study this task, we present SportsSum, a Chinese sports game ***** summarization dataset ***** which contains 5,428 soccer games of live commentaries and the corresponding news articles. | ||
| 2020.acl-main.456 With no style-specific article-headline pair (only a standard headline ***** summarization dataset ***** and mono-style corpora), our method TitleStylist generates stylistic headlines by combining the summarization and reconstruction tasks into a multitasking framework. | ||
| N18-1065 We present NEWSROOM, a ***** summarization dataset ***** of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications. | ||
| D19-1388 Extensive experiments conducted on a large-scale real-world text ***** summarization dataset ***** show that PESG achieves the state-of-the-art performance in terms of both automatic metrics and human evaluations. | ||
| equations | 20 | |
| W19-8661 We propose a novel neural network model to generate math word problems from the given ***** equations ***** and topics. | ||
| P18-1039 Our experiments show using intermediate forms outperforms directly predicting ***** equations *****. | ||
| P19-1517 The key to solving the problem is to reveal the underlying mathematical relations (such as addition and subtraction) among quantities, and then generate ***** equations ***** to find solutions. | ||
| 1997.iwpt-1.2 Thanks to this new parser generator, I was able to show that most biocomputing models previously based on dynamic programming ***** equations ***** were unified by MTSAGs, and that they were better handled by automatically generated parsers than by handwritten programs. | ||
| L12-1349 Finally, based on the statistical distribution of LDD path types, we propose empirical bounds on traditional regular expression based functional uncertainty ***** equations ***** used to handle LDDs in LFG | ||
| insertion | 20 | |
| 2020.findings-emnlp.111 We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and ***** insertion ***** to in-fill the missing tokens in the output not present in the input. | ||
| L12-1229 This paper examines both linguistic behavior and practical implication of empty argument ***** insertion ***** in the Hindi PropBank. | ||
| 2012.amta-wptp.5 Our adaptive APE is able to insert within 3 words of the best location 73% of the time (32% in the exact location) in Arabic-English MT output, and 67% of the time in Chinese-English output (30% in the exact location), and delivers improved performance on automated adequacy metrics over a previous rule-based approach to ***** insertion *****. | ||
| 2017.iwslt-1.11 Experiments show that generalizing rare and unknown words greatly improves the punctuation ***** insertion ***** performance, reaching up to 8.8 points of improvement in F-score when applied to the out-of-domain test scenario. | ||
| 2020.deelio-1.4 Our experiments demonstrate that ***** insertion ***** of character-level synthetic noise and keyword replacement with hypernyms are effective augmentation methods, and that the quality of generations improves to a peak at approximately three times the amount of original data. | ||
| offensiveness | 20 | |
| 2020.semeval-1.253 The SemEval-2020 Task 12 (OffensEval) challenge focuses on detection of signs of ***** offensiveness ***** using posts or comments over social media. | ||
| 2021.semeval-1.174 The “HaHackathon: Detecting and Rating Humor and Offense” task at the SemEval 2021 competition focuses on detecting and rating the humor level in sentences, as well as the level of ***** offensiveness ***** contained in these texts with humoristic tones. | ||
| 2021.acl-long.210 Finally, we evaluate the ability of widely-used neural models to predict ***** offensiveness ***** scores on this new dataset. | ||
| R19-1125 Our paper includes two main contributions; first, using a neural network to measure the level of ***** offensiveness ***** in conversations; and second, the analysis of conversations around offensive comments using decoupling functions. | ||
| 2021.sigdial-1.1 Though users express dissatisfaction in correlation with these errors, certain dissatisfaction types (such as ***** offensiveness ***** and privacy objections) depend on additional factors – such as the user's personal attitudes, and prior unaddressed dissatisfaction in the conversation. | ||
| detector | 20 | |
| W16-5002 We present our machine learning based uncertainty ***** detector ***** which is based on a rich features set including lexical, morphological, syntactic, semantic and discourse-based features, and we evaluate our system on a small set of manually annotated social media texts. | ||
| P17-1053 Additionally, we propose a simple KBQA system that integrates entity linking and our proposed relation ***** detector ***** to make the two components enhance each other. | ||
| 2021.ecnlp-1.18 Through various experiments, we show that this architecture outperforms a typical slot ***** detector ***** approach, with a gain of +81% in accuracy and +41% in F1 score. | ||
| 2021.semeval-1.8 For the visual features, we have tested both grid features based on ResNet and salient region features from pretrained object ***** detector *****. | ||
| W19-8626 Moreover, we prove that our ***** detector ***** can recognize both machine-translated and machine-back-translated texts without the language information which is used to generate these machine texts. | ||
| linguist | 20 | |
| 2020.iwltp-1.13 It integrates multiple text and speech processing modules and exposes their functionality through a web interface designed for the ***** linguist ***** researcher. | ||
| L10-1318 Our goal was twofold: first, to develop an easy-to-use system that required a minimum of learning from the part of the ***** linguist *****; second, one that provided a straightforward way of checking the results obtained, in order to immediately evaluate the results of the rules devised. | ||
| 2010.amta-government.12 To fill this gap, the government has invested in the research, development and implementation of Human Language Technologies into the ***** linguist ***** workflow. | ||
| R19-1094 incom.py allows ***** linguist ***** experts to quickly and easily perform statistical analyses and compare those with experimental results. | ||
| 2004.amta-papers.15 In addition, we had a legacy MT software system, which neither the speakers nor the ***** linguist ***** was familiar with, although we had the opportunity to occasionally confer with experienced system users. | ||
| lexicography | 20 | |
| 2020.globalex-1.1 Therefore, the OntoLex community has put forward the proposal for a novel module for frequency, attestation and corpus information (FrAC), that not only covers the requirements of digital ***** lexicography *****, but also accommodates essential data structures for lexical information in natural language processing. | ||
| W16-4707 We investigate how both model-related factors and application-related factors affect the accuracy of distributional semantic models (DSMs) in the context of specialized ***** lexicography *****, and how these factors interact. | ||
| L10-1161 Morphological items represent a real challenge for ***** lexicography *****, especially for the development of multilingual tools. | ||
| 2020.lrec-1.395 Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic ***** lexicography *****. | ||
| 2020.winlp-1.13 Predicting the degree of compositionality of noun compounds is a crucial ingredient for ***** lexicography ***** and NLP applications, to know whether the compound should be treated as a whole, or through its constituents. | ||
| MRs | 20 | |
| P18-1070 Annotating NL utterances with their corresponding ***** MRs ***** is expensive and time-consuming, and thus the limited availability of labeled data often becomes the bottleneck of data-driven, supervised models. | ||
| P19-1447 Semantic parsing considers the task of transducing natural language (NL) utterances into machine executable meaning representations (***** MRs *****). | ||
| 2020.inlg-1.35 Here, the goal is still to generate a natural language prompt, but in SG-NLG, the input ***** MRs ***** are paired with rich schemata providing contextual information. | ||
| 2020.coling-main.267 Our work significantly increases the match in compositional structure between ***** MRs ***** and improves multi-task learning (MTL) in a low-resource setting, serving as a proof of concept for future broad-scale cross-MR normalization. | ||
| K19-2013 Via extensive analysis of implicit alignments in AMR, we recategorize five meaning representations (***** MRs *****) into two classes: Lexical-Anchoring and Phrasal-Anchoring. | ||
| actionable | 20 | |
| N18-6003 How to turn such massive and unstructured text data into structured, ***** actionable ***** knowledge, and furthermore, how to teach machines learn to reason and complete the extracted knowledge is a grand challenge to the research community. | ||
| W18-5415 Our results suggest the proposed algorithms are high performance and data efficient, able to glean ***** actionable ***** insights from fewer than 10,000 data points. | ||
| D19-1668 To train it, we wrote a non-trivial pipeline to convert PHI, the largest digital corpus of ancient Greek inscriptions, to machine ***** actionable ***** text, which we call PHI-ML. | ||
| L16-1117 In addition to the user utterances that contain an ***** actionable ***** item, annotations also include the arguments associated with the ***** actionable ***** item. | ||
| 2020.emnlp-main.747 Our work leads to ***** actionable ***** suggestions for evaluating and characterizing rationales. | ||
| predicative | 20 | |
| L10-1256 We also use a pattern knowledge base over the syntactic dependencies to extract flat ***** predicative ***** logical representations. | ||
| L14-1050 In many cases, it is assumed that the definition of a ***** predicative ***** term can be inferred by combining the definition of a related lexical unit with the information provided by the semantic relation (i.e. lexical function) that links them. | ||
| L06-1476 The lexicon-grammar based paradigm in computational linguistics is derived from predicate logic and attributes a central role to the ***** predicative ***** constructions. | ||
| 1963.earlymt-1.16 As an example, parts of the ***** predicative ***** blocking routine developed at Wayne State University will be presented as formulated with the aid of decision tables. | ||
| L16-1082 In this paper, we focus on Czech complex predicates formed by a light verb and a ***** predicative ***** noun expressed as the direct object. | ||
| convergence | 20 | |
| 2021.emnlp-main.283 Deep reinforcement learning (RL) methods often require many trials before ***** convergence *****, and no direct interpretability of trained policies is provided. | ||
| R19-1017 We describe extensive experiments using transfer learning and warm-starting techniques with improvements of more than 5% in relative percentage of success rate in the majority of cases, and up to 10x faster ***** convergence ***** as opposed to training the system without them. | ||
| 2018.iwslt-1.8 In both the scenarios our goal is to improve the translation performance, while minimizing the training ***** convergence ***** time. | ||
| P18-1063 Moreover, by first operating at the sentence-level and then the word-level, we enable parallel decoding of our neural generative model that results in substantially faster (10-20x) inference speed as well as 4x faster training ***** convergence ***** than previous long-paragraph encoder-decoder models. | ||
| N18-1034 Finally, we find that supervision leads to faster ***** convergence ***** as compared to an LDA baseline and that dDMR's model fit is less sensitive to training parameters than DMR | ||
| instantiation | 20 | |
| 2020.emnlp-main.296 We present an ***** instantiation ***** of this framework with a trained evidence estimator which relies on distant supervision from question answering (where various resources exist) to identify segments which are likely to answer the query and should be included in the summary. | ||
| L14-1272 This design comprises different processes such as data selection, formal definition and ***** instantiation ***** of an image. | ||
| W18-5210 This paper goes a step further by addressing the task of automatically identifying reasoning patterns of arguments using predefined templates, which is called argument template (AT) ***** instantiation *****. | ||
| 2020.emnlp-main.740 Furthermore, we introduce LABES-S2S, which is a copy-augmented Seq2Seq model ***** instantiation ***** of LABES. | ||
| 2020.wnut-1.73 We make a preliminary ***** instantiation ***** of this formal model for the text classification approaches | ||
| cue | 20 | |
| L10-1229 We provide information about the morphological type of the ***** cue *****, the characteristics of the scope in relation to the morpho-syntactic features of the ***** cue ***** and of the clause, and the ambiguity level of the ***** cue ***** by describing in which cases certain negation ***** cue *****s do not express negation. | ||
| N19-1340 In detail, we use sentences and their labels from train dataset as an associated memory ***** cue ***** to help label the target sentence. | ||
| 2020.wmt-1.63 Priming is a well known and studied psychology phenomenon based on the prior presentation of one stimulus (***** cue *****) to influence the processing of a response. | ||
| L16-1619 The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to its constitutive elements: source, ***** cue ***** and content | ||
| 2021.conll-1.41 In this paper, we revisit the task of negation resolution, which includes the subtasks of ***** cue ***** detection (e.g. | ||
| jointly | 20 | |
| 2021.emnlp-main.301 Recently, pre-trained sequence to sequence (seq2seq) models have proven to be very effective in ***** jointly ***** making predictions, as well as generating NL explanations. | ||
| Q16-1017 The two components are estimated ***** jointly ***** so as to minimize errors in recovering arguments. | ||
| 2021.acl-long.219 We transform the end-to-end disease recognition and normalization task as an action sequence prediction task, which not only ***** jointly ***** learns the model with shared representations of the input, but also ***** jointly ***** searches the output by state transitions in one search space. | ||
| 2021.emnlp-main.391 The topology of each graph models similarity relations among words, and is estimated ***** jointly ***** with the graph embedding. | ||
| 2020.repl4nlp-1.24 Then we provide a simple method that ensembles predictions from multiple replacements while ***** jointly ***** modeling the uncertainty of type annotations and label predictions | ||
| discursive | 20 | |
| W19-8631 Emphasis will be placed on adapting NLG methodologies to the political domain, which entails special attention to affect, ***** discursive ***** variety, and rhetorical strategies that align a speaker with their interlocutor, even in cases of policy disagreement. | ||
| 2018.jeptalnrecital-court.22 Our long-term goal is to automatically construct a test set of context-dependent sentences in order to evaluate machine translation models designed to improve the translation of contextual, ***** discursive ***** phenomena. | ||
| L08-1612 Besides, the scheme provides a typology of discourse markers based on their ***** discursive ***** functions including hypothesis, co-argumentation, cause, consequence, concession, generalization, topicalization, reformulation, enumeration, synthesis, etc. | ||
| W18-4303 Cross-document event chain co-referencing in corpora of news articles would achieve increased precision and generalizability from a method that consistently recognizes narrative, ***** discursive *****, and phenomenological features such as tense, mood, tone, canonicity and breach, person, hermeneutic composability, speed, and time | ||
| L14-1557 This paper investigates the ***** discursive ***** phenomenon called other-repetitions (OR), particularly in the context of spontaneous French dialogues. | ||
| multimodality | 20 | |
| W18-6901 We explain the process of generating contextually-relevant utterances, such as task-specific feedback messages, and discuss challenges regarding ***** multimodality ***** and multilingualism for situated natural language generation from a robot tutoring perspective. | ||
| 2021.bionlp-1.33 This paper aims to investigate whether using ***** multimodality ***** during training improves the summarizing performances of the model at test-time. | ||
| 2020.acl-main.114 We compare various ***** multimodality ***** integration and fusion strategies. | ||
| L08-1032 This paper presents the results of a joint effort of a group of ***** multimodality ***** researchers and tool developers to improve the interoperability between several tools used for the annotation of ***** multimodality *****. | ||
| 2021.mmsr-1.1 In this position paper we explain how the field uses outdated definitions of ***** multimodality ***** that prove unfit for the machine learning era | ||
| postprocessing | 20 | |
| 2012.iwslt-evaluation.7 A number of different techniques are evaluated in the MT and SLT tracks, including domain adaptation via data selection, translation model interpolation, phrase training for hierarchical and phrase-based systems, additional reordering model, word class language model, various Arabic and Chinese segmentation methods, ***** postprocessing ***** of speech recognition output with an SMT system, and system combination. | ||
| 2011.freeopmt-1.12 A lightweight manual ***** postprocessing ***** is carried out in order to fix inconsistencies in the automatically derived dictionaries and to add very frequent words that are missing according to a corpus analysis. | ||
| W19-5314 Our system is based on the self-attentional Transformer networks, into which we integrated the most recent effective strategies from academic research (e.g., BPE, back translation, multi-features data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and ***** postprocessing *****). | ||
| L10-1292 To address this issue, we have investigated the use of a grammar checker for two purposes in connection with SMT: as an evaluation tool and as a ***** postprocessing ***** tool. | ||
| 2014.iwslt-evaluation.10 While this ***** postprocessing ***** method does not allow us to achieve better results than a state-of-the-art system, this should be an interesting way to explore, for example by adding this topic space information at an early stage in the translation process | ||
| fluent | 20 | |
| W16-3713 However, manual creation of such a parallel corpus is time consuming, and requires experts ***** fluent ***** in both languages. | ||
| N18-1183 This paper explores the time course of lexical memory retrieval by modeling ***** fluent ***** language production. | ||
| 2020.iwslt-1.22 We specifically tackle the problem of disfluency removal in dis***** fluent *****-to-***** fluent ***** text-to-text translation assuming no access to ***** fluent ***** references during training. | ||
| 2021.eacl-main.299 Thus, a disfluency correction system that converts dis***** fluent ***** to ***** fluent ***** text is of great value. | ||
| D18-1423 It is a challenging task to automatically compose poems with not only ***** fluent ***** expressions but also aesthetic wording. | ||
| manifold | 20 | |
| 2020.acl-main.276 We propose a novel ***** manifold ***** based geometric approach for learning unsupervised alignment of word embeddings between the source and the target languages. | ||
| 2020.emnlp-main.97 However, in natural language, it is difficult to generate new examples that stay on the underlying data ***** manifold *****. | ||
| D19-1225 When trained on corpora of fonts, our model learns a ***** manifold ***** over font styles that can be used to analyze or reconstruct new, unseen fonts. | ||
| 2021.ccl-1.84 Compared with previous query expansion methods, our method combines multiple query expansion methods to better represent query information, and at the same time it makes a useful attempt on ***** manifold ***** ranking | ||
| 2020.starsem-1.11 The ***** manifold ***** hypothesis suggests that word vectors live on a submanifold within their ambient vector space. | ||
| cues | 20 | |
| L08-1492 In this paper, we apply these automatically extracted ***** cues ***** to a new annotated corpus, to determine the portability and generality of the ***** cues ***** we learn. | ||
| L16-1183 The corpus contains annotations for four facets of emotion: valence, arousal, emotion category and emotion ***** cues *****. | ||
| P17-2085 Fortunately, some mentions may have more ***** cues ***** for linking, which can be used as seed mentions to bridge other mentions and the uninformative entities. | ||
| 2021.emnlp-main.277 The BPM_MT simultaneously carries out two tasks at learning: 1) BC category prediction using acoustic and lexical features, and 2) sentiment score prediction based on sentiment ***** cues *****. | ||
| 2021.eacl-main.232 However, BERT alone does not capture the implicit knowledge of deception ***** cues *****: its contribution is conditional on the concurrent use of attention to learn ***** cues ***** from BERT's representations | ||
| batch | 20 | |
| 2020.acl-main.323 While much previous research (Sutskever et al., 2013; Duchi et al., 2011; Kingma and Ba, 2015) focuses on accelerating convergence and reducing the effects of the learning rate, comparatively few papers concentrate on the effect of ***** batch ***** size. | ||
| 2021.repl4nlp-1.31 This, however, still conditions each example's loss on all ***** batch ***** examples and requires fitting the entire large ***** batch ***** into GPU memory. | ||
| 2020.acl-main.504 While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve ***** batch ***** throughput during inference. | ||
| 2020.ngt-1.24 The neural machine translation decoding also benefits from FP16 inference, attention caching, dynamic ***** batch *****ing, and ***** batch ***** pruning. | ||
| 2020.acl-main.86 We also demonstrate that allowing instances of different tasks to be interleaved as much as possible between each epoch and ***** batch ***** has a clear benefit in multitask performance over forcing task homogeneity at the epoch or ***** batch ***** level | ||
| erroneous | 20 | |
| 2020.inlg-1.45 Our results show that different kinds of errors elicit significantly different evaluation scores, even though all ***** erroneous ***** descriptions differ in only one character from the reference descriptions. | ||
| W17-0807 Intrinsic evaluation of the created scheme confirmed its potential contribution to the consistent classification of identified ***** erroneous ***** text spans, achieving visibly higher Cohen's kappa values, up to 0.831, than previous work. | ||
| W17-1908 In order to avoid ***** erroneous ***** (spam) crowdsourced results, we used a novel task-specific two-phase filtering process where users were asked to identify synonyms in the target language, and remove ***** erroneous ***** senses. | ||
| 2021.naacl-industry.22 Although deep neural networks have been widely employed and proven effective in sentiment analysis tasks, it remains challenging for model developers to assess their models for ***** erroneous ***** predictions that might exist prior to deployment. | ||
| 2020.emnlp-main.207 Most publicly available parallel corpora for Bengali are not large enough; and have rather poor quality, mostly because of incorrect sentence alignments resulting from ***** erroneous ***** sentence segmentation, and also because of a high volume of noise present in them | ||
| phonotactic | 20 | |
| W19-4224 We also show that the model's learned representations map onto existing measures of words' phonological structure (phonological neighborhood density and ***** phonotactic ***** probability). | ||
| Q14-1008 We find that the improvements range from 10 to 4%, depending on both the use of ***** phonotactic ***** cues and, to a lesser extent, the amount of evidence available to the learner. | ||
| 2021.sigmorphon-1.19 We introduce a simple and highly general ***** phonotactic ***** learner which induces a probabilistic finite-state automaton from word-form data. | ||
| L16-1742 Polish rhythmic database and tools developed with the aim of investigating timing phenomena and rhythmic structure in Polish including topics such as, inter alia, the effect of speaking style and tempo on timing patterns, ***** phonotactic ***** and phrasal properties of speech rhythm and stability of rhythm metrics. | ||
| 2021.sigmorphon-1.4 This paper investigates how the ordering of tone relative to the segmental string influences the calculation of ***** phonotactic ***** probability | ||
| Dimensional Sentiment | 20 | |
| I17-4019 This paper introduces Mainiway AI Labs submitted system for the IJCNLP 2017 shared task on ***** Dimensional Sentiment ***** Analysis of Chinese Phrases (DSAP), and related experiments. | ||
| I17-4016 This paper introduces Team Alibaba's systems participating IJCNLP 2017 shared task No. 2 ***** Dimensional Sentiment ***** Analysis for Chinese Phrases (DSAP). | ||
| I17-4017 In this paper, we propose our model for IJCNLP 2017 ***** Dimensional Sentiment ***** Analysis for Chinese Phrases shared task | ||
| 2021.rocling-1.46 This technical report aims at the ROCLING 2021 Shared Task: ***** Dimensional Sentiment ***** Analysis for Educational Texts. | ||
| I17-4014 CKIP takes part in solving the ***** Dimensional Sentiment ***** Analysis for Chinese Phrases (DSAP) shared task of IJCNLP 2017. | ||
| RumourEval | 20 | |
| 2020.nlp4if-1.3 We propose a novel architecture for integrating features with pre-trained models that address these challenges and test our method on the ***** RumourEval ***** 2019 dataset. | ||
| S17-2082 Final submission for NileTMRG on ***** RumourEval ***** 2017. | ||
| S19-2147 As in ***** RumourEval ***** 2017 we provided a dataset of dubious posts and ensuing conversations in social media, annotated both for stance and veracity. | ||
| 2020.acl-main.97 Experiments on two public datasets, ***** RumourEval ***** and PHEME, demonstrate that DTCA not only provides explanations for the results of claim verification but also achieves the state-of-the-art performance, boosting the F1-score by more than 3.11% and 2.41%, respectively. | ||
| S19-2194 The paper presents Columbia team's participation in the SemEval 2019 Shared Task 7: ***** RumourEval ***** 2019 | ||
| derived | 20 | |
| L12-1537 In our framework, contextual occurrence information of much fewer canonical expressions are expanded into the whole forms of ***** derived ***** expressions, to be utilized when identifying those ***** derived ***** expressions. | ||
| W19-6105 We introduce an English evaluation set which is larger, more varied, and more realistic than seen to date, with terms ***** derived ***** from a historical thesaurus. | ||
| 2021.emnlp-main.765 However, high-dimensional vectors can encode complex linguistic information which leads to the problem that the ***** derived ***** clusters cannot explicitly align with the relational semantic classes. | ||
| C16-1208 Though various approaches have been proposed to study the opinion formation problem, they all formulate opinions as the ***** derived ***** sentiment values either discrete or continuous without considering the semantic information. | ||
| C16-1122 We use linear regression models to analyze CDSM performance and obtain insights into the linguistic factors that influence how predictable the distributional context of a ***** derived ***** word is going to be | ||
| quantified | 20 | |
| W19-8667 We discuss what this exercise can teach us about the nature of quantification and about the challenges posed by the generation of ***** quantified ***** expressions. | ||
| D19-1503 The contributions of different components of the model are ***** quantified ***** using ablation analysis. | ||
| L08-1168 Their performance is generally ***** quantified ***** through a comparison with the judgements of the first type of approach. | ||
| W19-8602 We model the production of ***** quantified ***** referring expressions (QREs) that identify collections of visual items. | ||
| 2014.lilt-9.8 The relational syllogistic is an extension of the language of Classical syllogisms in which predicates are allowed to feature transitive verbs with ***** quantified ***** objects. | ||
| situational | 20 | |
| 2021.eacl-main.146 When responding to a disaster, humanitarian experts must rapidly process large amounts of secondary data sources to derive ***** situational ***** awareness and guide decision-making. | ||
| W19-6130 We consider cross- and multilingual text classification approaches to the identification of online registers (genres), i.e. text varieties with specific ***** situational ***** characteristics. | ||
| 2021.eacl-main.317 However, they fail to consider the importance of pragmatic aspects and the need to consistently update new social ***** situational ***** information without forgetting the accumulated experiences. | ||
| S18-1104 Subtask-A involves classification of tweets into ironic and non-ironic instances whereas Subtask-B involves classification of the tweet into - non-ironic, verbal irony, ***** situational ***** irony or other verbal irony. | ||
| W19-8603 The results of the study suggest that Bayesian approaches must integrate individual generation preference and the cooperativeness of the ***** situational ***** task in order to model the broad variance between speakers more adequately | ||
| layer | 20 | |
| D19-5015 Our best performing system uses pre-trained ELMo word embeddings, followed by a bidirectional LSTM and an attention ***** layer *****. | ||
| 2021.naacl-main.455 Specifically, SGG is a hierarchical neural network which consists of a pointing-based selector at low ***** layer ***** concentrated on present keyphrase generation, a selection-guided generator at high ***** layer ***** dedicated to absent keyphrase generation, and a guider in the middle to transfer information from selector to generator. | ||
| 2020.acl-main.525 Its hidden state at ***** layer ***** l represents an l-gram in the input text, which is labeled only if its corresponding text region represents a complete entity mention. | ||
| S19-2055 Several ***** layer *****s, namely an embedding ***** layer *****, an encoding-decoding ***** layer *****, a softmax ***** layer ***** and a loss ***** layer *****, are used to map the sequences from textual conversations to the emotions namely Angry, Happy, Sad and Others. | ||
| 2020.challengehml-1.1 In addition to use the Transformer architecture, our approach relies on a modular co-attention and a glimpse ***** layer ***** to jointly encode one or more modalities | ||
| fuzzy | 20 | |
| 2010.jec-1.4 We show that for ***** fuzzy ***** matches of over 70%, one method outperforms both SMT and TM baselines. | ||
| 2020.repl4nlp-1.4 Indeed, this representation enables the use of ***** fuzzy ***** set-theoretic operations, such as union, intersection and difference. | ||
| W19-4217 First, we adapt a graph-based approach to characterize the clusters (***** fuzzy ***** types) of tone contour shapes observed in each tone n-gram category. | ||
| 2014.amta-researchers.4 When a computer-assisted translation (CAT) tool does not find an exact match for the source segment to translate in its translation memory (TM), translators must use ***** fuzzy ***** matches that come from translation units in the translation memory that do not completely match the source segment. | ||
| 2020.acl-main.144 This paper explores data augmentation methods for training Neural Machine Translation to make use of similar translations, in a way comparable to how a human translator employs ***** fuzzy ***** matches. | ||
| multi | 20 | |
| L08-1144 We present work on a three-stage system to detect and classify disfluencies in ***** multi ***** party dialogues. | ||
| L10-1119 In this paper we want to point out some issues arising when a natural language processing task involves several languages (like ***** multi *****-lingual, ***** multi *****-document summarization and the machine translation aspects involved) which are often neglected. | ||
| D19-1199 In real ***** multi *****-party conversations, we can observe who is speaking, but the addressee information is not always explicit. | ||
| 2021.wanlp-1.5 Using annotated data from the same event, a BERT model is fine-tuned to classify tweets into different categories in the ***** multi *****-label setting. | ||
| 2021.emnlp-main.601 Introducing such ***** multi ***** label examples at the cost of annotating fewer examples brings clear gains on natural language inference task and entity typing task, even when we simply first train with a single label data and then fine tune with ***** multi ***** label examples | ||
| multilingual NLP | 20 | |
| 2020.coling-main.423 We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support ***** multilingual NLP *****, with a focus on verb semantics. | ||
| D18-1027 Cross-lingual word embeddings are becoming increasingly important in ***** multilingual NLP *****. | ||
| C16-1123 In recent years linguistic typologies, which classify the world's languages according to their functional and structural properties, have been widely used to support ***** multilingual NLP *****. | ||
| 2020.udw-1.16 The UD framework defines guidelines for a crosslingual syntactic analysis in the framework of dependency grammar, with the aim of providing a consistent treatment across languages that not only supports ***** multilingual NLP ***** applications but also facilitates typological studies. | ||
| 2021.acl-demo.8 Researching typological properties of languages is fundamental for progress in ***** multilingual NLP ***** | ||
| unlabeled corpora | 20 | |
| E17-1010 Nonetheless, recent approaches exploiting neural networks on ***** unlabeled corpora ***** achieve promising results, surpassing this hard baseline in most test sets. | ||
| 2020.findings-emnlp.168 Moreover, most of previous summarization models ignore abundant ***** unlabeled corpora ***** resources available for pretraining. | ||
| 2018.gwc-1.30 It only requires large ***** unlabeled corpora ***** and a sense inventory such as WordNet, and therefore does not rely on annotated data. | ||
| 2021.emnlp-main.502 This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale ***** unlabeled corpora ***** | ||
| W18-5040 The research described in this paper examines how to learn linguistic knowledge associated with discourse relations from ***** unlabeled corpora *****. | ||
| pronominal anaphora | 20 | |
| P19-1386 We present a corpus of over 8,000 annotated text passages with ambiguous ***** pronominal anaphora *****. | ||
| W18-4105 The Winograd Schema Challenge targets ***** pronominal anaphora ***** resolution problems which require the application of cognitive inference in combination with world knowledge. | ||
| L08-1363 This first edition focused only on English ***** pronominal anaphora ***** and NP coreference, and was organised as an exploratory exercise where various issues were investigated. | ||
| 2010.iwslt-papers.10 As a case in point, we study the problem of ***** pronominal anaphora ***** translation by manually evaluating German-English SMT output. | ||
| D19-1294 The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as ***** pronominal anaphora *****, thus enabling better translations | ||
| supervision | 20 | |
| W17-5031 We present a very simple model for text quality assessment based on a deep convolutional neural network, where the only ***** supervision ***** required is one corpus of user-generated text of varying quality, and one contrasting text corpus of consistently high quality. | ||
| D19-1324 For learning, we construct oracle extractive-compressive summaries, then learn both of our components jointly with this ***** supervision *****. | ||
| 2021.naacl-main.201 This includes mechanisms to create additional labeled data like data augmentation and distant ***** supervision ***** as well as transfer learning settings that reduce the need for target ***** supervision *****. | ||
| N19-1351 We use the resulting data as ***** supervision ***** for learning transferable sentence embeddings. | ||
| 2021.emnlp-main.505 Motivated by the failure of a Transformer model on the SCAN compositionality challenge (Lake and Baroni, 2018), which requires parsing a command into actions, we propose two auxiliary sequence prediction tasks as additional training ***** supervision ***** | ||
| relevant | 20 | |
| W18-3511 Classifiers are adopted to learn the sense ***** relevant ***** features of the words in the resource and also to automate the tagging of sense-types for verbs. | ||
| 2020.sigdial-1.10 To help capture these behaviors, we define a hybrid relational model in which ***** relevant ***** discourse behaviors are formulated as discrete latent variables and scored using neural networks. | ||
| 2021.acl-tutorials.3 It is believed that meta-learning has great potential to be applied in NLP, and some works have been proposed with notable achievements in several ***** relevant ***** problems, e.g., relation extraction, machine translation, and dialogue generation and state tracking. | ||
| L12-1449 A timeline is a graphical representation of a period of time, on which ***** relevant ***** events are marked. | ||
| C18-2022 LanguageNet directly shows the definition of a sense, bilingual synonyms and sense ***** relevant ***** examples | ||
| submissions | 20 | |
| S19-2118 This paper describes our system ***** submissions ***** as part of our participation (team name: JU_ETCE_17_21) in the SemEval 2019 | ||
| S17-2036 This paper shows the details of our system ***** submissions ***** in the task 2 of SemEval 2017. | ||
| 2021.winlp-1.4 We performed a computational linguistic analysis (part-of-speech analysis, emoji detection, sentiment analysis) on ***** submissions ***** around the time of the cancer diagnosis and around the time of remission. | ||
| D19-5633 In this paper, we report our system ***** submissions ***** to all 6 tracks of the WNGT 2019 shared task on Document-Level Generation and Translation. | ||
| 2013.iwslt-evaluation.1 In addition, ***** submissions ***** of one of the official machine translation tracks were also evaluated with human post-editing | ||
| navigation | 20 | |
| W17-5542 We demonstrate an information ***** navigation ***** system for sightseeing domains that has a dialogue interface for discovering user interests for tourist activities. | ||
| 2020.emnlp-main.271 Meanwhile, due to the lack of intermediate supervision, the agent's performance at following each part of the instruction cannot be assessed during ***** navigation *****. | ||
| 2020.findings-emnlp.157 We use the progress agents make towards the goal as a reinforcement learning reward signal to directly inform not only ***** navigation ***** actions, but also both question and answer generation. | ||
| 2021.emnlp-main.328 In this work, we propose a simple and effective language-aligned supervision scheme, and a new metric that measures the number of sub-instructions the agent has completed during ***** navigation *****. | ||
| 2020.sdp-1.23 The system provides search at multiple levels of textual granularity, from sentences to aggregations across documents, both in natural language and through ***** navigation ***** in a domain specific Knowledge Graph | ||
| semantic relationships | 20 | |
| P19-1423 Inter-sentence relation extraction deals with a number of complex ***** semantic relationships ***** in documents, which require local, non-local, syntactic and semantic dependencies. | ||
| S17-1026 We address the pattern selection task by exploiting the knowledge represented by entailment graphs, which capture ***** semantic relationships ***** holding among the learned pattern candidates. | ||
| W19-5051 Textual inference is the task of finding the ***** semantic relationships ***** between pairs of text. | ||
| L10-1341 The edges of the graph have been partially annotated by hand with ***** semantic relationships *****. | ||
| 2020.cogalex-1.8 This paper presents a bidirectional transformer based approach for recognising ***** semantic relationships ***** between a pair of words as proposed by CogALex VI shared task in 2020. | ||
| annotation schemes | 20 | |
| L14-1337 Interoperability of ***** annotation schemes ***** is one of the key words in the discussions about annotation of corpora. | ||
| 2020.findings-emnlp.347 Current automated methods to estimate turn and dialogue level user satisfaction employ hand-crafted features and rely on complex ***** annotation schemes *****, which reduce the generalizability of the trained models. | ||
| 2020.lrec-1.16 Text-processing algorithms that annotate main components of a story-line are presently in great need of corpora and well-agreed ***** annotation schemes *****. | ||
| L16-1136 The aim of the developed corpus is twofold: i) to assess the reliability of the different sense ***** annotation schemes ***** for Danish measured by qualitative analyses and annotation agreement scores, and ii) to serve as training and test data for machine learning algorithms with the practical purpose of developing sense taggers for Danish. | ||
| W19-3301 Developers of cross-linguistic semantic ***** annotation schemes ***** face a number of issues not encountered in monolingual annotation. | ||
| literary texts | 20 | |
| 2021.naacl-srw.5 In this thesis proposal, we explore the application of event extraction to ***** literary texts *****. | ||
| D19-3035 We demonstrate a stylometry toolkit for analysis of Latin ***** literary texts *****, which is freely available at www.qcrit.org/stylometry. | ||
| W17-2208 This paper presents an approach to extract co-occurrence networks from ***** literary texts *****. | ||
| L16-1168 We propose a scheme for annotating direct speech in ***** literary texts *****, based on the Text Encoding Initiative (TEI) and the coreference annotation guidelines from the Message Understanding Conference (MUC). | ||
| 2021.ranlp-1.84 We explored differences between ***** literary texts ***** originally authored in Russian and fiction translated into Russian from 11 languages. | ||
| textual information | 20 | |
| N19-1037 Moreover, we promote the framework to two variants, Hi-GRU with individual features fusion (HiGRU-f) and HiGRU with self-attention and features fusion (HiGRU-sf), so that the word/utterance-level individual inputs and the long-range con***** textual information ***** can be sufficiently utilized. | ||
| 2020.coling-main.344 Both models consist of two parts: an encoder enhanced by deep neural networks (DNN) that can utilize the con***** textual information ***** to encode the input into latent variables, and a decoder which is a generative model able to reconstruct the input. | ||
| 2020.latechclfl-1.5 We propose a simple concatenation approach that improves the quality of automatically generated title translations for artworks, by leveraging ***** textual information ***** extracted from Iconclass. | ||
| D18-1127 Con***** textual information ***** has been shown effective on the task. | ||
| 2020.lrec-1.862 Using semantic and con***** textual information *****, non-speakers of a language familiar with the Latin script can produce high quality named entity annotations to support construction of a name tagger. | ||
| extractive text summarization | 20 | |
| 2021.emnlp-main.11 Based on Multi-GCN, we propose a Multiplex Graph Summarization (Multi-GraS) model for ***** extractive text summarization *****. | ||
| D19-1300 In this work, we re-examine the problem of ***** extractive text summarization ***** for long documents. | ||
| W19-1906 We report on the use of this pipeline in a disease-specific ***** extractive text summarization ***** task on clinical notes, focusing primarily on progress notes by physicians and nurse practitioners. | ||
| 2020.emnlp-main.295 Sentence-level ***** extractive text summarization ***** is substantially a node classification task of network mining, adhering to the informative components and concise representations. | ||
| D18-1438 Experiments show our model outperforms the neural abstractive and ***** extractive text summarization ***** methods that do not consider images. | ||
| semantic composition | 20 | |
| L08-1355 Indeed, the majority of Vietnamese words is built by ***** semantic composition ***** from about 7,000 syllables, which also have a meaning as isolated words. | ||
| I17-1035 Second, we show that a bag-of-words embedding model posts state-of-the-art on a dataset of arguments annotated for convincingness, outperforming an SVM with numerous hand-crafted features as well as recurrent neural network models that attempt to capture ***** semantic composition *****. | ||
| C16-1119 Our model has several distinctive features: (1) Each sentence is divided into three context subsequences according to two annotated nominals, which allows the model to encode each context subsequence independently so as to selectively focus on the important context information; (2) The hierarchical model consists of two recurrent neural networks (RNNs): the first one learns context representations of the three context subsequences respectively, and the second one computes ***** semantic composition ***** of these three representations and produces a sentence representation for the relationship classification of the two nominals. | ||
| D17-1122 It should be emphasized that each word in a sentence has a different importance from the perspective of ***** semantic composition *****, so we exploit two novel and efficient strategies to explicitly calculate a weight for each word. | ||
| 2020.emnlp-main.651 By formulating DST as a semantic parsing task over hierarchical representations, we can incorporate ***** semantic composition *****ality, cross-domain knowledge sharing and co-reference. | ||
| measures | 20 | |
| 2005.mtsummit-papers.29 Example-based machine translation (EBMT) systems, so far, rely on heuristic ***** measures ***** in retrieving translation examples. | ||
| L12-1235 The recommendations concern a variety of topics including the organisation of an infrastructure project as a function of the types of tasks that have to be carried out, involvement of the targeted users, metadata, semantic interoperability and the role of registries, ***** measures ***** to maximally ensure sustainability, and cooperation with similar projects in other countries. | ||
| N18-1063 Current ***** measures ***** for evaluating text simplification systems focus on evaluating lexical text aspects, neglecting its structural aspects. | ||
| P17-1155 We show that while the scores of n-gram based automatic ***** measures ***** are similar for all interpretation models, SIGN's interpretations are scored higher by humans for adequacy and sentiment polarity. | ||
| Q13-1035 One is a macro-level analysis that ***** measures ***** how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. | ||
| mention detection | 20 | |
| K19-1063 A typical architecture for end-to-end entity linking systems consists of three steps: ***** mention detection *****, candidate generation and entity disambiguation. | ||
| P17-1114 In this paper, we study a novel approach for named entity recognition (NER) and ***** mention detection ***** (MD) in natural language processing. | ||
| 2021.crac-1.5 We introduce a modular, hybrid coreference resolution system that extends a rule-based baseline with three neural classifiers for the subtasks ***** mention detection *****, mention attributes (gender, animacy, number), and pronoun resolution. | ||
| P19-1525 To address these concerns, we present a model which directly models all possible spans and performs joint entity ***** mention detection ***** and relation extraction. | ||
| W17-1503 This paper presents results of an experiment integrating information from a valency dictionary of Polish into a *****mention detection***** system. | ||
| multilingual dependency parsing | 20 | |
| N19-1393 Our work investigates the use of high-level language descriptions in the form of typological features for ***** multilingual dependency parsing *****. | ||
| K17-3015 For this year's ***** multilingual dependency parsing ***** shared task, we developed a pipeline system, which uses a variety of features for each of its components. | ||
| 2020.emnlp-main.240 We show that this proposed approach obtains surprisingly good performance in tasks such as bilingual lexicon induction, cross-lingual word similarity, multilingual document classification, and ***** multilingual dependency parsing *****. | ||
| N19-1188 Our fully unsupervised multilingual embedding spaces yield results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, and simultaneously obtain state-of-the-art scores on two downstream tasks: multilingual document classification and ***** multilingual dependency parsing *****, outperforming even supervised baselines. | ||
| 2021.iwpt-1.21 This paper presents our ***** multilingual dependency parsing ***** system as used in the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies. | ||
| keyword extraction | 20 | |
| L06-1498 In this paper, we enhance the conventional ***** keyword extraction ***** systems by attaching the keyword recovery function. | ||
| 2021.emnlp-main.638 In particular, term weighting is the basis for ***** keyword extraction *****. | ||
| 2020.findings-emnlp.304 ***** keyword extraction ***** and (c). | ||
| L06-1171 The paper presents a tool for ***** keyword extraction ***** from multilingual resources developed within the AXMEDIS project. | ||
| L14-1028 The dataset is our first step toward developing automatic methods for summarization and ***** keyword extraction ***** from emails. | ||
| discourse representation | 20 | |
| U19-1010 In this paper, we propose to use neural ***** discourse representation *****s obtained from a rhetorical structure theory (RST) parser to enhance document representations. | ||
| 2020.lrec-1.717 To exhibit the applicability of our representation scheme, we annotate text taken from diverse datasets and show how we extend the capabilities of existing spatial representation languages with fine-grained decomposition of semantics and blend it seamlessly with AMRs of sentences and ***** discourse representation *****s as a whole. | ||
| W19-2715 This article presents a generic approach and a system, ToNy, a discourse segmenter developed for the DisRPT shared task where multiple ***** discourse representation ***** schemes, languages and domains are represented. | ||
| 2020.wnut-1.21 We find that deletion errors affect detection performance the most, due to their impact on the features of syntactic complexity and ***** discourse representation ***** in speech. | ||
| 2020.lt4hala-1.15 Fictional prose can be broadly divided into narrative and discursive forms with direct speech being central to any ***** discourse representation ***** (alongside indirect reported speech and free indirect discourse). | ||
| classes | 20 | |
| 2020.wanlp-1.32 In this paper, several techniques with multiple algorithms are applied for Arabic dialects identification starting from removing noise till classification task using all Arabic countries as 21 ***** classes *****. | ||
| L08-1019 Computational models can help us to shed new light on the real structure of event type ***** classes ***** as well as to gain a better understanding of context-driven semantic shifts. | ||
| 2021.adaptnlp-1.5 More broadly, our method can be used for textual domain adaptation where the latent ***** classes ***** are unknown but overlap with known ***** classes ***** from other domains. | ||
| L10-1434 Our second interest lies in the actual comparison of the models: How does a very simple distributional model compare to much more complex approaches, and which representation of selectional preferences is more appropriate, using (i) second-order properties, (ii) an implicit generalisation of nouns (by clusters), or (iii) an explicit generalisation of nouns by WordNet ***** classes ***** within clusters? | ||
| L14-1491 Both regression - in order to predict the exact heart rate value - and a binary classification setting for high and low heart rate ***** classes ***** are investigated. | ||
| controlled language | 20 | |
| L14-1509 This paper presents an overview of the findings from an exploratory study carried out to investigate if the appropriateness level of text alternatives for images in French can be improved when applying ***** controlled language ***** (CL) rules. | ||
| L16-1029 As a result, requirements can contain a relatively large diversity of lexical and grammatical errors, which are not eliminated by the use of guidelines from ***** controlled language *****s. | ||
| L10-1073 We propose a solution which anchors on using ***** controlled language *****s as interfaces to semantic web applications. | ||
| 2003.mtsummit-papers.18 In the field of ***** controlled language ***** applications, it is more usual to constrain the source language in this way rather than the target. | ||
| 1999.mtsummit-1.7 The present paper deals with several recurrent issues in the design and implementation of *****controlled language***** checkers. | ||
| ground truth | 20 | |
| 2020.acl-main.591 In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ***** ground truth ***** parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. | ||
| Q19-1018 We present a new cubic-time algorithm to calculate the optimal next step in shift-reduce dependency parsing, relative to ***** ground truth *****, commonly referred to as dynamic oracle. | ||
| 2020.findings-emnlp.348 To support this task, we implement a 3D dynamic environment simulator and collect a dataset with human-written navigation and assembling instructions, and the corresponding ***** ground truth ***** trajectories. | ||
| 2021.naacl-main.300 These evaluations show that for some specialized collections, standard coherence measures may not inform the most appropriate topic model or the optimal number of topics, and current interpretability performance validation methods are challenged as a means to confirm model quality in the absence of ***** ground truth ***** data. | ||
| L08-1370 This paper describes the development of a *****ground truth***** dataset of culturally diverse Romanized names in which approximately 70,000 names are matched against a subset of 700. | ||
| textual inference | 20 | |
| D19-6105 Compared to a state-of-the-art MTL approach to ***** textual inference *****, the simple techniques we use yield similar performance on a universe of task combinations while reducing training time and model size. | ||
| W19-5039 We hope that this shared task will attract further research efforts in ***** textual inference *****, question entailment, and question answering in the medical domain. | ||
| W19-5051 We report on our system for ***** textual inference ***** and question entailment in the medical domain for the ACL BioNLP 2019 Shared Task, MEDIQA. | ||
| L14-1119 However, while attempts have been made to define what ***** textual inference *****s are, few seek to classify inference phenomena by difficulty. | ||
| D19-1631 Recently, biomedical versions of embeddings obtained from language models such as BioELMo have shown state-of-the-art results for the *****textual inference***** task in the medical domain. | ||
| sentiment lexicon | 20 | |
| P17-1154 Though a variety of neural network models have been proposed recently, however, previous models either depend on expensive phrase-level annotation, most of which has remarkably degraded performance when trained with only sentence-level annotation; or do not fully employ linguistic resources (e.g., ***** sentiment lexicon *****s, negation words, intensity words). | ||
| D17-1052 Recently there were some attempts to employ representation learning algorithms to construct a ***** sentiment lexicon ***** with sentiment-aware word embedding. | ||
| S18-1051 We extract 3 kind of features from each of the tweets - one denoting the sentiment and emotion metrics obtained from different ***** sentiment lexicon *****s, one denoting the semantic representation of the word using dense representations like Glove, Word2vec and finally the syntactic information through POS N-grams, Word clusters, etc. | ||
| L12-1073 Our results show that sentiment analysis based on a simple keyword matching against a ***** sentiment lexicon ***** or a supervised classifier trained with distant supervision does not correlate well with the actual election results. | ||
| L16-1006 Finally, we describe a qualitative study of the automatic translations of English ***** sentiment lexicon *****s into Arabic, which shows that about 88% of the automatically translated entries are valid for English as well. | ||
| speech data | 20 | |
| L14-1645 Along with the methodology for coping with this diversity in the ***** speech data *****, we also describe a set of experiments performed in order to investigate the efficiency of different approaches for automatic data pruning. | ||
| 2020.lrec-1.511 The aim of the analysis is to select ***** speech data ***** from GP for the development of multilingual Automatic Speech Recognition (ASR) system for the Ethiopian languages. | ||
| 2014.iwslt-evaluation.15 For the latter task, various techniques have been considered: punctuation and number normalization, adaptation to ASR errors, as well as the use of structured output layer neural network models for ***** speech data *****. | ||
| L08-1187 We are working with large quantities of dialogue speech including business meetings, friendly discourse, and telephone conversations, and have produced web-based tools for the visualisation of non-verbal and paralinguistic features of the ***** speech data *****. | ||
| L16-1309 This paper describes *****speech data***** recording, processing and annotation of a new speech corpus CoRuSS (Corpus of Russian Spontaneous Speech), which is based on connected communicative speech recorded from 60 native Russian male and female speakers of different age groups (from 16 to 77). | ||
| unsupervised machine | 20 | |
| W19-2307 Latent space based GAN methods and attention based sequence to sequence models have achieved impressive results in text generation and ***** unsupervised machine ***** translation respectively. | ||
| 2020.acl-main.658 Finally, we provide a unified outlook for different types of research in this area (i.e., cross-lingual word embeddings, deep multilingual pretraining, and ***** unsupervised machine ***** translation) and argue for comparable evaluation of these models. | ||
| P19-1019 Together, we obtain large improvements over the previous state-of-the-art in ***** unsupervised machine ***** translation. | ||
| 2020.acl-srw.34 We first produce a synthetic parallel corpus using ***** unsupervised machine ***** translation, and use it to fine-tune a pretrained cross-lingual masked language model (XLM) to derive the multilingual sentence representations. | ||
| 2021.naacl-main.420 Inspired by ***** unsupervised machine ***** translation, we investigate if a strong V&L representation model can be learned through unsupervised pre-training without image-caption corpora. | ||
| neural abstractive | 20 | |
| W18-6545 Till now, ***** neural abstractive ***** summarization methods have achieved great success for single document summarization (SDS). | ||
| 2021.newsum-1.8 In this paper, we focus on improving the quality of the summary generated by ***** neural abstractive ***** dialogue summarization systems. | ||
| 2021.sigdial-1.53 Therefore, in this work, we investigate different approaches to explicitly incorporate coreference information in ***** neural abstractive ***** dialogue summarization models to tackle the aforementioned challenges. | ||
| D18-1089 We attempted to verify the degree of abstractiveness of modern ***** neural abstractive ***** summarization systems by calculating overlaps in terms of various types of units. | ||
| N19-1398 We have compared a number of ***** neural abstractive ***** architectures on the task of teaser generation and the overall best performing system is See et al. | ||
| multilingual information | 20 | |
| 2021.acl-srw.32 While substantial work has been done in this direction, one of the limitations of the current approaches is that these models are focused only on one language and do not use ***** multilingual information *****. | ||
| L06-1474 We will describe new cross-lingual strategies for the development of ***** multilingual information ***** services on mobile devices. | ||
| L14-1608 The goal of this work consists in contributing to the research community with a resource for evaluating multilingual retrieval algorithms, with particular focus on domain adaptation strategies for general purpose ***** multilingual information ***** retrieval systems and on the effective exploitation of semantic annotations. | ||
| 1999.mtsummit-1.45 In this paper we describe the design and implementation of MuST, a ***** multilingual information ***** retrieval, summarization, and translation system. | ||
| 1998.amta-papers.22 In this paper, the integration of language translation and text processing system is proposed to build a ***** multilingual information ***** system. | ||
| maximum likelihood | 20 | |
| 2020.findings-emnlp.406 Structured prediction is often approached by training a locally normalized model with ***** maximum likelihood ***** and decoding approximately with beam search. | ||
| W19-3603 Moreover, speaker adaptive training (SAT) is done using a single feature-space ***** maximum likelihood ***** linear regression (FMLLR) transform estimated per speaker. | ||
| N18-1154 In order to alleviate data sparsity and overfitting problems in ***** maximum likelihood ***** estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network). | ||
| 1999.mtsummit-1.61 The ***** maximum likelihood ***** method depends solely on the target language. | ||
| 1999.mtsummit-1.46 In this paper we describe a language recognition algorithm for multilingual documents that is based on mixed-order n-grams, Markov chains, ***** maximum likelihood *****, and dynamic programming. | ||
| modern | 20 | |
| 2021.alta-1.26 Our empirical experiments reveal that these ***** modern ***** pretrained language models suffer from high variance, and the ensemble method can improve the model performance. | ||
| 2021.insights-1.10 In this work, we conduct a comprehensive investigation on one of the centerpieces of ***** modern ***** machine translation systems: the encoder-decoder attention mechanism. | ||
| C16-1262 We assess the reliability and accuracy of (neural) word embeddings for both ***** modern ***** and historical English and German. | ||
| 2020.inlg-1.27 Massive digital disinformation is one of the main risks of ***** modern ***** society. | ||
| 2020.findings-emnlp.295 In this paper, we express our skepticism towards the recent reports of very low Word Error Rates (WERs) achieved by ***** modern ***** Automatic Speech Recognition (ASR) systems on benchmark datasets. | ||
| lexical and syntactic | 20 | |
| W19-3409 For genre identification, previous work had proposed three classes of features, viz., low-level (character-level and token counts), high-level (***** lexical and syntactic ***** information) and derived features (type-token ratio, average word length or average sentence length). | ||
| 2021.emnlp-main.199 The proposed paradigm offers merits over existing paraphrase generation methods: (1) using the context regularizer on meanings, the model is able to generate massive amounts of high-quality paraphrase pairs; (2) the combination of the huge amount of paraphrase candidates and further diversity-promoting filtering yields paraphrases with more ***** lexical and syntactic ***** diversity; and (3) using human-interpretable scoring functions to select paraphrase pairs from candidates, the proposed framework provides a channel for developers to intervene with the data generation process, leading to a more controllable model. | ||
| W17-7906 To this end, we compiled a small translation memory (English-Spanish) and applied several ***** lexical and syntactic ***** transformation rules to the source sentences with both English and Spanish being the source language. | ||
| P19-1198 Our analysis (both quantitative and qualitative involving human evaluators) on public test data shows that the proposed model can perform text-simplification at both ***** lexical and syntactic ***** levels, competitive to existing supervised methods. | ||
| P18-1115 To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local ***** lexical and syntactic ***** variation in parallel corpora. | ||
| acoustic modeling | 20 | |
| L16-1610 Experiments show that the ONC-based syllable *****acoustic modeling***** with I-vector based DNN-HMM achieves the best performance with the word error rate (WER) of 9.66% and the real time factor (RTF) of 1.38812. | ||
| 2020.lrec-1.810 S-JNAS, a corpus of elderly Japanese speech, is widely used for *****acoustic modeling***** in Japan, but the average age of its speakers is 67.6 years old. | ||
| L04-1168 Context dependent and independent phoneme units were used in these experiments with two different approaches to *****acoustic modeling*****, namely discrete and continuous Hidden Markov Models (HMMs). | ||
| 2020.sltu-1.17 In this paper, we compare a fully convolutional approach for acoustic modelling in ASR with a variety of established *****acoustic modeling***** approaches. | ||
| L08-1016 In the current work, we produce a manual segmentation of laughter in a large corpus of interactive multi-party seminars, which promises to be a valuable resource for *****acoustic modeling***** purposes. | ||
| timebank corpus | 20 | |
| D17-1190 Evaluation of the proposed approach on *****TimeBank corpus***** shows that sequential modeling is capable of accurately recognizing temporal relations between events, which outperforms a neural net model using various discrete features as input that imitates previous feature based models. | ||
| L12-1451 The paper describes the main steps for the construction, annotation and validation of the Romanian version of the *****TimeBank corpus*****. | ||
| L06-1202 In our work, we present an analysis of the *****TimeBank corpus*****—the only available reference sample of TimeML-compliant annotation—from the point of view of its utility as a training resource for developing automated TimeML annotators. | ||
| L08-1020 The paper describes the construction and usage of the Romanian version of the *****TimeBank corpus*****. | ||
| L14-1382 French resources have been evaluated in two different ways: on the French *****TimeBank corpus*****, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. | ||
| multi-source translation | 20 | |
| D19-5208 We submitted our transformer-based NMT system built using the following methods: a) relative positioning method for pairwise relationships between the input elements, b) back-translation and *****multi-source translation***** for data augmentation, c) right-to-left (r2l)-reranking model robust against error propagation in autoregressive architectures such as decoders, and d) checkpoint ensemble models, which selected the top three models with the best validation bilingual evaluation understudy (BLEU). | ||
| 2016.iwslt-1.16 We combine systems using different vocabularies, reverse translation systems, *****multi-source translation***** system. | ||
| 2021.acl-long.446 Multi-source sequence generation (MSG) is an important kind of sequence generation tasks that takes multiple sources, including automatic post-editing, *****multi-source translation*****, multi-document summarization, etc. | ||
| 2018.iwslt-1.7 *****Multi-source translation***** systems translate from multiple languages to a single target language. | ||
| 2021.wat-1.6 The *****multi-source translation***** is an approach to exploit multiple inputs (e.g. in two different formats) to increase translation accuracy. | ||
| compound splitting | 20 | |
| D18-1295 As they are language agnostic, we will demonstrate that they also outperform the state of the art for the related task of German *****compound splitting*****. | ||
| 2020.lrec-1.543 We develop a two-fold deep learning-based approach of noun *****compound splitting***** and idiomatic compound detection for the German language that we train using a newly collected corpus of annotated German compounds. | ||
| P17-2010 *****Compound splitting***** has great potential for this novel task that is both transparent and well-defined. | ||
| W17-1722 This paper presents a simple method for German *****compound splitting***** that combines a basic frequency-based approach with a form-to-lemma mapping to approximate morphological operations. | ||
| C16-1301 For English-to-German translation, we use target-side *****compound splitting***** through a special syntax during training that allows the model to merge compound words and gain 0.2 BLEU points. | ||
| genia corpus | 20 | |
| I17-1027 We show that our model outperforms the existing state-of-the-art methods on the coordination annotated Penn Treebank and *****Genia corpus***** without any syntactic information from parsers. | ||
| L12-1485 Previously, we designed an annotation scheme to enrich events with several aspects (or dimensions) of interpretation, which we term meta-knowledge, and applied this scheme to the entire *****GENIA corpus*****. | ||
| L08-1073 A 50-abstract subset (492 sentences) of the *****GENIA corpus***** (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998) ,which has been used for parser evaluation in the newswire domain. | ||
| L06-1199 We conduct our experiments on the *****Genia corpus***** and the Genia ontology and evaluate the different measures by comparing the results of our approach with a gold standard provided by one of the authors, a biologist. | ||
| 2020.lrec-1.1 For nested NER, the evaluation of our model on the *****GENIA corpora***** shows that our model matches or outperforms state-of-the-art models despite not being specifically designed for this task. | ||
| open-domain dialogue generation | 20 | |
| D17-1230 We apply adversarial training to *****open-domain dialogue generation*****, training a system to produce sequences that are indistinguishable from human-generated dialogue utterances. | ||
| 2020.acl-main.333 *****Open-domain dialogue generation***** has gained increasing attention in Natural Language Processing. | ||
| 2020.emnlp-main.352 Existing *****open-domain dialogue generation***** models are usually trained to mimic the gold response in the training set using cross-entropy loss on the vocabulary. | ||
| N19-1349 Sequence-to-sequence models for *****open-domain dialogue generation***** tend to favor generic, uninformative responses. | ||
| 2020.emnlp-main.276 *****Open-domain dialogue generation***** suffers from the data insufficiency problem due to the vast size of potential responses. | ||
| dynamic | 20 | |
| D19-1074 In this study, we first investigate a novel capsule network with *****dynamic***** routing for linear time Neural Machine Translation (NMT), referred as CapsNMT. | ||
| W19-5926 Understanding and conversing about *****dynamic***** scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans. | ||
| D18-1350 In this study, we explore capsule networks with *****dynamic***** routing for text classification. | ||
| 2017.jeptalnrecital-recital.12 Finding Missing Categories in Incomplete Utterances This paper introduces an efficient algorithm (O(n^4)) for finding a missing category in an incomplete utterance by using unification technique as when learning categorial grammars, and *****dynamic***** programming as in the Cocke-Younger-Kasami algorithm. | ||
| Q17-1019 Pruning hypotheses during *****dynamic***** programming is commonly used to speed up inference in settings such as parsing. | ||
| graph-based | 20 | |
| 2020.emnlp-main.711 State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition, *****graph-based***** reasoning, and question decomposition. | ||
| L08-1182 This paper presents the results of a *****graph-based***** method for performing knowledge-based Word Sense Disambiguation (WSD). | ||
| D19-1314 Generating text from *****graph-based***** data, such as Abstract Meaning Representation (AMR), is a challenging task due to the inherent difficulty in how to properly encode the structure of a graph with labeled edges. | ||
| L08-1090 Krahmer et al.'s (2003) *****graph-based***** framework provides an elegant and flexible approach to the generation of referring expressions. | ||
| W18-2403 Recent collective Entity Linking studies usually promote global coherence of all the mapped entities in the same document by using semantic embeddings and *****graph-based***** approaches. | ||
| Twitter | 20 | |
| 2021.clpsych-1.8 In this shared task, we accept the challenge of constructing models to identify *****Twitter***** users who attempted suicide based on their tweets 30 and 182 days before the adverse event's occurrence. | ||
| W19-1311 In this work, we investigate the impact of incorporating emotion classes on the task of predicting emojis from *****Twitter***** texts. | ||
| L14-1402 Several works in Natural Language Processing have recently looked into part-of-speech annotation of *****Twitter***** data and typically used their own data sets. | ||
| 2020.findings-emnlp.222 The performance of standard coreference resolution is known to drop significantly on *****Twitter***** texts. | ||
| W16-3920 In this paper, we present our approach for named entity recognition in *****Twitter***** messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated text (WNUT). | ||
| Deep neural | 20 | |
| 2021.naacl-main.201 *****Deep neural***** networks and huge language models are becoming omnipresent in natural language applications. | ||
| 2021.naacl-tutorials.2 *****Deep neural***** networks have constantly pushed the state-of-the-art performance in natural language processing and are considered as the de-facto modeling approach in solving complex NLP tasks such as machine translation, summarization and question-answering. | ||
| W17-2630 *****Deep neural***** networks have advanced the state of the art in named entity recognition. | ||
| D17-1085 *****Deep neural***** networks for machine comprehension typically utilizes only word or character embeddings without explicitly taking advantage of structured linguistic information such as constituency trees and dependency trees. | ||
| 2020.sustainlp-1.4 *****Deep neural***** networks have demonstrated their superior performance in almost every Natural Language Processing task, however, their increasing complexity raises concerns. | ||
| sign | 20 | |
| 2021.mtsummit-at4ssl.3 Sign language lexica are a useful resource for researchers and people learning *****sign***** languages . | ||
| 2020.signlang-1.35 Sign language research most often relies on exhaustively annotated and segmented data , which is scarce even for the most studied *****sign***** languages . | ||
| 2021.mtsummit-at4ssl.11 This paper addresses the tasks of *****sign***** segmentation and segment-meaning mapping in the context of sign language (SL) recognition. | ||
| 2005.mtsummit-ebmt.14 Users of *****sign***** languages are often forced to use a language in which they have reduced competence simply because documentation in their preferred format is not available . | ||
| W16-4116 Computational linguistic approaches to *****sign***** languages could benefit from investigating how complexity influences structure . | ||
| intelligent | 20 | |
| L16-1231 Ontologies are powerful to support semantic based applications and *****intelligent***** systems. | ||
| 2014.lilt-10.1 This paper presents a cognitively-inspired algorithm for the semantic analysis of nominal compounds by *****intelligent***** agents. | ||
| P17-1120 Recently emerged *****intelligent***** assistants on smartphones and home electronics (e.g., Siri and Alexa) can be seen as novel hybrids of domain-specific task-oriented spoken dialogue systems and open-domain non-task-oriented ones. | ||
| 2020.acl-main.730 We present the task of Spatio-Temporal Video Question Answering, which requires *****intelligent***** systems to simultaneously retrieve relevant moments and detect referenced visual concepts (people and objects) to answer natural language questions about videos. | ||
| 2021.naacl-main.421 When *****intelligent***** agents communicate to accomplish shared goals, how do these goals shape the agents' language? | ||
| question answering (QA) | 20 | |
| 2020.emnlp-main.246 State-of-the-art *****question answering (QA*****) relies upon large amounts of training data for which labeling is time consuming and thus expensive. | ||
| 2020.acl-main.414 Evidence retrieval is a critical stage of *****question answering (QA*****), necessary not only to improve performance, but also to explain the decisions of the QA method. | ||
| 2021.emnlp-main.444 The goal of *****question answering (QA*****) is to answer _any_ question. | ||
| P18-1160 Neural models for *****question answering (QA*****) over documents have achieved significant performance improvements. | ||
| 2020.findings-emnlp.133 The incompleteness of knowledge base (KB) is a vital factor limiting the performance of *****question answering (QA*****). | ||
| intuitively | 19 | |
| N19-1161 We argue that this formulation has several ***** intuitively ***** attractive properties, particularly with respect to improving robustness and generalization to mappings between difficult language pairs or word pairs. | ||
| 2020.coling-main.270 Although being ***** intuitively ***** sensible for human, metaphor detection is still a challenging task due to the subtle ontological differences between metaphorical and non-metaphorical expressions. | ||
| L12-1469 Furthermore a good semantic search algorithm is not enough to fulfil user needs, it is worthwhile to implement visualization methods which can support users in ***** intuitively ***** understanding why and how the results were retrieved. | ||
| 2020.findings-emnlp.420 However, few models consider the fusion of linguistic features with multiple visual features with different sizes of receptive fields, though the proper size of the receptive field of visual features ***** intuitively ***** varies depending on expressions. | ||
| 2020.lrec-1.263 Relative to temporal graphs, the tree form of TDTs naturally omits some fraction of temporal relationships, which ***** intuitively ***** should decrease the amount of temporal information available, potentially increasing temporal indeterminacy of the global ordering | ||
| labeler | 19 | |
| L10-1269 The second parser is a combination of the Berkeley parser (Petrov et al., 2006) and a functional role ***** labeler *****: trained on the original constituency treebank, the Berkeley parser first outputs constituency trees, which are then labeled with functional roles, and then converted into dependency trees. | ||
| W18-4809 We compare a conditional random field based sequence ***** labeler ***** and a neural encoder-decoder model and show that a nearly 0.9 F1-score on labeled accuracy of morphemes can be achieved with 3,000 words of transcribed oral text. | ||
| 2020.emnlp-main.319 The compressor collects useful information from the output of the semantic role ***** labeler *****, filtering noisy and conflicting evidence. | ||
| C18-1038 Our model employs a cooperative gated neural network to retrieve answers with the assistance of extra labels given by a neural turing machine ***** labeler *****. | ||
| Q19-1022 The backbone of our model is an LSTM-based semantic role ***** labeler ***** jointly trained with two auxiliary tasks: predicting the dependency label of a word and whether there exists an arc linking it to the predicate | ||
| synthesizing | 19 | |
| N18-1057 In this paper, we consider ***** synthesizing ***** parallel data by noising a clean monolingual corpus. | ||
| J18-4007 These quantitative analyses are supplemented by extensive qualitative analysis, highlighting the compatibility of computational and qualitative methods in ***** synthesizing ***** evidence about the creation of interactional meaning. | ||
| L16-1546 TTS systems capable of ***** synthesizing ***** such text need to be able to handle text that is written in multiple languages and scripts. | ||
| 2021.acl-long.411 In this paper we present the first model for directly ***** synthesizing ***** fluent, natural-sounding spoken audio captions for images that does not require natural language text as an intermediate representation or source of supervision. | ||
| 2020.coling-main.232 In addition, further ablation studies demonstrate the effectiveness of our graph-based iterative knowledge retrieval module and the answer choice-aware attention module in retrieving and ***** synthesizing ***** background knowledge from multiple knowledge sources | ||
| interacting | 19 | |
| W17-3518 It provides a common benchmark on which to train, evaluate and compare “microplanners”, i.e. generation systems that verbalise a given content by making a range of complex ***** interacting ***** choices including referring expression generation, aggregation, lexicalisation, surface realisation and sentence segmentation. | ||
| 2005.mtsummit-posters.20 This is followed by a short description of several additional application areas of this software for which LTC has received EU funding: The AMBIENT project carries out a market validation for multilingual and multimodal eLearning for business and innovation management, the EUCAM project tests multilingual eLearning in the automotive industry, including a major car manufacturer and the German and European Metal Workers Associations, and the ALADDIN project provides a mobile multilingual environment for tour guides, ***** interacting ***** between tour operators and tourists, with the objective of optimising their travel experience. | ||
| 2021.naacl-main.467 However, current approaches still focus on sentence-level relations ***** interacting ***** among tokens. | ||
| 2020.findings-emnlp.273 We consider problems of making sequences of decisions to accomplish tasks, ***** interacting ***** via the medium of language. | ||
| 2020.emnlp-main.591 Recent efforts have made great progress to track multiple entities in a procedural text, but usually treat each entity separately and ignore the fact that there are often multiple entities ***** interacting ***** with each other during one process, some of which are even explicitly mentioned | ||
| frequencies | 19 | |
| W18-5814 A technical connection was adduced between this result and Good-Turing smoothing, which assigns probability mass to unseen events on the basis of the simplifying assumption that word ***** frequencies ***** are stationary. | ||
| W19-3607 That means it included the statistics of Word ***** frequencies *****, Word sequence ***** frequencies *****, Part-of-speech sequence ***** frequencies ***** and other important information. | ||
| W17-4213 In the second experiment, on discourse level we used ***** frequencies ***** of rhetorical relations types in texts. | ||
| L14-1659 Focus of this paper will be the description of the tagging process and evaluation of statistical properties like word form ***** frequencies ***** and part of speech tag distributions. | ||
| 2020.cogalex-1.3 The present study used data on word ***** frequencies ***** to test two hypotheses | ||
| implemented | 19 | |
| L10-1508 This paper focuses on the definition of the patterns, on the measures used to assess the reliability of the suggested specific semantic relation and on the evaluation of the ***** implemented ***** system. | ||
| L12-1377 A concept of asynchronous handling of requests sent to the ***** implemented ***** Web service (Multiservice) is introduced to enable processing large amounts of text by setting up language processing chains of desired complexity. | ||
| 2020.lrec-1.83 We implement a multilingual interactive agent in the field of healthcare and conduct experiments to illustrate the effectiveness of the ***** implemented ***** agent. | ||
| 2020.semeval-1.72 The evaluation result shows that the ***** implemented ***** model achieved an accuracy of 93.9% obtained and published at the post-evaluation result on the leaderboard. | ||
| L06-1249 The main contribution of the presented work consists of the expressiveness of the query formula, in the elegant and intuitive way the rules are written (and their easy reversibility), and in the performance of the ***** implemented ***** tool | ||
| registers | 19 | |
| 2020.readi-1.8 We argue that the added value of this type of visualisation is the polygonal shape that provides an intuitive grasp of text complexity similarities across the ***** registers ***** of a corpus. | ||
| W16-5406 The annotation was performed on the core data of `Balanced Corpus of Contemporary Written Japanese', which comprised about one million words and 1980 samples from six ***** registers *****, such as newspapers, books, magazines, and web texts. | ||
| N19-1062 Online texts - across genres, ***** registers *****, domains, and styles - are riddled with human stereotypes, expressed in overt or subtle ways. | ||
| L14-1264 In particular, we investigate the diversification of scientific ***** registers ***** over time. | ||
| L12-1111 Second, features are assessed for their relevance for the study of recent language change in scientific ***** registers ***** by means of correspondence analysis | ||
| Siamese | 19 | |
| D18-1494 In this study, we propose a supervised topic model based on the ***** Siamese ***** network, which can trade off label-specific word distributions with document-specific label distributions in a uniform framework. | ||
| 2021.ranlp-1.131 XLM-R embeddings based ***** Siamese ***** architecture using gated recurrent units and bidirectional long short term memory networks provide promising results for this classification problem. | ||
| S17-2026 The network builds on previously explored ***** Siamese ***** network architectures. | ||
| S18-1190 The system consists of 3 stacked LSTMs: one for the reason, one for the claim, and one shared ***** Siamese ***** Network for the 2 candidate warrants | ||
| S18-2012 When we build a neural network model predicting the relationship between two sentences, the most general and intuitive approach is to use a *****Siamese***** architecture, where the sentence vectors obtained from a shared encoder is given as input to a classifier. | ||
| realisation | 19 | |
| L10-1406 The current system comprises a series of classifiers that implement major Document Planning subtasks (namely, data interpretation, content selection, within- and between-sentence structuring), and a small surface ***** realisation ***** grammar of Brazilian Portuguese. | ||
| D19-1305 We propose a modular approach to surface ***** realisation ***** which models each of these components separately, and evaluate our approach on the 10 languages covered by the SR'18 Surface Realisation Shared Task shallow track. | ||
| W17-3505 We present a flexible Natural Language Generation approach for Spanish, focused on the surface ***** realisation ***** stage, which integrates an inflection module in order to improve the naturalness and expressivity of the generated language. | ||
| W18-3601 We report results from the SR'18 Shared Task, a new multilingual surface ***** realisation ***** task organised as part of the ACL'18 Workshop on Multilingual Surface Realisation. | ||
| P17-1017 We thus propose our corpus generation framework as a novel method for creating challenging data sets from which NLG models can be learned which are capable of handling the complex interactions occurring during in micro-planning between lexicalisation, aggregation, surface ***** realisation *****, referring expression generation and sentence segmentation | ||
| keystroke | 19 | |
| W19-3607 Evaluation of the model is performed using developed prototype and ***** keystroke ***** savings (KSS) as a metrics. | ||
| D19-3027 The frequency of Burmese characters is considered in MY-AKKHARA to realize an efficient ***** keystroke ***** distribution on a QWERTY keyboard. | ||
| W16-4111 This paper investigates the design of a key-stroke and subject dependent identification system of cognitive effort to track complexity in translation with ***** keystroke ***** logging (cf. | ||
| 2020.lrec-1.45 This dataset is based on ***** keystroke ***** data and eye tracking data of 65 students from a variety of backgrounds (undergraduate and graduate English as a first language and English as a second language students) and a variety of tasks (argumentative text and academic abstract) | ||
| L16-1574 This paper introduces a toolkit used for the purpose of detecting replacements of different grammatical and semantic structures in ongoing text production logged as a chronological series of computer interaction events (so-called *****keystroke***** logs). | ||
| similarities | 19 | |
| 2021.mrl-1.8 We statistically observe interesting effects that the confluence of reasoning types and language ***** similarities ***** have on transfer performance. | ||
| L06-1298 The parts with low ***** similarities ***** are highly likely to be non-machine-translatable parts. | ||
| 2020.acl-main.575 Specifically, we present a method of instance-based learning that learns ***** similarities ***** between spans. | ||
| 2020.emnlp-main.187 We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy in tasks that require information about language ***** similarities *****, such as language clustering and ranking candidates for multilingual transfer. | ||
| 2020.emnlp-main.456 We then ask whether our gender system ***** similarities ***** alone are sufficient to reconstruct historical relationships between languages | ||
| lexicographical | 19 | |
| 2020.rail-1.9 In this article, we present on the one hand the first versions of the three platforms: the REST API for saving ***** lexicographical ***** resources, the dictionary management platform and the collaborative dictionary platform; on the other hand, we describe the data format chosen and used to encapsulate our resources. | ||
| L06-1427 This work focuses on semi-automatic extraction of verb-noun collocations from a corpus, performed to provide lexical evidence for the manual ***** lexicographical ***** processing of Support Verb Constructions (SVCs) in the Swedish-Czech Combinatorial Valency Lexicon of Predicate Nouns. | ||
| L12-1273 A total of 200 Italian expressions were first selected and examined, using both monolingual and bilingual dictionaries, as well as specific ***** lexicographical ***** works dealing with the subject of idiomaticity, especially of the maritime type, and a similar undertaking was then conducted for the English expressions. | ||
| L14-1732 Our combined model is based on the LMF and TMF metamodels for ***** lexicographical ***** and terminological databases and is compatible with both, thus allowing for the import of information from existing dictionaries and termbases, which may be transferred to the complementary view and re-exported | ||
| 2020.lrec-1.385 Classical Armenian is a poorly endowed language that, despite a great tradition of *****lexicographical***** erudition, is coping with a lack of resources. | ||
| disinformation | 19 | |
| 2021.acl-long.158 Because the vast majority of edited or manipulated images are benign, such as photoshopped images for visual enhancements, the key challenge is to understand the complex layers of underlying intents of media edits and their implications with respect to ***** disinformation *****. | ||
| D19-5006 Digital media enables not only fast sharing of information, but also ***** disinformation *****. | ||
| 2020.stoc-1.7 While there are emerging interests in studying how ***** disinformation ***** campaigns form, spread, and influence target audiences, developing ***** disinformation ***** campaign corpora is challenging given the high volume, fast evolution, and wide variation of messages associated with each campaign. | ||
| 2020.inlg-1.27 Massive digital ***** disinformation ***** is one of the main risks of modern society. | ||
| 2020.stoc-1.6 In this paper, we present a web service platform for *****disinformation***** detection in hotel reviews written in English. | ||
| utilising | 19 | |
| N18-2045 Motivated by recent advances in memory-augmented models for machine reading, we propose a novel architecture, ***** utilising ***** external “memory chains” with a delayed memory update mechanism to track entities. | ||
| 2021.inlg-1.5 Conversational systems aim to generate responses that are accurate, relevant and engaging, either through ***** utilising ***** neural end-to-end models or through slot filling. | ||
| 2020.aacl-main.76 We show that the reliance on these gendered pairs has strong limitations: bias measures based off of them are not robust and cannot identify common types of real-world bias, whilst analogies ***** utilising ***** them are unsuitable indicators of bias. | ||
| 2020.lrec-1.165 This was achieved using a word-based Convolutional Neural Network (CNN) ***** utilising ***** a Continuous Bag of Words (CBOW) word embeddings model. | ||
| P17-1024 We demonstrate that merely ***** utilising ***** language cues is not enough to model FOIL-COCO and that it challenges the state-of-the-art by requiring a fine-grained understanding of the relation between text and image | ||
| contiguous | 19 | |
| 2021.emnlp-demo.4 This is challenging, as markup can be nested, apply to spans ***** contiguous ***** in source but non-***** contiguous ***** in target etc. | ||
| N18-2075 Text segmentation, the task of dividing a document into ***** contiguous ***** segments based on its semantic structure, is a longstanding challenge in language understanding. | ||
| 2021.nllp-1.14 We then show through manual evaluation that the model identifies most (89,84%) defined terms in a set of building regulation documents, and that both ***** contiguous ***** and dis***** contiguous ***** Multi-Word Expressions (MWE) are discovered with reasonable accuracy (70,3%). | ||
| 2021.acl-short.26 However, they ignore considering the latent segment structure of the document, in which ***** contiguous ***** sentences often have coherent semantics. | ||
| Q13-1029 There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, ***** contiguous ***** or non***** contiguous *****) | ||
| generalise | 19 | |
| 2021.acl-short.55 Our findings suggest that the word-level QE models based on powerful pre-trained transformers that we propose in this paper ***** generalise ***** well across languages, making them more useful in real-world scenarios. | ||
| D19-5018 We show that BERT, while capable of handling imbalanced classes with no additional data augmentation, does not ***** generalise ***** well when the training and test data are sufficiently dissimilar (as is often the case with news sources, whose topics evolve over time). | ||
| 2020.findings-emnlp.103 Addressing undersensitivity furthermore improves model robustness on the previously introduced ADDSENT and ADDONESENT datasets, and models ***** generalise ***** better when facing train / evaluation distribution mismatch: they are less prone to overly rely on shallow predictive cues present only in the training set, and outperform a conventional model by as much as 10.9% F1. | ||
| 2021.louhi-1.6 These representations can ***** generalise ***** both bottom-up as well as top-down among various semantic hierarchies. | ||
| L16-1573 Letter-level approaches to diacritic restoration ***** generalise ***** better and do not require a lot of training data but word-level approaches tend to yield better results | ||
| embed | 19 | |
| D17-1027 In this work, we propose an approach to jointly ***** embed ***** Chinese words as well as their characters and fine-grained subcharacter components. | ||
| N18-3007 The second model is based on Siamese Networks (SNs) which ***** embed ***** the metadata input sequence and the generated title in the same space and do not require hand-crafted features at all. | ||
| 2020.acl-main.260 Multilingual representations ***** embed ***** words from many languages into a single semantic space such that words with similar meanings are close to each other regardless of the language. | ||
| P18-1004 Semantic specialization of distributional word vectors, referred to as retrofitting, is a process of fine-tuning word vectors using external lexical knowledge in order to better ***** embed ***** some semantic relation. | ||
| P19-1413 Current statistical script learning approaches ***** embed ***** the events, such that their relationships are indicated by their similarity in the ***** embed *****ding | ||
| assigning | 19 | |
| R19-1110 They reflect several separate but connected strategies: manipulating the shape and the content of the knowledge base, ***** assigning ***** weights over the relations in the knowledge base, and the addition of new relations to it. | ||
| 2020.lrec-1.564 Two annotators strove for carefully ***** assigning ***** entity mentions to classes of genes/proteins as well as families/groups, complexes, variants and enumerations of those where genes and proteins are represented by a single class. | ||
| 2021.emnlp-main.181 During the deep co-training process, we use the session classifier as a reinforcement learning component to learn a session ***** assigning ***** policy by maximizing the local rewards given by the message-pair classifier. | ||
| 2021.acl-long.517 In this paper, we conduct a large-scale controlled study focused on question answering, ***** assigning ***** workers at random to compose questions either (i) adversarially (with a model in the loop); or (ii) in the standard fashion (without a model). | ||
| D19-1223 However, with standard softmax attention, all attention heads are dense, ***** assigning ***** a non-zero weight to all context words | ||
| snippet | 19 | |
| L12-1039 Explicitly conveyed knowledge represents only a portion of the information communicated by a text ***** snippet *****. | ||
| 2021.nllp-1.12 Being able to search using a natural language text ***** snippet ***** instead of a more artificial query could help to prevent query formulation issues. | ||
| K19-1091 We further detect the opinion ***** snippet ***** by self-critical reinforcement learning. | ||
| W19-3302 The paper's format is rather unconventional: there is no explicit related work, no methodology section, no results, and no discussion (and the current ***** snippet ***** is not an abstract but actually an introductory preface). | ||
| 2021.acl-long.301 We present an architecture for joint document and ***** snippet ***** ranking, the two middle stages, which leverages the intuition that relevant documents have good ***** snippet *****s and good ***** snippet *****s come from relevant documents | ||
| coreferent | 19 | |
| 2021.acl-long.374 We study the problem of event coreference resolution (ECR) that seeks to group ***** coreferent ***** event mentions into the same clusters. | ||
| 2020.emnlp-main.452 The discourse-enhanced self-training algorithm iteratively labels new event phrases based on both the classifier's predictions and the polarities of the event's ***** coreferent ***** sentiment expressions. | ||
| 2020.lrec-1.9 When annotators are asked to annotate ***** coreferent ***** spans of text, it is therefore a somewhat unnatural task. | ||
| N19-1074 As shown by previous work, the grouping of ***** coreferent ***** concept mentions across documents is a crucial subtask of it. | ||
| 2021.case-1.14 For a given article, our proposed pipeline comprises of an accurate sentence pair classifier that identifies ***** coreferent ***** sentence pairs and subsequently uses these predicted probabilities to cluster sentences into groups | ||
| mitigation | 19 | |
| 2020.acl-main.265 Unfortunately, due to NRE models rely heavily on surface level cues, we find that existing bias ***** mitigation ***** approaches have a negative effect on NRE. | ||
| D19-1530 CDA/S with the Names Intervention is the only approach which is able to mitigate indirect gender bias: following debiasing, previously biased words are significantly less clustered according to gender (cluster purity is reduced by 49%), thus improving on the state-of-the-art for bias ***** mitigation *****. | ||
| 2021.naacl-main.296 Although these techniques achieve bias reduction for the task and domain at hand, the effects of bias ***** mitigation ***** may not directly transfer to new tasks, requiring additional data collection and customized annotation of sensitive attributes, and re-evaluation of appropriate fairness metrics. | ||
| 2020.lrec-1.245 In response, we have designed a novel named entity annotation scheme and associated guidelines for this domain, which covers hazards, consequences, ***** mitigation ***** strategies and project attributes. | ||
| 2020.acl-main.264 We further propose a bias ***** mitigation ***** approach based on posterior regularization | ||
| evaluative | 19 | |
| W19-4725 This is done in order to test the hypothesis that ***** evaluative ***** adjectives are more prone to temporal semantic change. | ||
| L14-1656 It provides information on segments in the text which denote an aspect or a subjective ***** evaluative ***** phrase which refers to the aspect. | ||
| S18-1093 In task 2, three types of irony are considered; “Irony by contrast” - ironic instances where ***** evaluative ***** expression portrays inverse polarity (positive, negative) of the literal proposition; “Situational irony” - ironic instances where output of a situation do not comply with its expectation; “Other verbal irony” - instances where ironic intent does not rely on polarity contrast or unexpected outcome. | ||
| 2020.inlg-1.36 In order to evaluate the algorithm properly and validate the applicability of existing models and ***** evaluative ***** information criteria, both production and comprehension studies are conducted using a complex domain of objects, providing new directions of approaching the evaluation of REG algorithms. | ||
| 2021.rocling-1.35 Ever-expanding *****evaluative***** texts on online forums have become an important source of sentiment analysis. | ||
| Instagram | 19 | |
| 2021.dravidianlangtech-1.14 Sentiments/opinions/reviews' of users posted on social media are the valuable information that have motivated researchers to analyze them to get better insight and feedbacks about any product such as a video in ***** Instagram *****, a movie in Netflix, or even new brand car introduced by BMW. | ||
| N18-2107 ***** Instagram ***** posts are composed of pictures together with texts which sometimes include emojis. | ||
| W18-6240 We show empirical results of our algorithm on data obtained from the most influential ***** Instagram ***** accounts. | ||
| S19-2097 With the proliferation and ubiquity of smart gadgets and smart devices, across the world, data generated by them has been growing at exponential rates; in particular social media platforms like Facebook, Twitter and ***** Instagram ***** have been generating voluminous data on a daily basis. | ||
| P18-1186 We introduce the new Multimodal Named Entity Disambiguation (MNED) task for multimodal social media posts such as Snapchat or *****Instagram***** captions, which are composed of short captions with accompanying images. | ||
| obfuscated | 19 | |
| 2020.iwpt-1.7 Specifically, a given English text is ***** obfuscated ***** using a neural model that aims to preserve the syntactic relationships of the original sentence so that the ***** obfuscated ***** sentence can be parsed instead of the original one. | ||
| 2021.emnlp-main.152 Critically, our framework operationalizes the notion of ***** obfuscated ***** style in a flexible way that enables two distinct notions of ***** obfuscated ***** style: (1) a minimal notion that effectively intersects the various styles seen in training, and (2) a maximal notion that seeks to obfuscate by adding stylistic features of all sensitive attributes to text, in effect, computing a union of styles. | ||
| 2020.acl-main.203 We show that the existing authorship obfuscation methods are not stealthy as their ***** obfuscated ***** texts can be identified with an average F1 score of 0.87. | ||
| W18-3606 This work presents state of the art results in reconstruction of surface realizations from ***** obfuscated ***** text | ||
| W18-4203 People often create *****obfuscated***** language for online communication to avoid Internet censorship, share sensitive information, express strong sentiment or emotion, plan for secret actions, trade illegal products, or simply hold interesting conversations. | ||
| intonation | 19 | |
| L12-1464 Prosodic research in recent years has been supported by a number of automatic analysis tools aimed at simplifying the work that is requested to study ***** intonation *****. | ||
| 1997.iwpt-1.4 Translating telephones and personal assistants are an interesting test case, in which the salience of rapidly shifting discourse topics and the fact that sentences are machine-generated, rather than written by humans, combine to make the application particularly vulnerable to our poor theoretical grasp of ***** intonation ***** and its functions. | ||
| W16-3807 Human communication is a multimodal activity, involving not only speech and written expressions, but ***** intonation *****, images, gestures, visual clues, and the interpretation of actions through perception. | ||
| L12-1254 This paper will present the design of a Galician syntactic corpus with application to *****intonation***** modeling. | ||
| L10-1268 In this paper, we propose a scheme for annotating utterance-level units in Japanese dialogs, which emerged from an analysis of the interrelationship among four schemes: i) inter-pausal units, ii) *****intonation***** units, iii) clause units, and iv) pragmatic units. | ||
| dyslexic | 19 | |
| W17-1309 The paper also discusses building a corpus of ***** dyslexic ***** Arabic texts that uses the error annotation scheme and provides an analysis of the errors that were found in the texts. | ||
| 2020.lincr-1.2 In the evaluation stage, ***** dyslexic ***** and non-***** dyslexic ***** children were asked to read sentences from the Children version of the Russian Sentence Corpus. | ||
| L16-1610 This is a part of an effort in building a STT system to aid ***** dyslexic ***** students who have cognitive deficiency in writing skills but have no problem expressing their ideas through speech. | ||
| 2020.lrec-1.169 A sample of 21 poor-reading and ***** dyslexic ***** children with an average reading delay of 2.5 years read a portion of the corpus | ||
| L10-1025 Traditional Danish reading training for *****dyslexic***** readers typically involves the presence of a professional reading therapist for guidance, advice and evaluation. | ||
| intuitive | 19 | |
| W19-1603 The framework is focused on developing ***** intuitive ***** verbal interaction with various types of robots. | ||
| D18-1182 We show that this simple setup is capable of teasing out various properties of different popular lexical resources (like WordNet and pre-trained word embeddings), while being ***** intuitive ***** enough to annotate on a large scale. | ||
| L16-1073 Galaxy allows data inputs and processing steps to be selected from graphical menus, and results are displayed in ***** intuitive ***** plots and summaries that encourage interactive workflows and the exploration of hypotheses. | ||
| L16-1549 In order to explore ***** intuitive ***** verbal and non-verbal interfaces in smart environments we recorded user interactions with an intelligent apartment. | ||
| W16-5201 As a practical use case we use Kathaa to visually implement the Sampark Hindi-Panjabi Machine Translation Pipeline and the Sampark Hindi-Urdu Machine Translation Pipeline, to demonstrate the fact that Kathaa can handle really complex NLP systems while still being ***** intuitive ***** for the end user | ||
| tuning | 19 | |
| 2020.wmt-1.140 With state-of-the-art English-German NMT components, we show that ***** tuning ***** to paraphrased references produces a system that is significantly better according to human judgment, but 5 BLEU points worse when tested on standard references. | ||
| R17-1071 In particular, using the longest 50% of the ***** tuning ***** sentences, we achieve two-fold ***** tuning ***** speedup, and improvements in BLEU score that rival those of alternatives, which fix BLEU+1's smoothing instead. | ||
| 2020.acl-main.658 We then describe common methodological issues in ***** tuning ***** and evaluation of unsupervised cross-lingual models and present best practices. | ||
| 2021.emnlp-main.792 For efficient learning, we investigate the use of a geometric mapping in embedding space to transform linguistic properties, without any ***** tuning ***** of the pre-trained sentence encoder or decoder. | ||
| 2014.iwslt-evaluation.4 The performance of ***** tuning ***** to IMEANT is comparable to ***** tuning ***** on MEANT (differences are statistically insignificant) | ||
| nearest | 19 | |
| D19-1225 On the task of reconstructing missing glyphs from an unknown font given only a small number of observations, our model outperforms both a strong ***** nearest ***** neighbors baseline and a state-of-the-art discriminative model from prior work. | ||
| W19-5207 We explore using multilingual document embeddings for ***** nearest ***** neighbor mining of parallel data. | ||
| C16-1058 We present a variant of k-***** nearest ***** neighbors (kNN) classification with composite features to identify ***** nearest ***** neighbors for SRL. | ||
| 2021.naacl-demos.13 We further propose a new measure of embedding confidence based on ***** nearest ***** neighborhood overlap, to assist in identifying high-quality embeddings for corpus analysis. | ||
| 2020.sigtyp-1.5 As a back-off model for languages whose phylogenetic position is unknown, a k-***** nearest ***** neighbor classification based on geographic distance is performed | ||
| Numerical | 19 | |
| 2021.emnlp-main.817 ***** Numerical ***** evaluation is performed on four different tasks: machine translation, summarization, data2text generation and image captioning. | ||
| 2021.emnlp-main.563 ***** Numerical ***** reasoning in machine reading comprehension (MRC) has shown drastic improvements over the past few years. | ||
| 2021.deelio-1.14 ***** Numerical ***** common sense (NCS) is necessary to fully understand natural language text that includes numerals | ||
| D19-1251 *****Numerical***** reasoning, such as addition, subtraction, sorting and counting, is a critical skill in human reading comprehension, which has not been well considered in existing machine reading comprehension (MRC) systems. | ||
| 2020.emnlp-main.549 *****Numerical***** reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. | ||
| stereotypical | 19 | |
| 2021.woah-1.10 In hate speech detection, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain ***** stereotypical ***** language specific to each SGT. | ||
| 2021.emnlp-main.111 We present the first dataset comprising ***** stereotypical ***** attributes of a range of social groups and propose a method to elicit stereotypes encoded by pretrained language models in an unsupervised fashion. | ||
| P19-1160 Specifically, we consider four types of information: feminine, masculine, gender-neutral and ***** stereotypical *****, which represent the relationship between gender vs. bias, and propose a debiasing method that (a) preserves the gender-related information in feminine and masculine words, (b) preserves the neutrality in gender-neutral words, and (c) removes the biases from ***** stereotypical ***** words. | ||
| 2021.acl-long.416 We present StereoSet, a large-scale natural English dataset to measure ***** stereotypical ***** biases in four domains: gender, profession, race, and religion. | ||
| 2021.inlg-1.19 The knowledge of scripts, common chains of events in ***** stereotypical ***** scenarios, is a valuable asset for task-oriented natural language understanding systems | ||
| Word Sense Disambiguation | 19 | |
| 2021.gwc-1.17 To encourage future research on Persian ***** Word Sense Disambiguation *****, we release the PerSemCor in http://nlp.sbu.ac.ir. | ||
| L06-1456 WordNet is the reference sense inventory of most of the current *****Word Sense Disambiguation***** systems. | ||
| D19-1009 Game-theoretic models, thanks to their intrinsic ability to exploit contextual information, have been shown to be particularly suited for the *****Word Sense Disambiguation***** task. | ||
| D17-1120 *****Word Sense Disambiguation***** models exist in many flavors. | ||
| C16-1330 Current *****Word Sense Disambiguation***** systems show extremely poor performance on low-frequency senses, which is mainly caused by the difference in sense distributions between training and test data. | ||
| analyzing | 19 | |
| D19-5410 Then we build the connection between priors residing in datasets and model designs, ***** analyzing ***** how different properties of datasets influence the choices of model structure design and training methods. | ||
| 2020.semeval-1.151 We obtained results ***** analyzing ***** the text and images separately, and also in combination. | ||
| S17-1028 As such, we examine the writings of schizophrenia patients ***** analyzing ***** their syntax, semantics and pragmatics. | ||
| 2020.acl-main.698 2019, we propose two new probing tasks ***** analyzing ***** factual knowledge stored in Pretrained Language Models (PLMs). | ||
| 2010.amta-papers.30 We propose a Chinese dependency tree reordering method for Chinese-to-Korean SMT systems through ***** analyzing ***** systematic differences between the Chinese and Korean languages | ||
| Beam | 19 | |
| D18-1342 *****Beam***** search is widely used in neural machine translation, and usually improves translation quality compared to greedy search. | ||
| D19-1144 *****Beam***** search is universally used in (full-sentence) machine translation but its application to simultaneous translation remains highly non-trivial, where output words are committed on the fly. | ||
| 2021.emnlp-main.662 *****Beam***** search is the go-to method for decoding auto-regressive machine translation models. | ||
| 2021.emnlp-main.52 *****Beam***** search is the default decoding strategy for many sequence generation tasks in NLP. | ||
| D18-1035 *****Beam***** search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple greedy decoding on tasks like machine translation. | ||
| topical | 19 | |
| 2021.wanlp-1.26 Our dataset consists of 6,121 claims along with their factual labels and additional metadata, such as fact-checking article content, ***** topical ***** category, and links to posts or Web pages spreading the claim. | ||
| C16-1218 Previous studies have highlighted the necessity for entity linking systems to capture the local entity-mention similarities and the global ***** topical ***** coherence. | ||
| D19-1484 We introduce a novel, large, and diverse dataset of written code-switched productions, curated from ***** topical ***** threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. | ||
| D19-5558 We introduce a novel, large, and diverse dataset of written code-switched productions, curated from ***** topical ***** threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. | ||
| 2021.acl-long.128 Existing approaches typically (i) explore the semantic interaction between the claim and evidence at different granularity levels but fail to capture their ***** topical ***** consistency during the reasoning process, which we believe is crucial for verification; (ii) aggregate multiple pieces of evidence equally without considering their implicit stances to the claim, thereby introducing spurious information | ||
| Story | 19 | |
| N19-4016 *****Story***** composition is a challenging problem for machines and even for humans. | ||
| C18-1088 *****Story***** generation is a challenging problem in artificial intelligence (AI) and has received a lot of interest in the natural language processing (NLP) community. | ||
| W19-2405 *****Story***** infilling involves predicting words to go into a missing span from a story. | ||
| 2020.ccl-1.83 *****Story***** generation is a challenging task of automatically creating natural language to describe a sequence of events, which requires outputting text with not only a consistent topic but also novel wordings. | ||
| 2021.naacl-main.279 *****Story***** generation is an open-ended and subjective task, which poses a challenge for evaluating story generation models. | ||
| pragmatic | 19 | |
| 2021.eacl-main.204 Our analyses show that the proposed ***** pragmatic ***** features do capture cross-cultural similarities and align well with existing work in sociolinguistics and linguistic anthropology. | ||
| 2020.conll-1.14 We show that agents with a simple repair mechanism can increase efficiency, compared to ***** pragmatic ***** agents, by reducing their computational burden at the cost of longer interactions. | ||
| W19-4013 This paper presents the identification of formulaic sequences in the reference corpus of spoken Slovenian and their annotation in terms of syntactic structure, ***** pragmatic ***** function and lexicographic relevance. | ||
| 2021.inlg-1.41 We implement a mixed, `fast' and `slow', speaker that applies ***** pragmatic ***** reasoning occasionally (only word-initially), while unrolling the language model. | ||
| 2020.codi-1.6 These features are inspired through several sources – cognitive parameters, ***** pragmatic ***** factors and typological status | ||
| framing | 19 | |
| 2020.smm4h-1.12 In this paper, we develop a dataset designed to foster research in depression and anxiety detection in Twitter, ***** framing ***** the detection task as a binary tweet classification problem. | ||
| N19-1304 We provide an NLP framework to uncover four linguistic dimensions of political polarization in social media: topic choice, ***** framing *****, affect and illocutionary force. | ||
| 2021.econlp-1.11 Our experiments show that our approach achieves an 88.78% accuracy for day trading behavior prediction and reveals ***** framing ***** fluctuations prior to and during the COVID-19 pandemic that could be used to guide investment actions. | ||
| 2021.latechclfl-1.2 On the other hand, positive feelings triggered by smells are prevalent, and contribute to ***** framing ***** travels to Italy as an exciting experience involving all senses | ||
| 2021.emnlp-demo.28 We propose and guide users through a five-step end-to-end computational ***** framing ***** analysis framework grounded in media ***** framing ***** theory in communication research. | ||
| methodology | 19 | |
| 2020.lrec-1.312 The paper describes the alignment ***** methodology ***** used, the evaluation of the alignments, and preliminary experiments on statistical and neural machine translation (SMT and NMT) between Inuktitut and English, in both directions. | ||
| R17-1022 This will be tackled by proposing a semi-automatic ***** methodology *****. | ||
| 2020.globalex-1.3 We discuss the motivation, ***** methodology ***** and modeling strategies of the work, as well as its possible applications and potential future developments. | ||
| L12-1402 The paper is divided in three sections: 1) an introduction dedicated to data extracted from scientific documentation; 2) the second section devoted to ***** methodology ***** and data description; 3) the third part containing a statistical representation of terms extracted from the archive: indexes and concordances allow one to reflect on the use of certain terms in this field and give possible keys for accessing the extraction of knowledge in the digital era. | ||
| 2021.semeval-1.119 The second dimension we explore is ***** methodology *****, including leveraging attention, employing a greedy remove method, using a frequency ratio, and examining hybrid combinations of multiple methods | ||
| discourse coherence | 19 | |
| W16-4408 The work focuses on a double topical construction of dialogue coherence which refers to ***** discourse coherence ***** on two levels: the evolution of dialogue topics via the interaction between the user and the robot system, and the creation of discourse topics via the content of the Wikipedia article itself. | ||
| 2020.aacl-main.67 This paper evaluates the utility of Rhetorical Structure Theory (RST) trees and relations in ***** discourse coherence ***** evaluation. | ||
| 2020.acl-main.439 We propose Conpono, an inter-sentence objective for pretraining language models that models ***** discourse coherence ***** and the distance between sentences. | ||
| 2021.emnlp-main.106 We draw on an insight from ***** discourse coherence ***** theory: potential coreferences are constrained by the reader's discourse focus. | ||
| 2020.codi-1.11 Finally, we make our datasets publicly available as a resource for researchers to use to test ***** discourse coherence ***** models | ||
| distributed | 19 | |
| P18-1223 The semantics from knowledge graphs are integrated in the ***** distributed ***** representations of their entities, while the ranking is conducted by interaction-based neural ranking networks. | ||
| W19-6201 Defining words in a textual context is a useful task both for practical purposes and for gaining insight into ***** distributed ***** word representations. | ||
| S18-1076 The model builds on the ***** distributed ***** tree embedder also known as ***** distributed ***** tree kernel. | ||
| 2011.iwslt-evaluation.2 Accordingly, the final WER was reduced by 30% from the baseline ASR for the ***** distributed ***** test set. | ||
| P17-1163 NBT models reason over pre-trained word vectors, learning to compose them into ***** distributed ***** representations of user utterances and dialogue context | ||
| symbolic | 19 | |
| 2021.emnlp-main.283 In order to achieve fast convergence and interpretability for the policy in RL, we propose a novel RL method for text-based games with a recent neuro-***** symbolic ***** framework called Logical Neural Network, which can learn ***** symbolic ***** and interpretable rules in their differentiable network. | ||
| 2021.naacl-main.274 Motivated by these observations, we propose a novel context-dependent gated module to adaptively control the information flows from the input ***** symbolic ***** features. | ||
| P19-1283 Here we present two methods based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by ***** symbolic ***** structures such as syntax trees. | ||
| 2021.acl-long.456 Along with target expression supervision, our solver is also optimized via 4 new auxiliary objectives to enforce different ***** symbolic ***** reasoning: a) self-supervised number prediction task predicting both number quantity and number locations; b) commonsense constant prediction task predicting what prior knowledge (e.g. how many legs a chicken has) is required; c) program consistency checker computing the semantic loss between predicted equation and target equation to ensure reasonable equation mapping; d) duality exploiting task exploiting the quasi-duality between ***** symbolic ***** equation generation and problem's part-of-speech generation to enhance the understanding ability of a solver. | ||
| 2021.emnlp-main.505 However, existing neural models have been shown to lack this basic ability in learning ***** symbolic ***** structures | ||
| neural approaches | 19 | |
| 2020.nl4xai-1.5 We survey recent papers that integrate traditional NLG submodules in ***** neural approaches ***** and analyse their explainability. | ||
| 2021.eacl-main.55 Non-***** neural approaches ***** to argument mining (AM) are often pipelined and require heavy feature-engineering. | ||
| 2020.acl-main.146 While statistical MT methods have been replaced by ***** neural approaches ***** with superior performance, the twenty-year-old GIZA++ toolkit remains a key component of state-of-the-art word alignment systems. | ||
| 2021.eacl-main.308 With the advancements made by ***** neural approaches ***** in applications such as machine translation (MT), summarization and dialog systems, the need for coherence evaluation of these tasks is now more crucial than ever. | ||
| 2020.coling-industry.21 While ***** neural approaches ***** have achieved significant improvement in machine comprehension tasks, models often work as a black-box, resulting in lower interpretability, which requires special attention in domains such as healthcare or education. | ||
| grammatical errors | 19 | |
| 2014.amta-wptp.1 The PE output was analyzed taking into account accuracy errors (mistranslations and omissions) as well as language errors (***** grammatical errors ***** and syntax errors). | ||
| W17-5907 Detection and correction of Chinese ***** grammatical errors ***** have been two major challenges for Chinese automatic grammatical error diagnosis. This paper presents an N-gram model for automatic detection and correction of Chinese ***** grammatical errors ***** in the NLPTEA 2017 task. | ||
| W18-3710 The goal of this task is to diagnose Chinese sentences containing four kinds of ***** grammatical errors ***** through the model and find out the sentence errors. | ||
| L16-1029 As a result, requirements can contain a relatively large diversity of lexical and ***** grammatical errors *****, which are not eliminated by the use of guidelines from controlled languages. | ||
| 2020.acl-demos.17 This paper presents LinggleWrite, a writing coach that provides writing suggestions, assesses writing proficiency levels, detects ***** grammatical errors *****, and offers corrective feedback in response to user's essay. | ||
| training corpus | 19 | |
| L12-1159 Our results show that the genre of the ***** training corpus ***** does not have a significant effect on summary quality. | ||
| Q18-1031 We propose a new generative language model for sentences that first samples a prototype sentence from the ***** training corpus ***** and then edits it into a new sentence. | ||
| 2021.emnlp-main.617 Second, only the items mentioned in the ***** training corpus ***** have a chance to be recommended in the conversation. | ||
| 2020.emnlp-main.556 Pre-trained language models (LMs) may perpetuate biases originating in their ***** training corpus ***** to downstream models. | ||
| 2011.iwslt-papers.10 Such classifier can be learned from a ***** training corpus ***** that comprises only useful instances. | ||
| verbal multiword | 19 | |
| 2020.mwe-1.20 For our approach, we interpret detecting ***** verbal multiword ***** expressions as a token classification task aiming to decide whether a token is part of a ***** verbal multiword ***** expression or not. | ||
| 2020.mwe-1.17 This paper describes the ERMI system submitted to the closed track of the PARSEME shared task 2020 on automatic identification of ***** verbal multiword ***** expressions (VMWEs). | ||
| W18-4929 In this paper, we describe Mumpitz, the system we submitted to the PARSEME Shared task on automatic identification of ***** verbal multiword ***** expressions (VMWEs). | ||
| W18-4931 This paper describes a system submitted to the closed track of the PARSEME shared task (edition 1.1) on automatic identification of ***** verbal multiword ***** expressions (VMWEs). | ||
| W19-5103 This paper reports on the Romanian journalistic corpus annotated with *****verbal multiword***** expressions following the PARSEME guidelines. | ||
| previous research | 19 | |
| 2021.acl-long.4 We specify 29 model functionalities motivated by a review of ***** previous research ***** and a series of interviews with civil society stakeholders. | ||
| 2021.ranlp-1.140 Our results corroborate ***** previous research ***** on this task in that topic-related features yield better results than style-based ones, although they also highlight the relevance of using higher-length n-grams. | ||
| C18-1045 This paper focuses on macro-level discourse structure analysis, which has been less studied in ***** previous research *****. | ||
| P17-1027 Restricted non-monotonicity has been shown beneficial for the projective arc-eager dependency parser in ***** previous research *****, as posterior decisions can repair mistakes made in previous states due to the lack of information. | ||
| 2020.crac-1.8 This paper critically examines the assumption prevalent in ***** previous research ***** that SNs are typically accompanied by a specific antecedent, arguing that SNs like “issue” and “decision” are frequently used to refer, not to specific antecedents, but to global discourse topics, in which case they are out of reach of previously proposed resolution strategies that are tailored to SNs with explicit antecedents. | ||
| building | 19 | |
| W19-0503 This allows us to conclude that, despite prior claims, truth-theoretic models are good candidates for ***** building ***** graded lexical representations of meaning. | ||
| E17-5002 The technical differences between NMT and the previously dominant phrase-based statistical approach require that practitioners learn new best practices for ***** building ***** MT systems, ranging from different hardware requirements, new techniques for handling rare words and monolingual data, to new opportunities in continued learning and domain adaptation. This tutorial is aimed at researchers and users of machine translation interested in working with NMT. | ||
| 2020.findings-emnlp.171 Finally, simply finetuning this pre-trained QA model into specialized models results in a new state of the art on 10 factoid and commonsense question answering datasets, establishing UNIFIEDQA as a strong starting point for ***** building ***** QA systems. | ||
| L12-1319 The project did not aim at ***** building ***** a full-fledged Construction Grammar, but the registry of English constructions created by this project, which is called Constructicon, provides a representative sample of the current coverage of English constructions (Lee-Goldman & Rhodes 2009). | ||
| L08-1159 Yet, ***** building ***** such models requires appropriate definition of various levels for representing the emotions themselves but also some contextual information such as the events that elicit these emotions. | ||
| corpus analysis | 19 | |
| L14-1523 Such classifiers are built automatically by parallel ***** corpus analysis *****: Creating subcorpora for each translation of a 1:n package, and identifying correlating concepts in these subcorpora as features of the classifier. | ||
| Q15-1021 We back up our arguments with ***** corpus analysis ***** and by highlighting statements that other researchers have made in the simplification literature. | ||
| N18-1181 In a ***** corpus analysis ***** of Mandarin Chinese, we show that the distribution of speaker choices supports the availability-based production account and not the Uniform Information Density. | ||
| E17-4001 I present a general model of the human image description process, and propose to study this process using ***** corpus analysis *****, experiments, and computational modeling. | ||
| L16-1156 We show via ***** corpus analysis ***** that the Generative Lexicon, enhanced in different manners and viewed as both a lexicon and a domain knowledge representation, is a relevant approach. | ||
| dialogue summarization | 19 | |
| 2021.emnlp-main.365 We hope that this study could benchmark Chinese ***** dialogue summarization ***** and benefit further studies. | ||
| 2021.emnlp-main.7 Most existing works for low-resource ***** dialogue summarization ***** directly pretrain models in other domains, e.g., the news domain, but they generally neglect the huge difference between dialogues and conventional articles. | ||
| 2021.newsum-1.12 As part of this process, we also provide an extensive overview of existing ***** dialogue summarization ***** data sets. | ||
| 2021.newsum-1.8 In this paper, we focus on improving the quality of the summary generated by neural abstractive ***** dialogue summarization ***** systems. | ||
| 2021.sigdial-1.53 Therefore, in this work, we investigate different approaches to explicitly incorporate coreference information in neural abstractive ***** dialogue summarization ***** models to tackle the aforementioned challenges. | ||
| language and vision | 19 | |
| 2020.acl-main.495 Neural module networks (NMNs) are a popular approach for modeling compositionality: they achieve high accuracy when applied to problems in ***** language and vision *****, while reflecting the compositional structure of the problem in the network architecture. | ||
| 2020.findings-emnlp.248 However, current approaches often require them to combine redundant information provided by ***** language and vision *****. | ||
| P18-5004 To this end, recent advances at the intersection of ***** language and vision ***** have made incredible progress – from being able to generate natural language descriptions of images/videos, to answering questions about them, to even holding free-form conversations about visual content! | ||
| 2020.findings-emnlp.413 As a community, we have achieved good benchmarks over ***** language and vision ***** domains separately, however joint reasoning is still a challenge for state-of-the-art computer vision and natural language processing (NLP) systems. | ||
| D18-1168 Though moment localization with natural language is similar to other ***** language and vision ***** tasks like natural language object retrieval in images, moment localization offers an interesting opportunity to model temporal dependencies and reasoning in text. | ||
| automatic machine translation | 19 | |
| 2012.amta-papers.24 This paper investigates the usefulness of ***** automatic machine translation ***** metrics when analyzing the impact of source reformulations on the quality of machine-translated user generated content. | ||
| 1999.mtsummit-1.22 This invited talk describes the use of fully ***** automatic machine translation ***** (FAMT) at the Pan American Health Organization. | ||
| 2021.ranlp-1.6 The advancement of the web and information technology has contributed to the rapid growth of digital libraries and ***** automatic machine translation ***** tools which easily translate texts from one language into another. | ||
| L14-1094 The judgments are compared to two ***** automatic machine translation ***** evaluation metrics. | ||
| C16-1172 Recently, the development of neural machine translation (NMT) has significantly improved the translation quality of ***** automatic machine translation *****. | ||
| online social media | 19 | |
| C16-1314 In this paper, we propose a systematic method to leverage user ***** online social media ***** content for predicting offline restaurant consumption level. | ||
| C18-1156 In this paper, we propose a ContextuAl SarCasm DEtector (CASCADE), which adopts a hybrid approach of both content- and context-driven modeling for sarcasm detection in ***** online social media ***** discussions. | ||
| W18-4421 Cyberaggression refers to aggressive online behaviour that aims at harming other individuals, and involves rude, insulting, offensive, teasing or demoralising comments through ***** online social media *****. | ||
| 2020.osact-1.16 There is a need to control and prevent such misuse of ***** online social media ***** through automatic detection of profane language. | ||
| P18-2031 We introduce a new approach to tackle the problem of offensive language in ***** online social media *****. | ||
| recognizing textual | 19 | |
| L10-1469 Many natural language processing tasks, including information extraction, question answering and ***** recognizing textual ***** entailment, require analysis of the polarity, focus of polarity, tense, aspect, mood and source of the event mentions in a text in addition to its predicate-argument structure analysis. | ||
| D19-1340 Here, we investigate the importance that a model assigns to various aspects of data while learning and making predictions, specifically, in a ***** recognizing textual ***** entailment (RTE) task. | ||
| L14-1339 Named entity recognition (NER) is a knowledge-intensive information extraction task that is used for ***** recognizing textual ***** mentions of entities that belong to a predefined set of categories, such as locations, organizations and time expressions. | ||
| 2014.lilt-9.3 From a purely theoretical point of view, it makes sense to approach ***** recognizing textual ***** entailment (RTE) with the help of logic. | ||
| N18-1101 ***** recognizing textual ***** entailment), improving upon available resources in both its coverage and difficulty. | ||
| guidelines | 19 | |
| 2020.coling-main.421 Our method can be used to spot outlier annotators, improve annotation ***** guidelines ***** and provide a better picture of the annotation reliability. | ||
| N18-1055 We further establish ***** guidelines ***** for trustable results in neural GEC and propose a set of model-independent methods for neural GEC that can be easily applied in most GEC settings. | ||
| L16-1247 Still the refinement of the ***** guidelines ***** and methodology is needed in order to re-annotate some syntactic phenomena, e.g. | ||
| 2021.unimplicit-1.3 We derive from our experience a set of evaluation ***** guidelines ***** to reach high inter-annotator agreement on such cases. | ||
| L10-1526 Having established an annotation scenario for capturing semantic relations crossing the sentence boundary in a discourse , and having annotated the first sections of the treebank according to these *****guidelines***** , we report now on the results of the first evaluation of these manual annotations . | ||
| sequence learning | 19 | |
| R19-1119 Self-attentional models are a new paradigm for sequence modelling tasks which differ from common sequence modelling methods, such as recurrence-based and convolution-based ***** sequence learning *****, in the way that their architecture is only based on the attention mechanism. | ||
| D17-1120 To bridge this gap we adopt a different perspective and rely on ***** sequence learning ***** to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional Long Short-Term Memory to encoder-decoder models. | ||
| C16-1130 In this paper, we study WSD with a ***** sequence learning ***** neural net, LSTM, to better capture the sequential and syntactic patterns of the text. | ||
| W18-5020 In this work, we propose a novel approach to NLG using convolutional neural network (CNN) based sequence to ***** sequence learning *****. | ||
| P17-1019 In this paper, we propose an end-to-end question answering system called COREQA in sequence-to-***** sequence learning *****, which incorporates copying and retrieving mechanisms to generate natural answers within an encoder-decoder framework. | ||
| dependency tree | 19 | |
| I17-1007 In this paper, we propose a probabilistic parsing model that defines a proper conditional probability distribution over non-projective ***** dependency tree *****s for a given sentence, using neural representations as inputs. | ||
| P18-2071 Different from widely-used RST-DT and PDTB, SciDTB uses ***** dependency tree *****s to represent discourse structure, which is flexible and simplified to some extent but do not sacrifice structural integrity. | ||
| D19-5901 We pay Turkers to construct unlabeled ***** dependency tree *****s for 500 English sentences using an interactive graphical ***** dependency tree ***** editor, collecting 10 annotations per sentence. | ||
| 2021.latechclfl-1.10 This paper addresses this problem with an unsupervised, rule-based approach for adverbial identification that utilizes ***** dependency tree ***** patterns. | ||
| 2021.eacl-main.170 In this study, we design a directed syntactic dependency graph based on a ***** dependency tree ***** to establish a path from the target to candidate opinions. | ||
| unsupervised text style | 19 | |
| 2021.emnlp-main.730 In this paper, we explore Non-AutoRegressive (NAR) decoding for ***** unsupervised text style ***** transfer. | ||
| 2021.emnlp-main.729 In this paper, we propose a collaborative learning framework for ***** unsupervised text style ***** transfer using a pair of bidirectional decoders, one decoding from left to right while the other decoding from right to left. | ||
| 2020.coling-main.201 In this paper, we propose a novel neural approach to ***** unsupervised text style ***** transfer which we refer to as Cycle-consistent Adversarial autoEncoders (CAE) trained from non-parallel data. | ||
| 2020.acl-main.354 Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English: unconditional text generation, class-conditional text generation, and ***** unsupervised text style ***** transfer. | ||
| 2020.coling-main.191 The prevalent approach for ***** unsupervised text style ***** transfer is disentanglement between content and style. | ||
| learning word | 19 | |
| D18-1057 Most models for ***** learning word ***** embeddings are trained based on the context information of words, more precisely first order co-occurrence relations. | ||
| C16-1149 In this work, we propose an unsupervised method of ***** learning word *****-emotion association from large text corpora, called Selective Co-occurrences (SECO), by leveraging the property of mutual exclusivity generally exhibited by emotions. | ||
| P18-1094 We evaluate our proposal on ***** learning word ***** embeddings, order embeddings and knowledge graph embeddings and observe both faster convergence and improved results on multiple metrics. | ||
| P19-1402 Existing approaches for ***** learning word ***** embedding often assume there are sufficient occurrences for each word in the corpus, such that the representation of words can be accurately estimated from their contexts. | ||
| D17-1100 Corpora of referring expressions paired with their visual referents are a good source for ***** learning word ***** meanings directly grounded in visual representations. | ||
| fusion | 19 | |
| N19-1037 Moreover, we promote the framework to two variants, Hi-GRU with individual features ***** fusion ***** (HiGRU-f) and HiGRU with self-attention and features ***** fusion ***** (HiGRU-sf), so that the word/utterance-level individual inputs and the long-range contextual information can be sufficiently utilized. | ||
| Q15-1023 We attack this con***** fusion ***** by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. | ||
| W16-3714 To this end, we explore the use of distinctive feature weights, lexical tone con***** fusion *****s, and a two-step clustering algorithm to learn projections of phoneme segments from mismatched multilingual transcriber languages to the target language. | ||
| P19-1521 To address this label con***** fusion ***** problem, this paper proposes cost-sensitive regularization, which can force the training procedure to concentrate more on optimizing confusing type pairs. | ||
| 2012.iwslt-evaluation.10 The outputs of the subsystems are combined via con***** fusion ***** network combination. | ||
| importance | 19 | |
| 2021.semeval-1.7 The evaluation results for the third subtask confirmed the ***** importance ***** of both modalities, the text and the image. | ||
| 2012.amta-monomt.1 Obtained results highlight the ***** importance ***** of generalization, and therefore generation, for dealing with out-of-domain data. | ||
| 2021.vardial-1.11 The additional analyses carried out underline the ***** importance ***** of optimization, especially when the measure of effectiveness is the Macro-F1. | ||
| 2020.wmt-1.66 In this paper, we first take a step back and look at the commonly used bilingual corpora (WMT), and resurface the existence and ***** importance ***** of implicit structure that existed in it: multi-way alignment across examples (the same sentence in more than two languages). | ||
| L14-1334 In this paper, we consider the ***** importance ***** of identifying the change of state for events - in particular, clinical events that measure and compare the multiple states of a patient's health across time. | ||
| policy | 19 | |
| D17-1260 Simulation experiments showed that the proposed approach can significantly improve both safetyand efficiency of on-line ***** policy ***** optimization compared to other companion learning approaches as well as supervised pre-training using static dialogue corpus. | ||
| D19-1010 The reward estimator evaluates the state-action pairs so that it can guide the dialog ***** policy ***** at each dialog turn. | ||
| 2020.findings-emnlp.75 Although I2A achieves a higher success rate than baselines by augmenting predicted future into a ***** policy ***** network, its complicated architecture introduces unwanted instability. | ||
| 2021.dialdoc-1.10 We can leverage these signals to generate the weakly supervised training data for learning dialog ***** policy ***** and reward estimator, and make the ***** policy ***** take actions (generates responses) which can foresee the future direction for a successful (rewarding) conversation. | ||
| L08-1171 We can show that, despite a low fit to the initial data, the objective function obtained from WOZ data makes accurate predictions for automatic dialogue evaluation, and, when automatically optimising a ***** policy ***** using these predictions, the improvement over a strategy simply mimicking the data becomes clear from an error analysis. | ||
| unsupervised morphological paradigm | 19 | |
| 2020.acl-main.598 We propose the task of ***** unsupervised morphological paradigm ***** completion. | ||
| 2020.sigmorphon-1.9 In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS–CUBoulder) for SIGMORPHON 2020 Task 2 on ***** unsupervised morphological paradigm ***** completion (Kann et al., 2020). | ||
| 2020.sigmorphon-1.8 We describe the NYU-CUBoulder systems for the SIGMORPHON 2020 Task 0 on typologically diverse morphological inflection and Task 2 on ***** unsupervised morphological paradigm ***** completion. | ||
| 2021.sigmorphon-1.9 This work describes the Edinburgh submission to the SIGMORPHON 2021 Shared Task 2 on ***** unsupervised morphological paradigm ***** clustering. | ||
| 2020.sigmorphon-1.3 This shows that ***** unsupervised morphological paradigm ***** completion is still largely unsolved. | ||
| structured knowledge | 19 | |
| 2021.eacl-main.153 Pretrained language models have been suggested as a possible alternative or complement to ***** structured knowledge ***** bases. | ||
| 2020.knlp-1.2 However, how to integrate ***** structured knowledge ***** into these DNC models remains a challenging research question. | ||
| 2020.inlg-1.44 (2) By comparing these three KGs, we predict a review score and detailed ***** structured knowledge ***** as evidence for each review category. | ||
| 2020.acl-demos.11 We present the first comprehensive, open source multimedia knowledge extraction system that takes a massive stream of unstructured, heterogeneous multimedia data from various sources and languages as input, and creates a coherent, ***** structured knowledge ***** base, indexing entities, relations, and events, following a rich, fine-grained ontology. | ||
| 2020.semeval-1.48 We propose a novel Knowledge-enhanced Graph Attention Network (KEGAT) architecture for this task, leveraging heterogeneous knowledge from both the ***** structured knowledge ***** base (i.e. | ||
| contextual word | 19 | |
| 2020.coling-main.109 The stellar success of ***** contextual word ***** embedding models such as BERT in NLP tasks has led many to question whether these models have learned linguistic information, but up till now, most research has focused on syntactic information. | ||
| P19-1604 We study the effect of several strategies to deal with out-of-vocabulary words such as copy mechanisms, placeholders, and ***** contextual word ***** embeddings. | ||
| N19-1112 To investigate the transferability of ***** contextual word ***** representations, we quantify differences in the transferability of individual layers within contextualizers, especially between recurrent neural networks (RNNs) and transformers. | ||
| D19-3026 The first analyses gender issues in ***** contextual word ***** embeddings. | ||
| 2020.acl-main.422 This paper investigates ***** contextual word ***** representation models from the lens of similarity analysis. | ||
| grounded language | 19 | |
| 2021.acl-srw.8 The impressive performances of pre-trained visually ***** grounded language ***** models have motivated a growing body of research investigating what has been learned during the pre-training. | ||
| W19-1808 Recent work on visually ***** grounded language ***** learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation. | ||
| C16-1124 We present a model of visually-***** grounded language ***** learning based on stacked gated recurrent neural networks which learns to predict visual features given an image description in the form of a sequence of phonemes. | ||
| D18-1411 Previous work on ***** grounded language ***** learning did not fully capture the semantics underlying the correspondences between structured world state representations and texts, especially those between numerical values and lexical terms. | ||
| 2021.naacl-main.348 We investigate *****grounded language***** learning through real-world data, by modelling a teacher-learner dynamics through the natural interactions occurring between users and search engines; in particular, we explore the emergence of semantic generalization from unsupervised dense representations outside of synthetic environments. | ||
| trained | 19 | |
| 2020.coling-main.581 Deep pre-***** trained ***** language models tend to become ubiquitous in the field of Natural Language Processing (NLP). | ||
| 2021.mmsr-1.5 However, a lot of work mainly focused on models ***** trained ***** for uni-modal tasks, e.g. | ||
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually ***** trained ***** models for each language. | ||
| 2020.repl4nlp-1.24 We highlight that on several tasks while such perturbations are natural, state of the art ***** trained ***** models are surprisingly brittle. | ||
| 2021.alta-1.26 Our empirical experiments reveal that these modern pre***** trained ***** language models suffer from high variance, and the ensemble method can improve the model performance. | ||
| similar | 19 | |
| L14-1031 The second method makes use of recent advances in distributional ***** similar *****ity representation to transfer existing norms to their closest neighbors in a high-dimensional vector space. | ||
| P18-2091 This paper presents the first study aimed at capturing stylistic ***** similar *****ity between words in an unsupervised manner. | ||
| 2003.mtsummit-papers.3 Based on these assumptions, new valency entries are constructed from words in a plain bilingual dictionary, using entries with ***** similar ***** source-language meaning and the same target-language translations. | ||
| L16-1072 We have seen that many resources exist which are useful for MT and ***** similar ***** work, but the majority are for (academic) research or educational use only, and as such not available for commercial use. | ||
| E17-2054 We show that a model capitalizing on a `fuzzy' measure of ***** similar *****ity is effective for learning quantifiers, whereas the learning of exact cardinals is better accomplished when information about number is provided. | ||
| computational linguistic | 19 | |
| C18-1272 Such models can provide fertile ground for (cognitive) ***** computational linguistic *****s studies. | ||
| W16-4812 This is the first preliminary study for a dialect that has not been widely studied in ***** computational linguistic *****s, evidencing the possible existence of distinct subdialects. | ||
| N18-5004 We present CL Scholar, the ACL Anthology knowledge graph miner to facilitate high-quality search and exploration of current research progress in the ***** computational linguistic *****s community. | ||
| 2021.eacl-main.312 Automatic detection of the four MBTI personality dimensions from texts has recently attracted noticeable attention from the natural language processing and ***** computational linguistic ***** communities. | ||
| 2021.trustnlp-1.6 We discuss future work that would benefit immensely from a ***** computational linguistic *****s perspective. | ||
| abstractive text | 19 | |
| R19-1146 In this paper we describe how an ***** abstractive text ***** summarization method improved the informativeness of automatic summaries by integrating syntactic text simplification, subject-verb-object concept frequency scoring and a set of rules that transform text into its semantic representation. | ||
| 2020.lrec-1.222 Recently, generative language models have shown promise in ***** abstractive text ***** summarization tasks. | ||
| 2020.nlpbt-1.7 Prior work on multimodal ***** abstractive text ***** summarization only utilized information from the text and video modalities. | ||
| 2020.emnlp-main.34 Unsupervised methods are promising for ***** abstractive text ***** summarization in that the parallel corpora is not required. | ||
| N19-4012 Neural ***** abstractive text ***** summarization (NATS) has received a lot of attention in the past few years from both industry and academia. | ||
| scoring | 19 | |
| 2021.acl-long.96 We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic ***** scoring ***** systems. | ||
| S17-2148 This paper describes our system for fine-grained sentiment ***** scoring ***** of news headlines submitted to SemEval 2017 task 5–subtask 2. | ||
| P19-1390 Nevertheless, progress on dimension-specific essay ***** scoring ***** is limited in part by the lack of annotated corpora. | ||
| 2021.naacl-main.405 Alternatives include energy-based models (which give up efficient sampling) and latent-variable autoregressive models (which give up efficient ***** scoring ***** of a given string). | ||
| S17-2150 It ranked first for both of the subtasks by the scores achieved by an alternate ***** scoring ***** system. | ||
| patent | 19 | |
| L10-1471 To integrate the advantages of both tools, we have been proposing methods for encyclopedic search targeting information on the Web and ***** patent ***** information. | ||
| C16-1113 There are growing needs for ***** patent ***** analysis using Natural Language Processing (NLP)-based approaches. | ||
| 2003.mtsummit-systems.10 In response to growing needs for cross-lingual ***** patent ***** retrieval, we propose PRIME (Patent Retrieval In Multilingual Environment system), in which users can retrieve and browse ***** patent *****s in foreign languages only by their native language. | ||
| 2014.amta-researchers.18 This paper presents a Japanese-to-English statistical machine translation system specialized for *****patent***** translation. | ||
| W16-4612 We participate in scientific paper subtask (ASPEC-EJ/CJ) and patent subtask (JPC-EJ/CJ/KJ) with phrase-based SMT systems which are trained with its own *****patent***** corpora. | ||
| class | 19 | |
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual composition), and a refinement of the verb ***** class *****ifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's ***** class *****ification. | ||
| 2020.wanlp-1.32 In this paper, several techniques with multiple algorithms are applied for Arabic dialects identification starting from removing noise till ***** class *****ification task using all Arabic countries as 21 ***** class *****es. | ||
| 2020.emnlp-main.52 Specifically, we devise two components, prototype enhanced retrospection and hierarchical distillation, to mitigate the adverse effects of semantic ambiguity and ***** class ***** imbalance, respectively. | ||
| 2021.emnlp-main.643 Here, we introduce the application of balancing loss functions for multi-label text ***** class *****ification. | ||
| C16-1249 However, the ***** class *****ification performance greatly suffers when the size of the labeled data is limited. | ||
| foreign | 19 | |
| 2020.wat-1.4 However, it is not practical to translate them all manually into a new ***** foreign ***** language. | ||
| W19-9006 Focus being not just on ***** foreign ***** language tuition, but above all on people, places and events in the history and culture of the EU member states, the annotation modules of the e-Platform have been accordingly extended. | ||
| K17-1025 We present a feature-rich knowledge tracing method that captures a student's acquisition and retention of knowledge during a ***** foreign ***** language phrase learning task. | ||
| D17-1240 Based on this likeliness estimate we asked annotators to re-annotate the language tags of ***** foreign ***** words in predominantly native contexts. | ||
| 2003.mtsummit-systems.10 In response to growing needs for cross-lingual patent retrieval, we propose PRIME (Patent Retrieval In Multilingual Environment system), in which users can retrieve and browse patents in ***** foreign ***** languages only by their native language. | ||
| verbal irony | 19 | |
| S18-1093 In task 2, three types of irony are considered; “Irony by contrast” - ironic instances where evaluative expression portrays inverse polarity (positive, negative) of the literal proposition; “Situational irony” - ironic instances where output of a situation do not comply with its expectation; “Other *****verbal irony*****” - instances where ironic intent does not rely on polarity contrast or unexpected outcome. | ||
| W17-5201 Sarcasm is a form of *****verbal irony***** that is intended to express contempt or ridicule. | ||
| S18-1106 Task A is a classical binary classification task to determine whether a tweet is ironic or not, while Task B is a multiclass classification task devoted to distinguish different types of irony, where systems have to predict one out of four labels describing *****verbal irony***** by clash, other *****verbal irony*****, situational irony, and non-irony. | ||
| L16-1283 In this research, we present the construction of a Twitter dataset for two languages, being English and Dutch, and the development of new guidelines for the annotation of *****verbal irony***** in social media texts. | ||
| 2016.lilt-14.7 *****Verbal irony*****, or sarcasm, presents a significant technical and conceptual challenge when it comes to automatic detection. | ||
| multi-turn | 19 | |
| W19-4102 Conversational machine comprehension (CMC) requires understanding the context of *****multi-turn***** dialogue. | ||
| P17-1046 We study response selection for *****multi-turn***** conversation in retrieval-based chatbots. | ||
| W18-5709 Multimodal search-based dialogue is a challenging new task: It extends visually grounded question answering systems into *****multi-turn***** conversations with access to an external database. | ||
| 2020.sigdial-1.37 There is a growing interest in developing goal-oriented dialog systems which serve users in accomplishing complex tasks through *****multi-turn***** conversations. | ||
| 2021.eancs-1.2 The evaluation of dialogue systems in interaction with simulated users has been proposed to improve turn-level, corpus-based metrics which can only evaluate test cases encountered in a corpus and cannot measure system's ability to sustain *****multi-turn***** interactions. | ||
| slot | 19 | |
| D18-1417 Spoken Language Understanding (SLU), which typically involves intent determination and *****slot***** filling, is a core component of spoken dialogue systems. | ||
| 2021.emnlp-main.620 We present a novel hybrid architecture that augments GPT-2 with representations derived from Graph Attention Networks in such a way to allow causal, sequential prediction of *****slot***** values. | ||
| 2021.naacl-main.200 Challenging problems such as open-domain question answering, fact checking, *****slot***** filling and entity linking require access to large, external knowledge sources. | ||
| 2020.acl-main.128 In this paper, we explore the *****slot***** tagging with only a few labeled support sentences (a.k.a. | ||
| D19-1214 Intent detection and *****slot***** filling are two main tasks for building a spoken language understanding (SLU) system. | ||
| fine-grained | 19 | |
| N18-1051 Target-dependent classification tasks, such as aspect-level sentiment analysis, perform *****fine-grained***** classifications towards specific targets. | ||
| 2021.cl-1.4 We demonstrate how the resultant data set can be used for *****fine-grained***** analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity. | ||
| P19-1490 Grounded in cognitive linguistics, graded lexical entailment (GR-LE) is concerned with *****fine-grained***** assertions regarding the directional hierarchical relationships between concepts on a continuous scale. | ||
| D19-1468 User-generated reviews can be decomposed into *****fine-grained***** segments (e.g., sentences, clauses), each evaluating a different aspect of the principal entity (e.g., price, quality, appearance). | ||
| P17-1192 We present a method for populating *****fine-grained***** classes (e.g., 1950s American jazz musicians) with instances (e.g., Charles Mingus). | ||
| fact | 19 | |
| C18-1283 The recently increased focus on misinformation has stimulated research in *****fact***** checking, the task of assessing the truthfulness of a claim. | ||
| 2021.naacl-main.200 Challenging problems such as open-domain question answering, *****fact***** checking, slot filling and entity linking require access to large, external knowledge sources. | ||
| C18-2002 We introduce INCEpTION, a new annotation platform for tasks including interactive and semantic annotation (e.g., concept linking, *****fact***** linking, knowledge base population, semantic frame annotation). | ||
| 2021.acl-short.51 This work explores a framework for *****fact***** verification that leverages pretrained sequence-to-sequence transformer models for sentence selection and label prediction, two key sub-tasks in fact verification. | ||
| 2021.eacl-main.201 Growing concern with online misinformation has encouraged NLP research on *****fact***** verification. | ||
| document-level | 19 | |
| P17-2062 We propose a new method for extracting pseudo-parallel sentences from a pair of large monolingual corpora, without relying on any *****document-level***** information. | ||
| D19-1233 Rhetorical structure trees have been shown to be useful for several *****document-level***** tasks including summarization and document classification. | ||
| 2020.wmt-1.41 Even though sentence-centric metrics are used widely in machine translation evaluation, *****document-level***** performance is at least equally important for professional usage. | ||
| 2020.acl-main.693 We present Neural Machine Translation (NMT) training using *****document-level***** metrics with batch-level documents. | ||
| E17-1090 While previous research on readability has typically focused on *****document-level***** measures, recent work in areas such as natural language generation has pointed out the need of sentence-level readability measures. | ||
| Raw | 19 | |
| K18-2022 We present SParse, our Graph-Based Parsing model submitted for the CoNLL 2018 Shared Task: Multilingual Parsing from *****Raw***** Text to Universal Dependencies (Zeman et al., 2018). | ||
| K18-2004 This paper describes the ICS PAS system which took part in CoNLL 2018 shared task on Multilingual Parsing from *****Raw***** Text to Universal Dependencies. | ||
| K17-3019 This paper describes our dependency parsing system in CoNLL-2017 shared task on Multilingual Parsing from *****Raw***** Text to Universal Dependencies. | ||
| K18-2017 We introduce NLP-Cube: an end-to-end Natural Language Processing framework, evaluated in CoNLL's Multilingual Parsing from *****Raw***** Text to Universal Dependencies 2018 Shared Task. | ||
| K17-3018 This paper presents RACAI's approach, experiments and results at CONLL 2017 Shared Task: Multilingual Parsing from *****Raw***** Text to Universal Dependencies. | ||
| Statistical Machine Translation (SMT | 19 | |
| 2002.amta-papers.11 Despite the exciting work accomplished over the past decade in the field of *****Statistical Machine Translation (SMT*****), we are still far from the point of being able to say that machine translation fully meets the needs of real-life users. | ||
| C16-1295 Although more additional corpora are now available for *****Statistical Machine Translation (SMT*****), only the ones which belong to the same or similar domains of the original corpus can indeed enhance SMT performance directly. | ||
| 2010.amta-papers.19 With the steadily increasing demand for high-quality translation, the localisation industry is constantly searching for technologies that would increase translator throughput, in particular focusing on the use of high-quality *****Statistical Machine Translation (SMT*****) supplementing the established Translation Memory (TM) technology. | ||
| L14-1575 In *****Statistical Machine Translation (SMT*****), the constraints on word reorderings have a great impact on the set of potential translations that are explored. | ||
| L12-1220 Wikipedia articles in different languages have been mined to support various tasks, such as Cross-Language Information Retrieval (CLIR) and *****Statistical Machine Translation (SMT*****). | ||
| distant | 19 | |
| N18-1075 We study the problem of textual relation embedding with *****distant***** supervision. | ||
| 2020.acl-main.579 Recent neural models for relation extraction with *****distant***** supervision alleviate the impact of irrelevant sentences in a bag by learning importance weights for the sentences. | ||
| N19-1294 Recently, *****distant***** supervision has gained great success on Fine-grained Entity Typing (FET). | ||
| 2021.emnlp-main.839 Distantly supervised named entity recognition (DS-NER) efficiently reduces labor costs but meanwhile intrinsically suffers from the label noise due to the strong assumption of *****distant***** supervision. | ||
| P19-1246 Inspired by Labov's seminal work on stylistic variation as a function of social stratification, we develop and compare neural models that predict a person's presumed socio-economic status, obtained through *****distant***** supervision, from their writing style on social media. | ||
| Realization Shared | 18 | |
| D19-6307 This study describes the approach developed by the Tilburg University team to the shallow track of the Multilingual Surface ***** Realization Shared ***** Task 2019 (SR'19) (Mille et al., 2019). | ||
| D19-6310 The Multilingual Surface ***** Realization Shared ***** Task 2019 focuses on generating sentences from lemmatized sets of universal dependency parses with rich features. | ||
| D19-6306 We introduce the IMS contribution to the Surface ***** Realization Shared ***** Task 2019. | ||
| W18-3602 Surface ***** Realization Shared ***** Task 2018 is a workshop on generating sentences from lemmatized sets of dependency triples. | ||
| 2020.msr-1.3 In this paper, we describe the ADAPT submission to the Surface ***** Realization Shared ***** Task 2020. | ||
| integrating | 18 | |
| C16-2028 TextPro-AL is a web-based application ***** integrating ***** four components: a machine learning based NLP pipeline, an annotation editor for task definition and text annotations, an incremental re-training procedure based on active learning selection from a large pool of unannotated data, and a graphical visualization of the learning status of the system. | ||
| L06-1319 The LIMSI team is working towards the definition of a coding scheme ***** integrating ***** emotion, context and multimodal annotations. | ||
| L10-1252 In order to overcome the fragmentation that affects the field of Language Resources and Technologies, an Open and Distributed Resource Infrastructure is the necessary step for building on each other achievements, ***** integrating ***** resources and technologies and avoiding dispersed or conflicting efforts. | ||
| L06-1074 The DAM-LR project aims at virtually ***** integrating ***** various European language resource archives that allow users to navigate and operate in a single unified domain of language resources. | ||
| 2020.coling-main.373 We present a two-level annotation scheme for modality that captures both content and intent, ***** integrating ***** a logic-based, semantic representation and a task-oriented, pragmatic representation that maps to our robot's capabilities | ||
| cQA | 18 | |
| E17-2115 The experiments of our model on a SemEval challenge dataset for ***** cQA ***** show a 20% of relative improvement over standard DNNs. | ||
| S19-2204 In this task, we aim to identify factual questions posted on ***** cQA ***** and verify the veracity of answers to these questions. | ||
| C16-1237 Our supervised models for text selection boost the performance of a tree kernel-based machine learning model, allowing it to overtake the current state of the art on a recently released ***** cQA ***** evaluation framework. | ||
| I17-2075 The automation of tasks in community question answering (***** cQA *****) is dominated by machine learning approaches, whose performance is often limited by the number of training examples. | ||
| W18-6119 While there has recently been a lot of work on solving this problem using deep learning models applied to question/answer text, this work has not looked at how to make use of the rich metadata available in ***** cQA ***** forums | ||
| bigrams | 18 | |
| S17-2149 Amongst the methods examined, unigrams and ***** bigrams ***** coupled with simple linear regression obtained the best baseline accuracy. | ||
| L14-1110 Using a Naive Bayes document classification approach based on words, stem ***** bigrams ***** and MeSH descriptors we achieve a macro-average F-score of 61% on a subset of 8 action terms. | ||
| W18-3928 Discriminating between these two languages turned out to be a very hard task, not only for a machine: human performance is only around 0.51 F1 score; our best system is still a simple Naive Bayes model with word unigrams and ***** bigrams *****. | ||
| W18-6206 The baseline (Max-Ent bag of words and ***** bigrams *****) obtains an F1 score of 60 % which was available to the participants during the development phase. | ||
| 2020.lrec-1.328 Moreover, we compiled sentiment lexicons of positive and negative unigrams and ***** bigrams ***** reflecting the code-switches present in the language | ||
| Ablation | 18 | |
| 2021.sigdial-1.52 ***** Ablation ***** experiments are also presented to demonstrate the efficacy of SAM. | ||
| D18-1118 ***** Ablation ***** studies confirm the effectiveness of CMM to comprehend natural language logics under the guidance of images. | ||
| 2020.emnlp-main.515 ***** Ablation ***** studies on different subgraphs and a case study about attribute types further demonstrate the effectiveness of our method. | ||
| 2021.naacl-main.193 ***** Ablation ***** studies on pre-training and downstream tasks show that adding dense captions and constrained attention loss help improve the model performance. | ||
| P19-1234 ***** Ablation ***** further shows a positive effect of normalizing flow, context embeddings and proposed regularizers | ||
| Parallel corpora | 18 | |
| L10-1019 ***** Parallel corpora ***** are indispensable resources for a variety of multilingual natural language processing tasks. | ||
| P17-2094 ***** Parallel corpora ***** are widely used in a variety of Natural Language Processing tasks, from Machine Translation to cross-lingual Word Sense Disambiguation, where parallel sentences can be exploited to automatically generate high-quality sense annotations on a large scale. | ||
| 2020.acl-srw.25 ***** Parallel corpora ***** are key to developing good machine translation systems. | ||
| 1998.amta-papers.7 ***** Parallel corpora ***** are a valuable resource for machine translation, but at present their availability and utility is limited by genre- and domain-specificity, licensing restrictions, and the basic difficulty of locating parallel texts in all but the most dominant of the world's languages. | ||
| W17-3209 ***** Parallel corpora ***** are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in curated corpora routinely used for training and evaluation | ||
| ca | 18 | |
| W17-2707 We present an approach at identifying a specific class of events, movement action events (MAEs), in a data set that consists of ***** ca *****. | ||
| L16-1507 12 students were recruited for the annotation campaign of ***** ca *****. | ||
| L08-1512 670 minutes long with a vocabulary of ***** ca *****. | ||
| S19-2084 For this reason, it is important to develop systems ***** ca *****- | ||
| L12-1083 The paper describes a rule-based system for tagging clause boundaries, implemented for annotating the Estonian Reference Corpus of the University of Tartu, a collection of written texts containing ***** ca ***** 245 million running words and available for querying via Keeleveeb language portal | ||
| 10k | 18 | |
| 2021.emnlp-main.582 To do the same for AI systems, we present two datasets: 1) A collection of 1k real-world FPs sourced from quizzes and olympiads; and 2) a bank of ***** 10k ***** synthetic FPs of intermediate complexity to serve as a sandbox for the harder real-world challenge. | ||
| 2021.winlp-1.2 In this paper we are introducing a larger annotated dataset composed of approximately ***** 10k ***** of comments. | ||
| 2020.semeval-1.230 To facilitate better representation learning, we also collect a corpus of ***** 10k ***** news articles, and use it for fine-tuning the model. | ||
| L10-1174 Using transcriptions from the House of Parliament debates and ***** 10k ***** words from news reports, we examine the reality of MND variants in written transcripts of speech. | ||
| N19-1053 Our dataset contains 1k claims, accompanied with pools of ***** 10k ***** and 8k perspective sentences and evidence paragraphs, respectively | ||
| adapters | 18 | |
| 2021.eacl-main.39 We then combine the ***** adapters ***** in a separate knowledge composition step. | ||
| 2021.emnlp-main.383 In this paper, we proposed Mixture-of-Partitions (MoP), an infusion approach that can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight ***** adapters *****. | ||
| 2021.naacl-industry.18 To this end, we propose a multi-lingual multi-task continual learning framework, with auxiliary tasks and language ***** adapters ***** to train universal language representation across regions. | ||
| 2020.emnlp-main.180 This approach enables to learn ***** adapters ***** via language embeddings while sharing model parameters across languages. | ||
| 2021.wmt-1.64 In this work we study the compositionality of language and domain ***** adapters ***** in the context of Machine Translation | ||
| Valence | 18 | |
| 2021.rocling-1.51 ***** Valence ***** represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. | ||
| I17-4002 ***** Valence ***** represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. | ||
| I17-4015 The Mean Absolute Error (MAE) and Pearson's Correlation Coefficient (PCC) for ***** Valence ***** are 0.723 and 0.835, respectively, and those for Arousal are 0.914 and 0.756, respectively. | ||
| I17-4013 This IJCNLP2017-Task2 competition seeks to automatically calculate ***** Valence ***** and Arousal ratings within the hierarchies of vocabulary and phrases in Chinese | ||
| L06-1164 *****Valence***** dictionaries are dictionaries in which logical predicates (most of the time verbs) are inventoried alongside the semantic and syntactic information regarding the role of the arguments with which they combine, as well as the syntactic restrictions these arguments have to obey. | ||
| subtitle | 18 | |
| L16-1436 Firstly, the parallel ***** subtitle ***** data and its corresponding monolingual movie script data are crawled and collected from Internet. | ||
| L12-1027 Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase efficiency of ***** subtitle ***** production process. | ||
| L14-1335 Alignment accuracy results are reported at phoneme, word and ***** subtitle ***** level. | ||
| 2021.iwslt-1.30 Compared to a baseline with unannotated training, this architecture increased the BLEU score of German to English film ***** subtitle ***** translation outputs by 1.61 points using named entity tags; however, the BLEU score decreased by 0.38 points using part-of-speech tags | ||
| W18-6109 We perform automatic paraphrase detection on *****subtitle***** data from the Opusparcus corpus comprising six European languages: German, English, Finnish, French, Russian, and Swedish. | ||
| LRE | 18 | |
| L12-1450 In this paper we describe the ***** LRE ***** Map, reporting statistics on resources associated with ***** LRE *****C2012 papers and providing comparisons with ***** LRE *****C2010 data. | ||
| L12-1264 KALAKA-2 was created to support the Albayzin 2010 Language Recognition Evaluation (***** LRE *****), organized by the Spanish Network on Speech Technologies from June to November 2010. | ||
| L14-1612 This paper describes a serialization of the ***** LRE ***** Map database according to the RDF model. | ||
| 1993.eamt-1.8 Attention will be focused on the plans, work in progress, and a few preliminary results of the ***** LRE ***** project EAGLES (Expert Advisory Group on Language Engineering Standards) | ||
| L16-1716 In this paper we describe the new developments brought to *****LRE***** Map, especially in terms of the user interface of the Web application, of the searching of the information therein, and of the data model updates. | ||
| simplified | 18 | |
| L16-1361 The program has an option to convert ***** simplified ***** forms of phrases into correct phrases in the nominal case. | ||
| W19-2402 They either take character identification for granted (e.g., using simple heuristics on referring expressions), or rely on ***** simplified ***** definitions that do not capture important distinctions between characters and other referents in the story. | ||
| 2020.acl-main.302 In the deep learning (DL) era, parsing models are extremely ***** simplified ***** with little hurt on performance, thanks to the remarkable capability of multi-layer BiLSTMs in context representation. | ||
| C18-1039 However, a valid ***** simplified ***** sentence should also be logically entailed by its input sentence. | ||
| W18-5456 In this submission I report work in progress on learning ***** simplified ***** interpreted languages by means of recurrent models | ||
| Pearson | 18 | |
| S18-1058 Our system ranks 32nd out of 48 participants with a ***** Pearson ***** score of 0.557 in the first subtask, and 20th out of 35 in the fifth subtask with an accuracy score of 0.464. | ||
| N18-1104 This method highly correlates with the gold standard evaluation, obtaining a ***** Pearson ***** correlation coefficient of 0.95. | ||
| S18-1045 We evaluate the effectiveness of our ensemble feature sets on the SemEval-2018 Task 1 datasets and achieve a ***** Pearson ***** correlation of 72% on the task of tweet emotion intensity prediction. | ||
| 2021.emnlp-main.574 Specifically, ValNorm achieves a ***** Pearson ***** correlation of r=0.88 for human judgment scores of valence for 399 words collected to establish pleasantness norms in English. | ||
| S17-2013 The best run out of three submitted runs of our model achieved a ***** Pearson ***** correlation score of 0.8004 compared to a hidden human annotation of 250 pairs | ||
| spatiotemporal | 18 | |
| L08-1561 In order to meet these desiderata we need the MiniSTEx system to be able to draw the conclusions human readers would also draw, e.g. based on their (***** spatiotemporal *****) world knowledge, i.e. the common knowledge such readers share. | ||
| W17-2624 Thus, our work is the first to provide both intrinsic (qualitative) and extrinsic (quantitative) evaluation of text representations for ***** spatiotemporal ***** trends. | ||
| L12-1663 In this paper, we describe the methodology being used to develop certain aspects of ISO-Space, an annotation language for encoding spatial and ***** spatiotemporal ***** information as expressed in natural language text. | ||
| L10-1143 The STEVIN-funded SoNaR project aims to produce a diverse 500-million-word reference corpus of written Dutch, with four semantic annotation layers: named entities, coreference relations, semantic roles and ***** spatiotemporal ***** expressions | ||
| L08-1463 We present a new coding mechanism, *****spatiotemporal***** coding, that allows coders to annotate points and regions in the video frame by drawing directly on the screen. | ||
| EmoContext | 18 | |
| S19-2028 In this paper, I describe a fusion model combining contextualized and static word representations for approaching the ***** EmoContext ***** task in the SemEval 2019 competition. | ||
| S19-2046 In this paper, we present our system submission for the ***** EmoContext *****, the third task of the SemEval 2019 workshop. | ||
| S19-2006 This paper describes the system submitted by ANA Team for the SemEval-2019 Task 3: ***** EmoContext *****. | ||
| S19-2056 The approach we proposed for the ***** EmoContext ***** task is based on the combination of a CNN and an LSTM using a concatenation of word embeddings. | ||
| S19-2032 Task 3, ***** EmoContext *****, in the International Workshop SemEval 2019 provides training and testing datasets for the participant teams to detect emotion classes (Happy, Sad, Angry, or Others) | ||
| tags | 18 | |
| L10-1325 For all ***** tags ***** options in a certain position in a sentence, we normalize P(t) in HMM and MEM separately. | ||
| W19-3607 We introduced two models; ***** tags ***** and words and linear interpolation that use part of speech tag information in addition to word n-grams in order to maximize the likelihood of syntactic appropriateness of the suggestions. | ||
| L14-1088 We look for ***** tags ***** that reflect a goal mention, reward, or a perception of control. | ||
| P19-1014 Yet, the relations between all ***** tags ***** are provided in a tag hierarchy, covering the test ***** tags ***** as a combination of training ***** tags *****. | ||
| C16-2010 In this study we develop a system that ***** tags ***** and extracts financial concepts called financial named entities (FNE) along with corresponding numeric values – monetary and temporal | ||
| adapter | 18 | |
| P19-1616 In this paper, we propose a simple mapping method, named representation ***** adapter *****, to learn the representation mapping for both seen and unseen relations based on previously learned relation embedding. | ||
| 2021.acl-long.47 State-of-the-art parameter-efficient fine-tuning methods rely on introducing ***** adapter ***** modules between the layers of a pretrained language model. | ||
| D19-1165 Our proposed approach consists of injecting tiny task specific ***** adapter ***** layers into a pre-trained model. | ||
| 2021.acl-short.103 Experiments show that ***** adapter ***** tuning offer competitive results to full fine-tuning, while being much more parameter-efficient. | ||
| 2020.emnlp-main.180 To address this, we propose a novel multilingual task adaptation approach based on contextual parameter generation and ***** adapter ***** modules | ||
| Sorani | 18 | |
| L16-1529 In ***** Sorani ***** Kurdish, one of the most useful orthographic features in named-entity recognition – capitalization – is absent, as the language's Perso-Arabic script does not make a distinction between uppercase and lowercase letters. | ||
| 2020.loresmt-1.12 Therefore, in this paper, we are addressing the main issues in creating a machine translation system for the Kurdish language, with a focus on the ***** Sorani ***** dialect. | ||
| 2020.signlang-1.19 This paper reports on a project which aims to develop the necessary data and tools to process the Sign language for ***** Sorani ***** as one of the spoken Kurdish dialects. | ||
| 2020.vardial-1.11 In this paper, as a preliminary study of its kind, we propose an approach for the tokenization of the ***** Sorani ***** and Kurmanji dialects of Kurdish using a lexicon and a morphological analyzer | ||
| C16-1095 This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, *****Sorani***** Kurdish and Tajik, as a part of a pilot study of Linguistic Rapid Response to potential emergency humanitarian relief situations. | ||
| SDP | 18 | |
| N19-1298 To extract the relationship between two entities in a sentence, two common approaches are (1) using their shortest dependency path (***** SDP *****) and (2) using an attention model to capture a context-based representation of the sentence. | ||
| P18-1035 In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, ***** SDP ***** and Universal Dependencies (UD) parsing as auxiliary tasks. | ||
| 2020.sdp-1.37 The Scholarly Document Processing (***** SDP *****) workshop is to encourage more efforts on natural language understanding of scientific task. | ||
| D19-1392 Experiments separately conducted on three broad-coverage semantic parsing tasks – AMR, ***** SDP ***** and UCCA – demonstrate that our attention-based neural transducer improves the state of the art on both AMR and UCCA, and is competitive with the state of the art on ***** SDP *****. | ||
| 2020.sdp-1.13 We introduce SciWING, an open-source soft-ware toolkit which provides access to state-of-the-art pre-trained models for scientific document processing (***** SDP *****) tasks, such as citation string parsing, logical structure recovery and citation intent classification | ||
| homographic | 18 | |
| S17-2075 This paper describes the participation of ELiRF-UPV team at task 7 (subtask 2: ***** homographic ***** pun detection and subtask 3: ***** homographic ***** pun interpretation) of SemEval2017. | ||
| D18-1272 In this work, we first use WordNet to understand and expand word embedding for settling the polysemy of ***** homographic ***** puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA) which combined with the context weights for recognizing the puns. | ||
| S17-2011 Our system achieved f-score of calculating, 0.663, and 0.07 in ***** homographic ***** puns and 0.8439, 0.6631, and 0.0806 in heterographic puns in task 1, task 2, and task 3 respectively. | ||
| L06-1349 The main difficulty is to disambiguate these place names by distinguishing places from persons and by selecting the most likely place out of a list of ***** homographic ***** place names world-wide | ||
| N19-1217 A pun is a form of wordplay for an intended humorous or rhetorical effect, where a word suggests two or more meanings by exploiting polysemy (*****homographic***** pun) or phonological similarity to another word (heterographic pun). | ||
| clarification | 18 | |
| D19-1172 The ability to ask ***** clarification ***** questions is essential for knowledge-based question answering (KBQA) systems, especially for handling ambiguous phenomena. | ||
| D18-1233 It is further complicated due to the fact that, in practice, most questions are underspecified, and a human assistant will regularly have to ask ***** clarification ***** questions such as “How long have you been working abroad?” | ||
| 2021.sigdial-1.32 The most indicative changes of this latter kind tend to be associated with relatively rare dialogue acts (DAs), such as those involved in ***** clarification ***** exchanges and responses to particular kinds of questions. | ||
| 2021.naacl-main.320 This paper frames dialogue ***** clarification ***** mechanisms as an understudied research problem and a key missing piece in the giant jigsaw puzzle of natural language understanding. | ||
| 2021.splurobonlp-1.6 Our experimental results show that a combination of entropy-based uncertainty detection and beam search, together with multi-source training on ***** clarification ***** question, initial parse, and user answer, results in improvements of 1.2% F1 score on a parser that already performs at 90.26% on the NLMaps dataset for OSM semantic parsing | ||
| tagged | 18 | |
| W18-3503 In this work, we present a unique language ***** tagged ***** and POS-***** tagged ***** dataset of code-mixed English-Hindi tweets related to five incidents in India that led to a lot of Twitter activity. | ||
| L10-1464 This paper introduces a new corpus of consulting dialogues designed for training a dialogue manager that can handle consulting dialogues through spontaneous interactions from the ***** tagged ***** dialogue corpus. | ||
| 2005.mtsummit-ebmt.16 If a match is obtained, the ***** tagged ***** headline in Bengali is retrieved from the example base, the output Bengali headline is generated after retrieving the Bengali equivalents of the English words from appropriate dictionaries and then applying relevant synthesis rules for generating the Bengali surface level words. | ||
| L10-1414 In this paper we exploit a corpus of political discourses collected from various Web sources, ***** tagged ***** with audience reactions, such as applause, as indicators of persuasive expressions. | ||
| 1995.iwpt-1.22 We describe and evaluate experimentally a method to parse a *****tagged***** corpus without grammar, modeling a natural language on context-free language. | ||
| oriented | 18 | |
| K19-1071 Task ***** oriented ***** language understanding (LU) in human-to-machine (H2M) conversations has been extensively studied for personal digital assistants. | ||
| 2021.mmtlrl-1.7 In this paper, We explore the effectiveness of different state-of-the-art MNMT methods, which use various data ***** oriented ***** techniques including multimodal pre-training, for low resource languages. | ||
| L10-1382 MorphoPro is the morphological component of TextPro, a suite of tools ***** oriented ***** towards a number of NLP tasks. | ||
| D19-5014 Twitter shows state-sponsored examples designed to maximize division occurring across political lines, ranging from “Obama calls me a clinger, Hillary calls me deplorable, ... and Trump calls me an American” ***** oriented ***** to the political right, to Russian propaganda featuring “Black Lives Matter” material with suggestions of institutional racism in US police forces ***** oriented ***** to the political left | ||
| L12-1062 The chat transcripts of interactions in VWs pose unique opportunities and challenges for language analysis: Firstly, the language of the transcripts is very brief, informal, and task-*****oriented*****. | ||
| learned | 18 | |
| 2021.emnlp-main.77 This paper describes a compact and effective model for low-latency passage retrieval in conversational search based on ***** learned ***** dense representations. | ||
| 2021.sigtyp-1.4 CLAN differs from prior work in that it allows the adversarial training to be conditioned on both ***** learned ***** features and the sentiment prediction, to increase discriminativity for ***** learned ***** representation in the cross-lingual setting. | ||
| 2021.mrl-1.12 We present Mr. TyDi, a multi-lingual benchmark dataset for mono-lingual retrieval in eleven typologically diverse languages, designed to evaluate ranking with ***** learned ***** dense representations. | ||
| L08-1308 The first two strategies stay with the same corpus but try to extract new similar relations with ***** learned ***** rules. | ||
| 2020.coling-main.50 The classic deep learning paradigm learns a model from the training data of a single task and the ***** learned ***** model is also tested on the same task | ||
| literal | 18 | |
| W18-1404 Prior methodologies for understanding spatial language have treated ***** literal ***** expressions such as “Mary pushed the car over the edge” differently from metaphorical extensions such as “Mary's job pushed her over the edge”. | ||
| L14-1113 New annotation guidelines and new processing methods were developed to accommodate English treebank annotation of a parallel English/Chinese corpus of web data that includes alternate English translations (one fluent, one ***** literal *****) of expressions that are idiomatic in the Chinese source. | ||
| W18-5047 Here, substructures within ***** literal ***** concept definition are investigated to reveal the relationship between concepts. | ||
| 2020.lrec-1.719 The first one is FigAN and consists of isolated phrases which are divided into three types: phrases with only ***** literal ***** meaning, with only metaphorical meaning, and phrases which can be interpreted as ***** literal ***** or metaphorical ones depending on a context of use. | ||
| L10-1407 Based on the analysis of 100 nods drawn from the SSPNet corpus of TV political debates, a typology of nods is presented that distinguishes Speakers, Interlocutors and Third Listeners nods, with their subtypes (confirmation, agreement, approval, submission and permission, greeting and thanks, backchannel giving and backchannel request, emphasis, ironic agreement, ***** literal ***** and rhetoric question, and others) | ||
| linguistically annotated | 18 | |
| L06-1216 In this paper we present on-going investigations on how complex syntactic annotation, combined with linguistic semantics, can possibly help in supporting the semi-automatic building of (shallow) ontologies from text by proposing an automated extraction of (possibly underspecified) semantic relations from ***** linguistically annotated ***** text. | ||
| L14-1502 One of the challenges of corpus querying is making sense of the results of a query, especially when a large number of results and ***** linguistically annotated ***** data are concerned. | ||
| L08-1469 A new, ***** linguistically annotated *****, video database for automatic sign language recognition is presented. | ||
| L12-1123 We present results that are based on the use of a dataset containing 330 sentences from videos that were collected and ***** linguistically annotated ***** at Boston University. | ||
| 2021.vardial-1.5 As no ***** linguistically annotated ***** Scots data were available, we manually PoS tagged a small set that is used for evaluation and training | ||
| Historical | 18 | |
| 2021.latechclfl-1.13 In this paper, we conducted BAHP: a benchmark of assessing word embeddings in ***** Historical ***** Portuguese, which contains four types of tests: analogy, similarity, outlier detection, and coherence. | ||
| C16-1088 *****Historical***** texts are challenging for natural language processing because they differ linguistically from modern texts and because of their lack of orthographical and grammatical standardisation. | ||
| W19-4734 *****Historical***** change typically is the result of complex interactions between several linguistic factors. | ||
| 2021.nodalida-main.24 *****Historical***** corpora are known to contain errors introduced by OCR (optical character recognition) methods used in the digitization process, often said to be degrading the performance of NLP systems. | ||
| W16-4009 *****Historical***** treebanks tend to be manually annotated, which is not surprising, since state-of-the-art parsers are not accurate enough to ensure high-quality annotation for historical texts. | ||
| rumour | 18 | |
| S19-2147 There were two concrete tasks; ***** rumour ***** stance prediction and ***** rumour ***** verification, which we present in detail along with results achieved by participants. | ||
| D19-3020 A significant challenge is how to keep ***** rumour ***** analysis tools up-to-date as new information becomes available for particular ***** rumour *****s that spread in a social network. | ||
| 2020.findings-emnlp.365 Notably, our dataset aligns with an existing Twitter SD dataset: their union thus addresses a key shortcoming of previous works, by providing the first dedicated resource to study multi-genre SD as well as the interplay of signals from social media and news sources in ***** rumour ***** verification. | ||
| C18-1288 Automatic resolution of ***** rumour *****s is a challenging task that can be broken down into smaller components that make up a pipeline, including ***** rumour ***** detection, ***** rumour ***** tracking and stance classification, leading to the final outcome of determining the veracity of a ***** rumour ***** | ||
| S17-2086 This paper describes our submissions to task 8 in SemEval 2017, i.e., Determining *****rumour***** veracity and support for rumours. | ||
| parsed corpus | 18 | |
| L14-1138 In this paper, we describe Refractive, an open-source tool to extract propositions from a ***** parsed corpus ***** based on the Hadoop variant of MapReduce. | ||
| 2021.repl4nlp-1.22 Learning SP has generally been seen as a supervised task, because it requires a ***** parsed corpus ***** as a source of syntactically related word pairs. | ||
| 2001.mtsummit-papers.5 This paper describes a system for finding phrasal translation correspondences from parallel ***** parsed corpus ***** that are collections of paired English and Japanese sentences. | ||
| L14-1661 We show that even if we only have a relatively small ***** parsed corpus ***** of one language, namely 53,000 words of Faroese, we can obtain better results by adding information about phrase structure from a closely related language which has a similar syntax. | ||
| W17-6302 In this paper, we present an approach to improve the accuracy of a strong transition-based dependency parser by exploiting dependency language models that are extracted from a large ***** parsed corpus ***** | ||
| situated | 18 | |
| 2021.sigdial-1.37 We describe the corpus data and a corresponding annotation scheme to offer insight into the form and content of questions that humans ask to facilitate learning in a ***** situated ***** environment. | ||
| 2021.emnlp-main.85 It provides information that captures partners' beliefs of the world and of each other as an interaction unfolds, bringing abundant opportunities to study human collaborative behaviors in ***** situated ***** language communication. | ||
| 2021.mmsr-1.7 With this paper, we intend to start a discussion on the annotation of referential phenomena in ***** situated ***** dialogue. | ||
| W18-5014 When interacting with robots in a ***** situated ***** spoken dialogue setting, human dialogue partners tend to assign anthropomorphic and social characteristics to those robots | ||
| W18-5010 We present a modular, end-to-end dialogue system for a ***** situated ***** agent to address a multimodal, natural language dialogue task in which the agent learns complex representations of block structure classes through assertions, demonstrations, and questioning. | ||
| syntactically annotated corpora | 18 | |
| L12-1412 In this paper, we present an algorithm for graph matching that is tailored to the properties of large, ***** syntactically annotated corpora *****. | ||
| W17-8102 There are few morphologically and ***** syntactically annotated corpora ***** for Romanian, and those existing or in progress only deal with the Contemporary Romanian standard. | ||
| L12-1579 However, most of the world's languages do not have large amounts of ***** syntactically annotated corpora ***** available for building parsers. | ||
| L08-1213 This paper addresses the question how to compare ***** syntactically annotated corpora ***** and gain insights into the usefulness of specific design decisions. | ||
| L06-1180 However, it is fairly useful for many purposes: parsing evaluation, researching methods for truly combining different parsing outputs to reach better parsing performances, and building larger ***** syntactically annotated corpora ***** for data-driven approaches | ||
| shallow | 18 | |
| D18-1194 We present: (1) a form of decompositional semantic analysis designed to allow systems to target varying levels of structural complexity (***** shallow ***** to deep analysis), (2) an evaluation metric to measure the similarity between system output and reference semantic analysis, (3) an end-to-end model with a novel annotating mechanism that supports intra-sentential coreference, and (4) an evaluation dataset on which our model outperforms strong baselines by at least 1.75 F1 score. | ||
| D19-1305 We propose a modular approach to surface realisation which models each of these components separately, and evaluate our approach on the 10 languages covered by the SR'18 Surface Realisation Shared Task ***** shallow ***** track. | ||
| D18-1338 While current state-of-the-art NMT models, such as RNN seq2seq and Transformers, possess a large number of parameters, they are still ***** shallow ***** in comparison to convolutional models used for both text and vision applications. | ||
| L12-1349 Many state-of-the-art treebank-based probabilistic parsing approaches are scalable and robust but often also ***** shallow *****: they do not capture LDDs and represent only local information. | ||
| 2018.gwc-1.13 After constructing a new data set on the exams and doing ***** shallow ***** experiments on it, we now employ the OpenWordnet-PT to verify whether using word senses and relations we can improve previous results | ||
| Fact | 18 | |
| E17-3010 In this paper we present our automated fact checking system demonstration which we developed in order to participate in the Fast and Furious ***** Fact ***** Check challenge. | ||
| S19-2149 We present SemEval-2019 Task 8 on ***** Fact ***** Checking in Community Question Answering Forums, which features two subtasks. | ||
| 2020.acl-main.549 ***** Fact ***** checking is a challenging task because verifying the truthfulness of a claim requires reasoning about multiple retrievable evidence. | ||
| 2021.emnlp-main.301 Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and ***** Fact ***** Verification. | ||
| 2020.acl-main.655 ***** Fact ***** Verification requires fine-grained natural language inference capability that finds subtle clues to identify the syntactically and semantically correct but not well-supported claims. | ||
| bilingual lexicons | 18 | |
| W16-4508 And, in this paper, we propose to use word-to-word translations to learn morph-units (comprising of bilingual stems and suffixes) from those ***** bilingual lexicons ***** for two language pairs L1-L2 and L1-L3 to induce a bilingual lexicon for the language pair L2-L3, apart from also learning morph-units for this other language pair. | ||
| P19-1018 We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS) — a semi-supervised approach that relaxes the isometric assumption while leveraging both limited aligned ***** bilingual lexicons ***** and a larger set of unaligned word embeddings, as well as a novel hubness filtering technique. | ||
| E17-1088 Even in cases where text data is missing, there are some languages for which ***** bilingual lexicons ***** are available, since creating lexicons is a fundamental task of documentary linguistics. | ||
| 2002.amta-papers.7 In the IBM LMT Machine Translation (MT) system, a built-in strategy provides lexical coverage of a particular subset of words that are not listed in its ***** bilingual lexicons ***** | ||
| L12-1400 We illustrate the use of such a trilingual resource for automatic induction of ***** bilingual lexicons *****, which is a real challenge for under-represented languages. | ||
| bilingual corpus | 18 | |
| 2003.mtsummit-papers.12 In this paper, we present a new model of learning-based MT (entitled BTL: Bitext-Transfer Learning) that learns from ***** bilingual corpus ***** to extract disambiguating rules. | ||
| D18-1038 We specifically discuss two types of common parallel resources: ***** bilingual corpus ***** and bilingual dictionary, and design different transfer learning strategies accordingly. | ||
| 2001.mtsummit-papers.69 The final results of the corpus pre-processing are a segmented/bracketed aligned ***** bilingual corpus ***** and a statistical dictionary. | ||
| L08-1184 The strengths of monolingual terminology extraction for each language are exploited to improve the performance of terminology extraction in the other language, thanks to the availability of a sentence-level aligned ***** bilingual corpus *****, and an automatic noun phrase alignment mechanism. | ||
| 2021.wmt-1.85 We created synthetic terms using phrase tables extracted from ***** bilingual corpus ***** to increase the proportion of term translations in training data | ||
| Scientific | 18 | |
| R19-1099 The experiments show that our proposed model achieves 0.5 point gain in BLEU on the Asian ***** Scientific ***** Paper Excerpt Corpus Japanese-to-English translation task. | ||
| 2020.udw-1.13 Our starting assumption is that over time, ***** Scientific ***** English develops specific syntactic choice preferences that increase efficiency in (expert-to-expert) communication. | ||
| S17-2161 This paper presents a system that participated in SemEval 2017 Task 10 (subtask A and subtask B): Extracting Keyphrases and Relations from ***** Scientific ***** Publications (Augenstein et al., 2017). | ||
| S17-2163 This paper describes our approach to the SemEval 2017 Task 10: Extracting Keyphrases and Relations from ***** Scientific ***** Publications, specifically to Subtask (B): Classification of identified keyphrases. | ||
| 2020.sdp-1.41 This paper presents our methods for the LongSumm 2020: Shared Task on Generating Long Summaries for ***** Scientific ***** Documents, where the task is to generate long summaries given a set of scientific papers provided by the organizers. | ||
| temporal annotation | 18 | |
| L10-1513 BAT has been used mainly for ***** temporal annotation *****, but can be considered a more general tool for several kinds of textual annotation. | ||
| L16-1602 These modifications concern mainly (1) Enrichments of well identified features of the norm: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) Deeper modifications concerning the units or features annotated: clarification between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) A recommendation to perform ***** temporal annotation ***** on top of a syntactic (rather than lexical) layer (***** temporal annotation ***** on a treebank). | ||
| 2020.emnlp-main.432 We argue that temporal dependency graphs, built on previous research on narrative times and temporal anaphora, provide a representation scheme that achieves a good trade-off between completeness and practicality in ***** temporal annotation ***** | ||
| L14-1439 We test on whether syntactic annotations can be used to validate ***** temporal annotation *****s: to find missing or partial annotations. | ||
| 2020.lrec-1.271 We present a new ***** temporal annotation ***** standard, THEE-TimeML, and a corpus TheeBank enabling precise temporal information extraction (TIE) for event-based surveillance (EBS) systems in the public health domain. | ||
| linked | 18 | |
| L14-1232 Multilingual and cross-lingual information access can be facilitated by the availability of such lexica, e.g., allowing for an easy mapping of natural language expressions in different languages to ***** linked ***** data resources from LOD. | ||
| W18-5025 To this end, we describe the Linked-Data SDS (LD-SDS), a system that exploits semantic knowledge bases that connect to ***** linked ***** data, and supports complex constraints and preferences. | ||
| W17-5544 The system's backend is NPCEditor, a response selection platform trained on ***** linked ***** questions and answers; to our knowledge this is the first retrieval-based chatbot deployed on a large public social network. | ||
| L14-1126 As language resources start to become available in ***** linked ***** data formats, it becomes relevant to consider how ***** linked ***** data interoperability can play a role in active language processing workflows as well as for more static language resource publishing. | ||
| L14-1317 Moreover, each entity is connected to ***** linked ***** and non ***** linked ***** resources, including DBpedia and VIAF | ||
| text similarity | 18 | |
| 2021.adaptnlp-1.13 We demonstrate the effectiveness of LPL-optimized alignment on semantic ***** text similarity ***** (STS), natural language inference (SNLI), multi-genre language inference (MNLI) and cross-lingual word alignment (CLA) showing consistent improvements, finding up to 16% improvement over our baseline in lower resource settings. | ||
| 2021.emnlp-main.689 Thus, ranking methods based on task and ***** text similarity ***** — as suggested in prior work — may not be sufficient to identify promising sources. | ||
| S17-2045 In addition to traditional NLP features, we introduce several neural network based matching features which enable our system to measure ***** text similarity ***** beyond lexicons. | ||
| 2020.knlp-1.3 (2020) formulate medical concept normalization (MCN) as ***** text similarity ***** problem and propose a model based on RoBERTa and graph embedding based target concept vectors. | ||
| W18-4513 Evaluating our technique on three data sets, we find that our approach performs competitive to ***** text similarity ***** scores borrowed from machine translation evaluation, being much harder to interpret. | ||
| neural word embeddings | 18 | |
| S17-2031 The first stage deals with constructing ***** neural word embeddings *****, the components of sentence embeddings. | ||
| P17-1036 The model improves coherence by exploiting the distribution of word co-occurrences through the use of ***** neural word embeddings *****. | ||
| W19-8909 Wordnet hypernym relations are used to extract term-frequency concept information, subsequently concatenated to sentence-level representations produced by aggregated deep ***** neural word embeddings *****. | ||
| S18-1161 As a,b,q are represented with ***** neural word embeddings *****, we tested vector operations allowing us to measure membership, i.e. | ||
| D17-1037 Existing methods of ***** neural word embeddings *****, including SGNS, are multi-pass algorithms and thus cannot perform incremental model update. | ||
| corpus linguistics | 18 | |
| 2020.lrec-1.855 The corpus database is distributed to permit fast indexing, and provides a simple web front-end with ***** corpus linguistics ***** methods for sub-corpus comparison and retrieval. | ||
| L12-1376 Although our Linguistic Analysis Multimodal Platform (LAMP) has been applied to the Classical Arabic language of the Quran, we argue that our annotation model and software architecture may be of interest to other related ***** corpus linguistics ***** projects. | ||
| 2020.sltu-1.3 The high-quality annotated speech datasets described in this paper can be used to, among other things, build text-to-speech systems, serve as adaptation data in automatic speech recognition and provide useful phonetic and phonological insights in ***** corpus linguistics *****. | ||
| L10-1224 This unique data source provides a rich resource for future research in many areas of language impairment and has been constructed to facilitate analysis with natural language processing and ***** corpus linguistics ***** techniques. | ||
| L06-1231 The paper presents the emerging face of ***** corpus linguistics ***** where a corpus is used to bootstrap both the terminology and the significant meaning bearing patterns from the corpus. | ||
| computational semantics | 18 | |
| W19-1101 We conclude that the success of the DH in ***** computational semantics ***** rests on a post hoc effect: DS presupposes a referential semantics on the basis of which utterances can be produced, comprehended and analysed in the first place. | ||
| 2020.acl-tutorials.3 We expect that the tutorial will be of interest to researchers in dialogue systems, ***** computational semantics ***** and cognitive modeling, and hope that it will catalyze research and system building that more directly explores the creative, strategic ways conversational agents might be able to seek and offer evidence about their understanding of their interlocutors. | ||
| D17-1185 Detection of lexico-semantic relations is one of the central tasks of ***** computational semantics *****. | ||
| L06-1117 In this paper we present a novel method for automatic text summarization through text extraction, using ***** computational semantics *****. | ||
| D17-1113 Research in ***** computational semantics ***** is increasingly guided by our understanding of human semantic processing. | ||
| parsing algorithm | 18 | |
| Q18-1005 Specifically, we embed a differentiable non-projective ***** parsing algorithm ***** into a neural model and use attention mechanisms to incorporate the structural biases. | ||
| Q14-1032 We present a polynomial-time ***** parsing algorithm ***** for CCG, based on a new decomposition of derivations into small, shareable parts. | ||
| W03-3020 We present a new formalism, partially ordered multiset context-free grammars (poms-CFG), along with an Earley-style ***** parsing algorithm *****. | ||
| 1993.iwpt-1.12 Because our approach is a modification to a standard context-free ***** parsing algorithm *****, all the techniques and grammars developed for the standard parser can be applied as they are. | ||
| 1995.iwpt-1.15 In this paper we present a robust ***** parsing algorithm ***** based on the link grammar formalism for parsing natural languages. | ||
| computational complexity | 18 | |
| D18-1502 Our results on the challenging Google Billion word corpus show that both FOFE and dual FOFE yield very strong performance while significantly reducing the ***** computational complexity ***** over other NNLMs. | ||
| W17-5302 Meanwhile, ***** computational complexity ***** is remarkably reduced by avoiding traversing the vocabulary. | ||
| W18-1705 Spectral clustering has received a lot of attention due to its ability to separate nonconvex, non-intersecting manifolds, but its high ***** computational complexity ***** has significantly limited its applicability. | ||
| 2021.naacl-main.406 As the excessive pre-training cost arouses the need to improve efficiency, considerable efforts have been made to train BERT progressively–start from an inferior but low-cost model and gradually increase the ***** computational complexity *****. | ||
| 2020.conll-1.14 Using agent-based simulations and ***** computational complexity ***** analyses, we compare the efficiency of these strategies in terms of communicative success, computation cost and interaction cost. | ||
| embedding spaces | 18 | |
| 2020.vardial-1.6 However, these approaches require cross-lingual information such as seed dictionaries to train the model and find a linear transformation between the word ***** embedding spaces *****. | ||
| P19-1312 Despite its remarkable results, unsupervised mapping is also well-known to be limited by the original dissimilarity between the word ***** embedding spaces ***** to be mapped. | ||
| P19-1018 Our proposed method obtains state of the art results on 15 of 18 language pairs on the MUSE dataset, and does particularly well when the ***** embedding spaces ***** don't appear to be isometric. | ||
| P17-1042 Our method exploits the structural similarity of ***** embedding spaces *****, and works with as little bilingual evidence as a 25 word dictionary or even an automatically generated list of numerals, obtaining results comparable to those of systems that use richer resources. | ||
| P17-2071 Here, I show that temporal word analogies (“word w_1 at time t_α is like word w_2 at time t_β”) can effectively be modeled with diachronic word embeddings, provided that the independent ***** embedding spaces ***** from each time period are appropriately transformed into a common vector space. | ||
| vector spaces | 18 | |
| 2021.acl-long.139 Separately embedding the individual knowledge sources into ***** vector spaces ***** has demonstrated tremendous successes in encoding the respective knowledge, but how to jointly embed and reason with both knowledge sources to fully leverage the complementary information is still largely an open problem. | ||
| L16-1733 In this paper, we investigate a new model to induce such ***** vector spaces ***** for medical concepts, based on a joint objective that exploits not only word co-occurrences but also manually labeled documents, as available from sources such as PubMed. | ||
| 2020.wanlp-1.17 Recent work has shown that distributional word ***** vector spaces ***** often encode human biases like sexism or racism. | ||
| P19-1315 Past work has conjectured that linear substructures exist in ***** vector spaces ***** because relations can be represented as ratios; we prove that this holds for SGNS. | ||
| D19-5322 DBee provides a unique data model which operates jointly over large-scale knowledge graphs (KGs) and embedding ***** vector spaces ***** (VSs). | ||
| benchmark dataset | 18 | |
| 2020.coling-main.278 Our proposed LaAP-Net outperforms existing approaches on three ***** benchmark dataset *****s for the text VQA task by a noticeable margin. | ||
| D17-1310 Experiment results over two ***** benchmark dataset *****s demonstrate the effectiveness of our framework. | ||
| D19-6203 The experimental results suggest that dependency-based pooling is the best pooling strategy for RE in the biomedical domain, yielding the state-of-the-art performance on two ***** benchmark dataset *****s for this problem. | ||
| 2020.acl-main.277 Experimental results on three widely-used ***** benchmark dataset *****s show that our proposed model achieves more than 4 times speedup while maintaining comparable performance compared with the corresponding autoregressive model. | ||
| P18-1093 We conduct extensive experiments on six ***** benchmark dataset *****s from Twitter, Reddit and the Internet Argument Corpus. | ||
| generative adversarial | 18 | |
| 2020.acl-main.191 In this paper, we propose GAN-BERT that ex- tends the fine-tuning of BERT-like architectures with unlabeled data in a ***** generative adversarial ***** setting. | ||
| 2020.findings-emnlp.218 We build our model based on the conditional ***** generative adversarial ***** network, and propose to incorporate a simple yet effective diversity loss term into the model in order to improve the diversity of outputs. | ||
| D18-1387 In particular, we investigate context-aware and context-agnostic models for predicting vague words, and explore auxiliary-classifier ***** generative adversarial ***** networks for characterizing sentence vagueness. | ||
| 2021.naacl-industry.30 We propose OodGAN, a sequential ***** generative adversarial ***** network (SeqGAN) based model for OOD data generation. | ||
| N18-1133 Inspired by ***** generative adversarial ***** networks (GANs), we use one knowledge graph embedding model as a negative sample generator to assist the training of our desired model, which acts as the discriminator in GANs. | ||
| perspective | 18 | |
| 2020.emnlp-main.463 Our objective here is to understand __what complicates Transformer training__ from both empirical and theoretical ***** perspective *****s. | ||
| 2020.signlang-1.3 The utterance unit is an original concept for segmenting and annotating sign language dialogue referring to signer's native sense from the ***** perspective *****s of Conversation Analysis (CA) and Interaction Studies. | ||
| 2020.coling-main.235 The novel framework shows an interesting ***** perspective ***** on machine reading comprehension and cognitive science. | ||
| W17-4913 From an empirical ***** perspective *****, the key question is how to operationalize style and thus make it accessible for annotation and quantification. | ||
| 2020.lrec-1.765 In this study, we explore the phenomenon of swearing in Twitter conversations, taking the possibility of predicting the abusiveness of a swear word in a tweet context as the main investigation ***** perspective *****. | ||
| semantic lexicons | 18 | |
| L10-1446 This paper describes a Web service for accessing WordNet-type ***** semantic lexicons *****. | ||
| L16-1101 In this paper, we propose using word embeddings and ***** semantic lexicons ***** for OOV paraphrasing. | ||
| E17-5004 using context selection, extracting co-occurrence information from word patterns, attending over contexts); and b) Knowledge-base driven approaches which exploit available resources to encode external information into distributional vector spaces, injecting knowledge from ***** semantic lexicons ***** (e.g., WordNet, FrameNet, PPDB). | ||
| L16-1416 In this paper, we report on the construction of large-scale multilingual ***** semantic lexicons ***** for twelve languages, which employ the unified Lancaster semantic taxonomy and provide a multilingual lexical knowledge base for the automatic UCREL semantic annotation system (USAS). | ||
| W17-1908 Creating high-quality wide-coverage multilingual ***** semantic lexicons ***** to support knowledge-based approaches is a challenging time-consuming manual task. | ||
| predicate argument structure | 18 | |
| N18-2065 Here, it is important that the parser processes the sentences consistently; failing to recognize the similar syntactic structure results in inconsistent ***** predicate argument structure *****s among them, in which case the succeeding theorem proving is doomed to failure. | ||
| W19-3309 Meta-semantic representation consists of three parts, entities, ***** predicate argument structure *****s, and discourse attributes, that derive rich knowledge graphs. | ||
| P18-1054 Our experimental results demonstrate the proposed method can improve the performance of the inter-sentential zero anaphora resolution drastically, which is a notoriously difficult task in ***** predicate argument structure ***** analysis. | ||
| C16-1269 In this paper, we propose utilising eye gaze information for estimating parameters of a Japanese ***** predicate argument structure ***** (PAS) analysis model. | ||
| L14-1011 This research focuses on expanding PropBank, a corpus annotated with ***** predicate argument structure *****s, with new predicate types; namely, noun, adjective and complex predicates, such as Light Verb Constructions. | ||
| text documents | 18 | |
| C18-1290 Standard word embedding algorithms learn vector representations from large corpora of ***** text documents ***** in an unsupervised fashion. | ||
| L14-1504 This framework provides a simple interface to end users via which they can deploy one or more NLPCURATOR instances on EC2, upload plain ***** text documents *****, specify a set of Text Analytics tools (NLP annotations) to apply, and process and store or download the processed data. | ||
| 2020.clssts-1.2 The Machine Translation for English Retrieval of Information in Any Language (MATERIAL) research program, sponsored by the Intelligence Advanced Research Projects Activity (IARPA), focuses on rapid development of end-to-end systems capable of retrieving foreign language speech and ***** text documents ***** relevant to different types of English queries that may be further restricted by domain. | ||
| 2021.ranlp-1.45 In this work, we propose a method of de-identifying free-form ***** text documents ***** by carefully redacting sensitive data in them. | ||
| 2021.louhi-1.4 Biomedical entity linking is the task of identifying mentions of biomedical concepts in ***** text documents ***** and mapping them to canonical entities in a target thesaurus. | ||
| announcements | 18 | |
| 2002.amta-studies.2 Most large companies are very good at “getting the message out” –publishing reams of ***** announcements ***** and documentation to their employees and customers. | ||
| 2021.smm4h-1.1 Our results confirm a generally positive connection between the ***** announcements ***** of NPIs and Twitter sentiment, and we document a promising correlation between the results of this study and a public-health survey of popular compliance with NPIs. | ||
| 2021.eacl-main.175 Moreover, we annotate a large dataset FinReason for evaluation, which provides Reasons annotation for Financial events in company ***** announcements *****. | ||
| 2021.econlp-1.8 Most policy ***** announcements ***** have taken the form of text to inform their new policies or changes to the public. | ||
| 2020.nlpcovid19-acl.14 We see, for example, that lockdown ***** announcements ***** correlate with a deterioration of mood in almost all surveyed countries, which recovers within a short time span. | ||
| offline speech translation | 18 | |
| 2020.iwslt-1.2 This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, ***** offline speech translation ***** and simultaneous speech translation. | ||
| 2020.iwslt-1.8 This paper describes FBK's participation in the IWSLT 2020 ***** offline speech translation ***** (ST) task. | ||
| 2020.iwslt-1.6 This paper describes the system that was submitted by DiDi Labs to the ***** offline speech translation ***** task for IWSLT 2020. | ||
| 2021.iwslt-1.6 We participate in the ***** offline speech translation ***** and text-to-text simultaneous translation tracks. | ||
| 2020.iwslt-1.10 This paper describes the University of Helsinki Language Technology group's participation in the IWSLT 2020 ***** offline speech translation ***** task, addressing the translation of English audio into German text. | ||
| decision making | 18 | |
| 2021.econlp-1.9 In ***** decision making ***** in the economic field, an especially important requirement is to rapidly understand news to absorb ever-changing economic situations. | ||
| W16-4206 Text mining of such clinical records has gained huge attention in various medical applications like treatment and ***** decision making *****. | ||
| 2020.findings-emnlp.273 However, the large amount of computation necessary to adequately train and explore the search space of sequential ***** decision making *****, under a reinforcement learning paradigm, precludes the inclusion of large contextualized language models, which might otherwise enable the desired generalization ability. | ||
| 2021.cinlp-1.2 Using observed language to understand interpersonal interactions is important in high-stakes ***** decision making *****. | ||
| 2021.acl-demo.10 Our tool also provides additional useful information including explanations, to help the regulatory staff interpret the prediction results, and similar past cases as well as non-compliance to regulations, to support the ***** decision making *****. | ||
| proposed method | 18 | |
| 2020.coling-main.45 Experiments over an Amazon review dataset indicate superior performance of the ***** proposed method *****. | ||
| L14-1587 We experimented with the ***** proposed method *****ology over a sample of triples extracted from 10 DBpedia ontology properties. | ||
| W19-0509 We evaluate the ***** proposed method ***** on a set of over 15,000 hospital reviews. | ||
| 2021.naacl-main.311 Experimental results on several language pairs show that the ***** proposed method *****s substantially outperform conventional UNMT systems. | ||
| 2020.acl-main.327 Our experiments show that our ***** proposed method ***** using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences. | ||
| abstractive document summarization | 18 | |
| 2020.emnlp-main.297 These objectives include sentence reordering, next sentence generation and masked document generation, which have close relations with the ***** abstractive document summarization ***** task. | ||
| P17-1108 Unfortunately, attempts on ***** abstractive document summarization ***** are still in a primitive stage, and the evaluation results are worse than extractive methods on benchmark datasets. | ||
| 2021.naacl-srw.20 This paper proposes a new ***** abstractive document summarization ***** model, hierarchical BART (Hie-BART), which captures hierarchical structures of a document (i.e., sentence-word structures) in the BART model. | ||
| 2021.ranlp-1.178 Neural sequence-to-sequence (Seq2Seq) models and BERT have achieved substantial improvements in ***** abstractive document summarization ***** (ADS) without and with pre-training, respectively. | ||
| 2020.acl-main.173 In this paper we have analyzed limitations of these models for ***** abstractive document summarization ***** and found that these models are highly prone to hallucinate content that is unfaithful to the input document. | ||
| term memory | 18 | |
| 2021.ltedi-1.22 This paper proposes a bidirectional long short-***** term memory ***** (BiLSTM) with the attention-based approach, in solving the hope speech detection problem. | ||
| D18-1296 The model consists of a long-short-***** term memory ***** (LSTM) recurrent network that reads the entire word-level history of a conversation, as well as information about turn taking and speaker overlap, in order to predict each next word. | ||
| 2020.cogalex-1.1 The results suggest that corpora representative for an individual's long-***** term memory ***** structure can better explain reading performance than a norm corpus, and that recently acquired information is lexically accessed rapidly. | ||
| 2020.lrec-1.361 We implement several baseline approaches of conditional random field (CRF) and recent popular state-of-the-art bi-directional long-short ***** term memory ***** (Bi-LSTM) models. | ||
| 2020.semeval-1.281 I present the system based on the architecture of bidirectional long short-***** term memory ***** networks (BiLSTM) concatenated with lexicon-based features and a social-network specific feature and then followed by two fully connected dense layers for detecting Turkish offensive tweets. | ||
| dynamic programming | 18 | |
| N18-1086 Exact marginalization is made tractable through ***** dynamic programming ***** over shift-reduce parsing and minimal RNN-based feature sets. | ||
| 2011.iwslt-evaluation.24 Many syntactic machine translation decoders, including Moses, cdec, and Joshua, implement bottom-up ***** dynamic programming ***** to integrate N-gram language model probabilities into hypothesis scoring. | ||
| Q19-1023 We formalize this idea in a generative model of punctuation that admits efficient ***** dynamic programming *****. | ||
| Q17-1019 Pruning hypotheses during ***** dynamic programming ***** is commonly used to speed up inference in settings such as parsing. | ||
| 1999.mtsummit-1.46 In this paper we describe a language recognition algorithm for multilingual documents that is based on mixed-order n-grams, Markov chains, maximum likelihood, and ***** dynamic programming *****. | ||
| simultaneous interpretation | 18 | |
| C18-2020 Therefore, we offer an automatic ***** simultaneous interpretation ***** service for students. | ||
| L06-1064 4,578 pairs of English-Japanese aligned utterances in CIAIR ***** simultaneous interpretation ***** database were used. | ||
| L14-1178 This makes it possible to compare translation data with ***** simultaneous interpretation ***** data. | ||
| 2020.findings-emnlp.12 Verb prediction is important for understanding human processing of verb-final languages, with practical applications to real-time ***** simultaneous interpretation ***** from verb-final to verb-medial languages. | ||
| 2013.iwslt-papers.3 In this paper, we examine the possibilities of additionally incorporating ***** simultaneous interpretation ***** data (made by simultaneous interpreters) in the learning process. | ||
| public health | 18 | |
| L14-1739 With the rapid growth of social media, there is increasing potential to augment traditional ***** public health ***** surveillance methods with data from social media. | ||
| 2020.louhi-1.15 Interestingly, we show that those approaches are less effective for capturing the nuances of foodborne illness, our ***** public health ***** application of interest. | ||
| 2020.smm4h-1.1 Social media platforms offer extensive information about the development of the COVID-19 pandemic and the current state of ***** public health *****. | ||
| 2021.socialnlp-1.4 To our knowledge, this is the first work analyzing the effects of COVID-19 on Yelp restaurant reviews and could potentially inform policies by ***** public health ***** departments, for example, to cover resource utilization. | ||
| 2005.mtsummit-invited.4 Information from GPHIN is provided to the WHO, international governments and non-governmental organizations who can then quickly react to ***** public health ***** incidents. | ||
| finite state | 18 | |
| D18-1152 Recently, connections have been shown between convolutional neural networks (CNNs) and weighted ***** finite state ***** automata (WFSAs), leading to new interpretations and insights. | ||
| K19-1012 Our technical contributions include ways of handling large vocabularies, algorithms to correct capitalization errors in user data, and efficient ***** finite state ***** transducer algorithms to convert word language models to word-piece language models and vice versa. | ||
| 2021.iwpt-1.1 This paper describes a general ***** finite state ***** technique for deriving oracles. | ||
| L10-1006 A lattice is a directed acyclic graph (DAG), a subclass of nondeterministic ***** finite state ***** automata (NFA). | ||
| 1995.iwpt-1.24 Error-tolerant recognition enables the recognition of strings that deviate slightly from any string in the regular set recognized by the underlying ***** finite state ***** recognizer. | ||
| user interface | 18 | |
| L10-1374 Although there are a few publicly available tools which support distributed collaborative text annotation, most of them have complex ***** user interface *****s and require a significant amount of involvement from the annotators/contributors as well as the project developers and administrators. | ||
| L06-1194 This was also the starting point for a specific software product (MEDUSA) which addresses the needs of rapid prototyping of these ***** user interface *****s from the earliest stages of the design and analysis phases. | ||
| 2011.mtsummit-tutorials.4 The expectation had however always been that MT could one day be deployed on the bulk of ***** user interface ***** and product documentation, due to the expected process efficiencies and cost savings. | ||
| 2020.lrec-1.54 The study contributes new insights to the human-agent interaction and the voice ***** user interface ***** design. | ||
| 1998.amta-systems.5 The ***** user interface ***** is done through the web. | ||
| robust parsing | 18 | |
| 1995.iwpt-1.15 In this paper we present a ***** robust parsing ***** algorithm based on the link grammar formalism for parsing natural languages. | ||
| L08-1026 In this paper we propose a partial parsing model which achieves ***** robust parsing ***** with a large HPSG grammar. | ||
| 2000.iwpt-1.11 A transformation-based approach to ***** robust parsing ***** is presented, which achieves a strictly monotonic improvement of its current best hypothesis by repeatedly applying local repair steps to a complex multi-level representation. | ||
| 1993.iwpt-1.12 There have been several other approaches to the problem of ***** robust parsing *****, most of which are special purpose algorithms [Carbonell and Hayes, 1984] , [Ward, 1991] and others. | ||
| W89-0215 As ***** robust parsing ***** is a prerequisite for any practical natural language processing system, there is certainly a need for techniques that go beyond merely formal approaches. | ||
| procedural text | 18 | |
| W19-2609 We propose a novel approach, Text2Quest, where ***** procedural text ***** is interpreted as instructions for an interactive game. | ||
| 2021.naacl-main.362 This result indicates that our approach is effective for ***** procedural text ***** understanding in general. | ||
| 2020.aacl-main.82 We thus frame the semantic comprehension of ***** procedural text ***** such as recipes, as fairly generic NLP subtasks, covering (i) entity recognition (ingredients, tools and actions), (ii) relation extraction (what ingredients and tools are involved in the actions), and (iii) zero anaphora resolution (link actions to implicit arguments, e.g., results from previous recipe steps). | ||
| L08-1135 This paper presents ongoing work dedicated to parsing the textual structure of ***** procedural text *****s. | ||
| D18-1006 We show that the new model significantly outperforms earlier systems on a benchmark dataset for ***** procedural text ***** comprehension (+8% relative gain), and that it also avoids some of the nonsensical predictions that earlier systems make. | ||
| properties | 18 | |
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the ***** properties ***** related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| L14-1587 We experimented with the proposed methodology over a sample of triples extracted from 10 DBpedia ontology ***** properties *****. | ||
| 2021.emnlp-main.50 The probing analysis of the features reveals their sensitivity to the surface and syntactic ***** properties *****. | ||
| W17-4913 In authorship attribution, many different approaches have successfully resolved this issue at the cost of linguistic interpretability: The resulting algorithms may be able to distinguish one language variety from the other, but do not give us much information on their distinctive linguistic ***** properties *****. | ||
| N18-5001 Due to NLPf's ***** properties ***** developers and domain experts are able to build domain-specific NLP application more effectively. | ||
| multilingual machine | 18 | |
| 2021.acl-long.21 Existing ***** multilingual machine ***** translation approaches mainly focus on English-centric directions, while the non-English directions still lag behind. | ||
| 2020.emnlp-main.187 Sparse language vectors from linguistic typology databases and learned embeddings from tasks like ***** multilingual machine ***** translation have been investigated in isolation, without analysing how they could benefit from each other's language characterisation. | ||
| D19-6130 Exploiting this, we introduce a new multilingual dataset (X-WikiRE), framing relation extraction as a ***** multilingual machine ***** reading problem. | ||
| 2020.acl-main.754 When training ***** multilingual machine ***** translation (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others. | ||
| 2021.naacl-main.93 We propose a straightforward vocabulary adaptation scheme to extend the language capacity of multilingual machine translation models, paving the way towards efficient continual learning for ***** multilingual machine ***** translation. | ||
| adversarial learning | 18 | |
| 2021.acl-long.86 We evaluate our algorithm, using BERT-based models, on the GLUE benchmark and demonstrate that MATE-KD outperforms competitive ***** adversarial learning ***** and data augmentation baselines. | ||
| 2020.coling-main.338 To deal with this problem, we propose to improve the contextualized word representations via ***** adversarial learning ***** and fine-tuning BERT processes. | ||
| 2020.coling-main.102 In this paper, we propose a semantically consistent and syntactically variational encoder-decoder framework, which uses ***** adversarial learning ***** to ensure the syntactic latent variable be semantic-free. | ||
| N19-5001 In this tutorial, we provide a gentle introduction to the foundation of deep ***** adversarial learning *****, as well as some practical problem formulations and solutions in NLP. | ||
| 2020.emnlp-main.64 In this paper, we propose a novel ***** adversarial learning ***** framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. | ||
| existing | 18 | |
| 2020.coling-main.278 Our proposed LaAP-Net outperforms ***** existing ***** approaches on three benchmark datasets for the text VQA task by a noticeable margin. | ||
| D19-1566 Experimental results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various ***** existing ***** state-of-the-art systems. | ||
| 2021.emnlp-main.777 At the script level, most ***** existing ***** studies only consider a single event sequence corresponding to one common protagonist. | ||
| D19-5817 Our study suggests that while current metrics may be suitable for ***** existing ***** QA datasets, they limit the complexity of QA datasets that can be created. | ||
| 2021.emnlp-main.702 Despite achieving good performance on some public benchmarks, we observe that ***** existing ***** text-to-SQL models do not generalize when facing domain knowledge that does not frequently appear in the training data, which may render the worse prediction performance for unseen domains. | ||
| light verb | 18 | |
| L16-1368 The survey shows that the ***** light verb ***** constructions either get special annotations as such, or are treated as ordinary verbs, while VP idioms are handled through different strategies. | ||
| L16-1369 Annotation is done for two types of MWEs: compound nouns and ***** light verb ***** constructions. | ||
| L16-1628 This research describes the iterative process of developing PropBank annotation guidelines for ***** light verb ***** constructions, the current guidelines, and a comparison to related resources. | ||
| 2019.lilt-17.1 The analysis has two key components (i) an underspecified category for the nominal and (ii) combinatorial constraints on the noun and ***** light verb ***** to specify selectional preferences. | ||
| 2016.gwc-1.56 Results show that ontological features are found to be very useful for the detection of ***** light verb ***** constructions, while use of semantic properties for the detection of compound nouns is found to be satisfactory. | ||
| international | 18 | |
| L10-1465 This paper describes an annotation scheme for argumentation in opinionated texts such as newspaper editorials, developed from a corpus of approximately 500 English texts from Nepali and ***** international ***** newspaper sources. | ||
| 2020.globalex-1.10 The project, known as “OLIVATERM”, had two main objectives: on the one hand, to develop the first systematic multilingual terminological dictionary in the scientific and socio-economic area of the olive grove and olive oils in order to facilitate communication in the topic; on the other, to contribute to the expansion of the Andalusia's domestic and ***** international ***** trade and the dissemination of its culture. | ||
| S17-2110 In this paper, we present our contribution in SemEval 2017 ***** international ***** workshop. | ||
| W19-2102 I then apply this model and anchoring approach to two cases, the shift in ***** international *****ist rhetoric in the American presidents' inaugural addresses, and the relationship between bellicosity in American foreign policy decision-makers' deliberations. | ||
| 2005.mtsummit-invited.4 Information from GPHIN is provided to the WHO, ***** international ***** governments and non-governmental organizations who can then quickly react to public health incidents. | ||
| sequence generation | 18 | |
| N19-2007 We experiment with two approaches for response generation: (1) sequence-to-***** sequence generation ***** and (2) template ranking. | ||
| 2020.emnlp-main.216 As a sequence-to-***** sequence generation ***** task, neural machine translation (NMT) naturally contains intrinsic uncertainty, where a single sentence in one language has multiple valid counterparts in the other. | ||
| D18-1149 The proposed model is designed based on the principles of latent variable models and denoising autoencoders, and is generally applicable to any ***** sequence generation ***** task. | ||
| N19-1360 When trained to predict tokens using supervised learning, the proposed architecture substantially outperforms standard ***** sequence generation ***** baselines. | ||
| D18-1396 Our discoveries are confirmed on different model structures including Transformer and RNN, and in other ***** sequence generation ***** tasks such as text summarization. | ||
| dimensional sentiment | 18 | |
| I17-4007 In this task, we propose an approach using a densely connected LSTM network and word features to identify ***** dimensional sentiment ***** on valence and arousal for words and phrases jointly. | ||
| 2021.rocling-1.37 Therefore, the multiple sentiment detection tasks on the video streaming service platform can be solved by the proposed multi-***** dimensional sentiment ***** indicators accompanied with BERT classifier to gain the best result. | ||
| I17-4015 The evaluation results demonstrate that our system is effective in ***** dimensional sentiment ***** analysis for Chinese phrases. | ||
| I17-4017 Categorical sentiment classification has drawn much attention in the field of NLP, while less work has been conducted for ***** dimensional sentiment ***** analysis (DSA). | ||
| 2021.rocling-1.51 This paper presents the ROCLING 2021 shared task on ***** dimensional sentiment ***** analysis for educational texts, which seeks to identify a real-valued sentiment score of self-evaluation comments written by Chinese students in both the valence and arousal dimensions. | ||
| neural conversation | 18 | |
| D19-1194 This paper proposes a new task about how to apply dynamic knowledge graphs in ***** neural conversation ***** model and presents a novel TV series conversation corpus (DyKgChat) for the task. | ||
| 2020.lrec-1.668 This paper concerns the problem of realizing consistent personalities in ***** neural conversation *****al modeling by using user generated question-answer pairs as training data. | ||
| 2020.acl-main.6 Our joint ***** neural conversation ***** model which integrates recurrent Knowledge-Interaction and knowledge Copy (KIC) performs well on generating informative responses. | ||
| P19-1539 Although ***** neural conversation *****al models are effective in learning how to produce fluent responses, their primary challenge lies in knowing what to say to make the conversation contentful and non-vacuous. | ||
| D19-1186 The neural encoder-decoder models have shown great promise in ***** neural conversation ***** generation. | ||
| sparse data | 18 | |
| L16-1663 This leads to under-performing machine translation systems in those ***** sparse data ***** settings. | ||
| L14-1640 To further alleviate the ***** sparse data ***** problem, we further make use of three types of out-links in Wikipedia. | ||
| 2020.coling-main.534 Lemmatization aims to reduce the ***** sparse data ***** problem by relating the inflected forms of a word to its dictionary form. | ||
| E17-1032 Most previous work focus on segmenting surface forms into their constituent morphs (taking: tak +ing), but surface form segmentation does not solve the ***** sparse data ***** problem as the analyses of take and taking are not connected to each other. | ||
| 2000.iwpt-1.14 We focus these experiments on demonstrating one of the main advantages of the SSN parser over the PCFG, handling ***** sparse data *****. | ||
| multimodal machine | 18 | |
| W19-1808 Recent work on visually grounded language learning has focused on broader applications of grounded representations, such as visual question answering and ***** multimodal machine ***** translation. | ||
| 2019.iwslt-1.6 The architecture consists of an automatic speech recognition (ASR) system followed by a Transformer-based ***** multimodal machine ***** translation (MMT) system. | ||
| D19-6406 It is assumed that ***** multimodal machine ***** translation systems are better than text-only systems at translating phrases that have a direct correspondence in the image. | ||
| W18-6402 We present the results from the third shared task on ***** multimodal machine ***** translation. | ||
| 2020.emnlp-main.62 We hence recommend that researchers in ***** multimodal machine ***** learning report the performance not only of unimodal baselines, but also the EMAP of their best-performing model. | ||
| passage | 18 | |
| D19-5809 To properly generate a question coherent to the grounding text and the current conversation history, the proposed framework first locates the focus of a question in the text ***** passage *****, and then identifies the question pattern that leads the sequential generation of the words in a question. | ||
| 2020.clssts-1.11 Since most MT systems expect sentences as input, feeding in longer unsegmented ***** passage *****s can lead to sub-optimal performance. | ||
| L10-1319 In this paper, we extend the method by (1) using neighboring context to index the target ***** passage *****, and (2) applying a language modeling approach for document retrieval. | ||
| 2019.icon-1.28 The comprehension stage converts the ***** passage ***** into a Discourse Collection that comprises of the relation shared amongst logical sentences in given ***** passage ***** along with the key characteristics of each sentence. | ||
| N18-2090 We propose a model that matches the answer with the ***** passage ***** before generating the question. | ||
| email | 18 | |
| 2020.lrec-1.550 Our focus is directed at the de-identification of ***** email *****s where personally identifying information does not only refer to the sender but also to those people, locations, dates, and other identifiers mentioned in greetings, boilerplates and the content-carrying body of ***** email *****s. | ||
| L06-1110 We introduce the problem of automatic textual anonymisation and present a new publicly-available, pseudonymised benchmark corpus of personal ***** email ***** text for the task, dubbed ITAC (Informal Text Anonymisation Corpus). | ||
| R17-1047 Extracting the relevant information from these ***** email *****s would let users track their journeys and important updates on applications installed on their devices to give them a consolidated over view of their itineraries and also save valuable time. | ||
| W17-2408 We train on the Enron ***** email ***** corpus, and test on the Enron and Avocado ***** email ***** corpora. | ||
| W18-1111 The approach establishes the importance of affect features in frustration prediction for ***** email ***** data. | ||
| rumour stance classification | 18 | |
| S17-2083 Subtask A addresses the challenge of ***** rumour stance classification *****, which involves identifying the attitude of Twitter users towards the truthfulness of the rumour they are discussing. | ||
| S19-2196 SDQC addresses the challenge of ***** rumour stance classification ***** as an indirect way of identifying potential rumours. | ||
| 2020.rdsm-1.4 In this paper we revisit the task of ***** rumour stance classification *****, aiming to improve the performance over the informative minority classes. | ||
| 2020.aacl-main.92 We re-evaluate the systems submitted to the two RumourEval tasks and show that the two widely adopted metrics – accuracy and macro-F1 – are not robust for the four-class imbalanced task of ***** rumour stance classification *****, as they wrongly favour systems with highly skewed accuracy towards the majority class. | ||
| C16-1230 ***** Rumour stance classification *****, the task that determines if each tweet in a collection discussing a rumour is supporting, denying, questioning or simply commenting on the rumour, has been attracting substantial interest. | ||
| natural language generation (NLG | 18 | |
| 2020.acl-main.63 In modular dialogue systems, natural language understanding (NLU) and ***** natural language generation (NLG *****) are two critical components, where NLU extracts the semantics from the given texts and NLG is to construct corresponding natural language sentences based on the input semantic representations. | ||
| W19-5920 Domain adaptation in ***** natural language generation (NLG *****) remains challenging because of the high complexity of input semantics across domains and limited data of a target domain. | ||
| P19-1220 This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and ***** natural language generation (NLG *****). | ||
| W18-6556 In ***** natural language generation (NLG *****), the task is to generate utterances from a more abstract input, such as structured data. | ||
| 2021.acl-long.355 Recent years have witnessed various types of generative models for ***** natural language generation (NLG *****), especially RNNs or transformer-based sequence-to-sequence models, as well as variational autoencoder (VAE) and generative adversarial network (GAN) based models. | ||
| technical | 18 | |
| L12-1165 In this paper, we address the problem of extracting ***** technical ***** terms automatically from an unannotated corpus. | ||
| L12-1206 In contrast to other national corpora, it is conceptualised as a linked collection of many existing and future language resources representing language use in Australia, unified through common ***** technical ***** standards. | ||
| 2008.amta-papers.19 The continuous emergence of new ***** technical ***** terms and the difficulty of keeping up with neologism in parallel corpora deteriorate the performance of statistical machine translation (SMT) systems. | ||
| L14-1217 The growing amount of available information and the importance given to the access to ***** technical ***** information enhance the potential role of NLP applications in enabling users to deal with information for a variety of knowledge domains. | ||
| P17-1148 We consider the problem of translating high-level textual descriptions to formal representations in ***** technical ***** documentation as part of an effort to model the meaning of such documentation. | ||
| multi-task | 18 | |
| P19-1079 We present methods for ***** multi-task ***** learning that take advantage of natural groupings of related tasks. | ||
| D18-1526 We investigate the effects of ***** multi-task ***** learning using the recently introduced task of semantic tagging. | ||
| 2021.emnlp-main.646 The primary paradigm for ***** multi-task ***** training in natural language processing is to represent the input with a shared pre-trained language model, and add a small, thin network (head) per task. | ||
| 2021.eacl-demos.22 Transfer learning, particularly approaches that combine ***** multi-task ***** learning with pre-trained contextualized embeddings and fine-tuning, have advanced the field of Natural Language Processing tremendously in recent years. | ||
| D17-1206 Transfer and ***** multi-task ***** learning have traditionally focused on either a single source-target pair or very few, similar tasks. | ||
| free | 18 | |
| 2020.clinicalnlp-1.33 Clinical machine learning is increasingly multimodal, collected in both structured tabular formats and unstructured forms such as ***** free ***** text. | ||
| W18-5415 How much does ***** free ***** shipping! | ||
| L06-1349 We are presenting a method to recognise geographical references in ***** free ***** text. | ||
| L06-1092 We present a non-deterministic finite-state transducer that acts as a tokenizer and normalizer for ***** free ***** text that is input to a broad-coverage LFG of German. | ||
| W19-3209 Identifying mentions of medical concepts in social media is challenging because of high variability in ***** free ***** text. | ||
| Natural Language Inference (NLI | 18 | |
| 2020.starsem-1.1 Domain knowledge is important to understand both the lexical and relational associations of words in natural language text, especially for domain-specific tasks like ***** Natural Language Inference (NLI *****) in the medical domain, where due to the lack of a large annotated dataset such knowledge cannot be implicitly learned during training. | ||
| 2020.coling-demos.9 Advances in ***** Natural Language Inference (NLI *****) have helped us understand what state-of-the-art models really learn and what their generalization power is. | ||
| 2020.wnut-1.22 The presence of large-scale corpora for ***** Natural Language Inference (NLI *****) has spurred deep learning research in this area, though much of this research has focused solely on monolingual data. | ||
| 2020.calcs-1.2 ***** Natural Language Inference (NLI *****) is the task of inferring the logical relationship, typically entailment or contradiction, between a premise and hypothesis. | ||
| 2020.blackboxnlp-1.16 We address whether neural models for ***** Natural Language Inference (NLI *****) can learn the compositional interactions between lexical entailment and negation, using four methods: the behavioral evaluation methods of (1) challenge test sets and (2) systematic generalization tasks, and the structural evaluation methods of (3) probes and (4) interventions. | ||
| AMR graphs | 17 | |
| 2020.acl-main.67 Existing graph-to-sequence approaches generally utilize graph neural networks as their encoders, which have two limitations: 1) The message propagation process in ***** AMR graphs ***** is only guided by the first-order adjacency information. | ||
| N18-1106 To test it, we devise an expressive framework to align ***** AMR graphs ***** to dependency graphs, which we use to annotate 200 AMRs. | ||
| S19-1024 However, evaluating a parser on new data by means of comparison to manually created ***** AMR graphs ***** is very costly. | ||
| 2020.tacl-1.2 The model directly encodes the ***** AMR graphs ***** and learns the node representations. | ||
| 2021.iwpt-1.5 This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention, and use only attention matrices from the transformer to predict all elements in ***** AMR graphs ***** (concepts, arcs, labels) | ||
| overlap | 17 | |
| 2020.ccl-1.88 Previous approaches have paid little attention to the problem of roles ***** overlap ***** which is a common phenomenon in practice. | ||
| 2020.acl-main.452 A good summary is characterized by language fluency and high information ***** overlap ***** with the source sentence. | ||
| 2011.jeptalnrecital-recital.5 We also present some experiments for the reduction of search space on the basis of stem ***** overlap *****, word ***** overlap *****, and cosine similarity measure which help us automatize the process to some extent and reduce human effort for alignment. | ||
| 2021.alta-1.9 A QA model is then employed to probe the candidate summary to evaluate information ***** overlap ***** between candidate and reference summaries. | ||
| W18-5011 In this paper, we introduce a computational model for speech ***** overlap ***** resolution in embodied artificial agents | ||
| Depending | 17 | |
| L14-1226 ***** Depending ***** on the task to process, we used two distinct supervised machine-learning techniques: Conditional Random Fields to perform both named entity identification and classification, and Maximum Entropy to classify given entities. | ||
| P18-3022 ***** Depending ***** upon the input, we generate both factoid and descriptive type questions. | ||
| C18-1004 ***** Depending ***** on system choices, the affinity scores can be further used in clustering or mention ranking. | ||
| L14-1312 ***** Depending ***** on the answers, a given noun-sense pair can be assigned to fine-grained noun classes, spanning the area between count and mass. | ||
| W19-6114 ***** Depending ***** on the use case, this task can be seen as a form of (phrase-level) query rewriting | ||
| normalizing | 17 | |
| R19-1086 Our results reveal that when relying on SMT to perform the normalization it is beneficial to use a background corpus that is close to the genre you are ***** normalizing *****. | ||
| W17-4412 To this purpose, in this work we propose and compare the noisy channel model and the neural encoder-decoder model as ***** normalizing ***** methods. | ||
| 2021.wanlp-1.49 The findings show that despite the simplicity of the proposed approach, using the LSVC model with a ***** normalizing ***** Arabic (NA) preprocessing and the BiLSTM architecture with an Embedding layer as input have yielded an encouraging F1score of 33.71% and 57.80% for sarcasm and sentiment detection, respectively. | ||
| 2020.emnlp-main.116 The existing strategies relying on the discriminative model are poorly to cope with ***** normalizing ***** combined procedure mentions. | ||
| P19-1234 This paper describes a neural PCFG inducer which employs context embeddings (Peters et al., 2018) in a ***** normalizing ***** flow model (Dinh et al., 2015) to extend PCFG induction to use semantic and morphological information | ||
| assistive | 17 | |
| W17-5533 We present the flexdiam dialogue management architecture, which was developed in a series of projects dedicated to tailoring spoken interaction to the needs of users with cognitive impairments in an everyday ***** assistive ***** domain, using a multimodal front-end. | ||
| 2021.reinact-1.5 It is also vital for ***** assistive ***** technologies to be designed with a focus on: (1) privacy, as the camera may capture a user's mail, medication bottles, or other sensitive information; (2) transparency, so that the system's behaviour can be explained and trusted by users; and (3) controllability, to tailor the system for a particular domain or user group. | ||
| 2020.lrec-1.783 Its core ***** assistive ***** feature, however, is its ability to automatically generate semantic annotations. | ||
| P18-2112 In this paper, we focus on the problem of building ***** assistive ***** systems that can help users to write reviews. | ||
| 2021.nlpmc-1.4 This could facilitate the development of automatic ***** assistive ***** technologies for these domains | ||
| PharmaCoNER | 17 | |
| D19-5701 For this task, named ***** PharmaCoNER *****, we generated annotation guidelines together with a corpus of 1,000 manually annotated clinical case studies. | ||
| D19-5708 We present a neural pipeline approach that performs named entity recognition (NER) and concept indexing (CI), which links them to concept unique identifiers (CUIs) in a knowledge base, for the ***** PharmaCoNER ***** shared task on pharmaceutical drugs and chemical entities. | ||
| D19-5707 The approach was evaluated on the ***** PharmaCoNER ***** Corpus obtaining an F-measure of 85.24% for subtask 1 and 49.36% for subtask2. | ||
| D19-5705 In this paper, we describe the system with which we participated in the first subtrack of the ***** PharmaCoNER ***** competition of the BioNLP Open Shared Tasks 2019 | ||
| D19-5709 We present the approach of the Turku NLP group to the ***** PharmaCoNER ***** task on Spanish biomedical named entity recognition. | ||
| Psycholinguistics | 17 | |
| 2020.sigdial-1.12 In this paper, we introduce a computational framework based on work from ***** Psycholinguistics *****, which is aimed at achieving proper turn-taking timing for situated agents. | ||
| L06-1083 At the MPI for ***** Psycholinguistics ***** a large archive with language resources has been created with contributions from many different individual researchers and research projects. | ||
| L12-1308 This shift from a single language resource center into a federated environment of many language resource centers is discussed in the context of a real world center: The Language Archive supported by the Max Planck Institute for ***** Psycholinguistics *****. | ||
| L06-1073 It was developed at the MPI for ***** Psycholinguistics ***** for stricter control of the archive coherence and consistency and allowing wider use of the archiving facilities without increasing the workload for archive and corpus managers | ||
| J74-2001 Personal Notes ; Computational Semantics Tutorial at Lugano in March ; Artificial Intelligence : Directory Being Compiled ( Donald E. Walker ) ; Letters : Logos Development Corporation on MT System ( Yorick Wilks ) ; Solar Project Distributes Materials ( Tim Diller ; John Olney ; Nathan Ucuzoglu ) ; NAS / NRC Studies International Information Programs ; NFAIS Meeting , Overlap Study , Indexer Training Kit ( Ben H. Weil ) ; On - line Terminal Searching Course at Pratt in January ( Patricia Breivik ) ; Educational Data Systems Association Convention ; Publication Problems : Journal Prices Rising ( Philip H. Abelson ) ; 3rd Pisa Summer School : Report of Courses , Lectures ( Antonio Zampoli ) ; Summer School at Stuttgart : Report of Lectures ( Hans - Jochen Schneider ) ; Information and Philology Conference : Report ( Marichal ) ; Ariosto Concordance in Progress ( Cesare Segre ) ; Text Data : Roundtable on Analytic Procedures Held ; SIGLASH ( Michael Lesk ; Robert Wachal ; Dolores Burton ; Karen Mullen ) ; Political Science Concepts to be Collected and Analyzed ( George J. Graham ) ; Reliable Software Conference , Los Angeles , April ( M. L. Shooman ) ; *****Psycholinguistics***** Conference , New York , January ; NSF : Excerpts from the Organizational Directory ( H. Guyford Stever ) ; Microfiche Equipment : Background Information for Buyers ( Dake Gaddy ) ; Artificial Intelligence in Poland : Bibliography ; AAAS Meeting : January ( W. M. Carlson ) ; Current Bibliography ( Brian Harris ; R. Laskowski ) | ||
| quantifier | 17 | |
| W19-0402 ULFs fully resolve the semantic type structure while leaving issues such as ***** quantifier ***** scope, word sense, and anaphora unresolved; they provide a starting point for further resolution into EL, and enable certain structural inferences without further resolution. | ||
| 2016.lilt-13.2 But because of the specific representation of meaning assumed by modeltheoretic semantics (one where a true model of the world is a priori available), research in the area has primarily focused on one question: what is the relation of a ***** quantifier ***** to the truth value of a sentence? | ||
| 2020.pam-1.6 In this paper, I show how the previous formulation gives trivial truth values when a precise ***** quantifier ***** is used with vague predicates. | ||
| W18-3815 Examples are proverbs, idiomatic constructions, normal usage examples, and, for nouns, phrases containing a ***** quantifier ***** | ||
| L12-1547 Annotating natural language sentences with ***** quantifier ***** scoping has proved to be very hard. | ||
| interannotator | 17 | |
| L10-1628 Annotators had the same level of training and experience, but ***** interannotator ***** agreement (IA) varied across words. | ||
| W19-3316 We present new guidelines for English and the results of an ***** interannotator ***** agreement study. | ||
| L12-1175 We have created and have been using VPS-30-En to explore the ***** interannotator ***** agreement potential of the Corpus Pattern Analysis. | ||
| L16-1137 This data set has been created to observe the ***** interannotator ***** agreement on PDEV patterns produced using the Corpus Pattern Analysis (Hanks, 2013). | ||
| Q14-1025 Standard agreement measures for ***** interannotator ***** reliability are neither necessary nor sufficient to ensure a high quality corpus | ||
| benchmarks | 17 | |
| 2020.acl-main.683 Our proposed approach shows competitive performance on two different language and vision tasks using public ***** benchmarks ***** and improves the state-of-the-art published results. | ||
| C18-1226 Our results confirm that weighted vector averaging can outperform context-sensitive models in most ***** benchmarks *****, but structural features encoded in RNN models can also be useful in certain classification tasks. | ||
| D19-1406 Under these circumstances, we suggest taking BLEU between input and human-written reformulations into consideration for ***** benchmarks *****. | ||
| P19-1576 While these relation vectors indeed help, we also show that lexical function classification poses a greater challenge than the syntactic and semantic relations that are typically used for ***** benchmarks ***** in the literature. | ||
| 2021.ranlp-1.101 Many of these state-of-the-art approaches are tested against ***** benchmarks ***** with labelled sentences containing tagged entities, and require important pre-training and fine-tuning on task-specific data | ||
| intra | 17 | |
| L14-1304 The result is a comprehensive database of German genitive formations, enriched with a broad range of ***** intra *****- and extralinguistic metadata. | ||
| D19-1498 An inference mechanism on the graph edges enables to learn ***** intra *****- and inter-sentence relations using multi-instance learning internally. | ||
| L06-1407 This material will allow ***** intra ***** and intercorpora comparative studies, which will make visible variations that result from discursive and pragmatic differences of each corpus and aspects of linguistic unity or diversity that characterise the spoken Portuguese of this referred five African countries. | ||
| 2021.emnlp-main.333 Most of existing extractive multi-document summarization (MDS) methods score each sentence individually and extract salient sentences one by one to compose a summary, which have two main drawbacks: (1) neglecting both the ***** intra ***** and cross-document relations between sentences; (2) neglecting the coherence and conciseness of the whole summary. | ||
| 2021.naacl-main.19 To better understand when images are useful for translation, we study image translatability of words, which we define as the translatability of words via images, by measuring ***** intra *****- and inter-cluster similarities of image representations of words that are translations of each other | ||
| homogeneity | 17 | |
| 2020.eval4nlp-1.11 We also study the effects of several variables such as normalization and data ***** homogeneity ***** on PBC. | ||
| L14-1544 We are aiming to show that distractor editing follows rules like syntactic and semantic ***** homogeneity ***** according to associated answer, and the possibility to automatically identify this ***** homogeneity *****. | ||
| 1984.bcs-1.8 One such factor is any degree of ***** homogeneity ***** the greater, the better in the texts he wishes to process. | ||
| L06-1463 Each of the 24 lists is optimized for ***** homogeneity ***** in terms of phoneme-distribution as compared to average French, and for word occurrence frequency of the employed monosyllabic keywords as derived from French language databases. | ||
| 2020.acl-main.86 We also demonstrate that allowing instances of different tasks to be interleaved as much as possible between each epoch and batch has a clear benefit in multitask performance over forcing task ***** homogeneity ***** at the epoch or batch level | ||
| supersense | 17 | |
| 2021.acl-srw.25 We evaluate our inferred representations on ***** supersense ***** prediction task. | ||
| C16-1293 In this paper, we adopt a ***** supersense ***** tagging method to annotate source words with coarse-grained ontological concepts. | ||
| W17-4106 In this paper, we propose a model to leverage various levels of input features to improve on the performance of an ***** supersense ***** tagging task. | ||
| 2020.coling-main.298 We assess the model in a task of ***** supersense ***** tagging for French nouns. | ||
| P18-1018 Unlike previous approaches, our annotations are comprehensive with respect to types and tokens of these markers; use broadly applicable ***** supersense ***** classes rather than fine-grained dictionary definitions; unite prepositions and possessives under the same class inventory; and distinguish between a marker's lexical contribution and the role it marks in the context of a predicate or scene | ||
| normative | 17 | |
| 2020.acl-main.485 We survey 146 papers analyzing “bias” in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in ***** normative ***** reasoning, despite the fact that analyzing “bias” is an inherently ***** normative ***** process. | ||
| D19-5519 As dialect is the common way of communication for people online in Finnish, such a normalization is a necessary step to improve the accuracy of the existing Finnish NLP tools that are tailored for ***** normative ***** Finnish text. | ||
| L08-1433 The ***** normative ***** part SynAF is concerned with a metamodel for syntactic annotation that covers both dimensions of constituency and dependency, and propose thus a multi-layered annotation framework that allows the combined and interrelated annotation of language data along both lines of consideration. | ||
| 2021.emnlp-main.54 To investigate whether language generation models can serve as behavioral priors for systems deployed in social settings, we evaluate their ability to generate action descriptions that achieve predefined goals under ***** normative ***** constraints. | ||
| S19-2063 We analyze the syntax, abbreviations, and informal-writing of Twitter; and perform perfect data preprocessing on the data to convert them to ***** normative ***** text | ||
| linearized | 17 | |
| 2021.emnlp-main.351 Pretrained language models (PLM) have recently advanced graph-to-text generation, where the input graph is ***** linearized ***** into a sequence and fed into the PLM to obtain its representation. | ||
| P19-1527 We encode the nested labels using a ***** linearized ***** scheme. | ||
| I17-1003 In order to assess the performance, we construct model based on an attention mechanism encoder-decoder model in which the source language is input to the encoder as a sequence and the decoder generates the target language as a ***** linearized ***** dependency tree structure. | ||
| W18-3604 Based on (Castro Ferreira et al., 2017), the approach works by first preprocessing an input dependency tree into an ordered ***** linearized ***** string, which is then realized using a statistical machine translation model. | ||
| P18-1150 The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a ***** linearized ***** AMR structure | ||
| siamese | 17 | |
| 2021.emnlp-main.467 Many recent successes in sentence representation learning have been achieved by simply fine-tuning on the Natural Language Inference (NLI) datasets with triplet loss or ***** siamese ***** loss. | ||
| D19-1410 In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that use ***** siamese ***** and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. | ||
| 2021.argmining-1.19 One component employs contrastive learning via a ***** siamese ***** neural network for matching arguments to key points; the other is a graph-based extractive summarization model for generating key points. | ||
| C16-1097 Compared to previous works, we train and test on larger and realistic datasets; and, show that ***** siamese ***** architectures consistently perform better than traditional linear classifier approach. | ||
| D19-1604 In this paper, we establish the effectiveness of using hard negatives, coupled with a ***** siamese ***** network and a suitable loss function, for the tasks of answer selection and answer triggering. | ||
| retraining | 17 | |
| 2020.findings-emnlp.443 However, regarding the fast-growing scale of models in the current NLP area, sometimes we may have difficulty ***** retraining ***** whole NLU and NLG models. | ||
| 2012.iwslt-papers.12 Fast ***** retraining ***** is particularly interesting when we want to almost instantly integrate user feed-back, for instance in a post-editing context or machine translation assisted CAT tool. | ||
| P17-1060 In experiments, the model significantly outperforms baselines that do not use domain adaptation and also performs better than the full ***** retraining ***** approach. | ||
| 2020.emnlp-main.747 We then propose two methods to further characterize rationale quality, one based on model ***** retraining ***** and one on using “fidelity curves” to reveal properties such as irrelevance and redundancy. | ||
| D19-6111 Designing parsers which can simultaneously maintain high accuracy and fast ***** retraining ***** time is challenging | ||
| simulated | 17 | |
| L06-1486 Our study is a follow up on a previous experiment conducted in a similar ***** simulated ***** environment. | ||
| 2020.sigdial-1.37 Our results show that (1) a pipeline dialog system trained using fine-grained supervision signals at different component levels often obtains better performance than the systems that use joint or end-to-end models trained on coarse-grained labels, (2) component-wise, single-turn evaluation results are not always consistent with the overall performance of a dialog system, and (3) despite the discrepancy between simulators and human users, ***** simulated ***** evaluation is still a valid alternative to the costly human evaluation especially in the early stage of development. | ||
| P18-1203 The effectiveness of our approach is demonstrated on a movie-ticket booking task in both ***** simulated ***** and human-in-the-loop settings. | ||
| L14-1039 Other work has suggested another testing technique, ***** simulated ***** speech (SS), as a supplement or an alternative to EI that can provide automated fluency metrics. | ||
| L14-1285 The LAST MINUTE corpus comprises records and transcripts of naturalistic problem solving dialogs between N = 130 subjects and a companion system ***** simulated ***** in a Wizard of Oz experiment | ||
| backdoor | 17 | |
| 2021.acl-long.431 In this work, we point out a potential problem of current ***** backdoor ***** attacking research: its evaluation ignores the stealthiness of ***** backdoor ***** attacks, and most of existing ***** backdoor ***** attacking methods are not stealthy either to system deployers or to system users. | ||
| 2021.emnlp-main.374 In addition, the style transfer-based adversarial and ***** backdoor ***** attack methods show superiority to baselines in many aspects. | ||
| 2021.acl-long.377 Recent studies show that neural natural language processing (NLP) models are vulnerable to ***** backdoor ***** attacks. | ||
| 2021.acl-long.37 These results also reveal the significant insidiousness and harmfulness of textual ***** backdoor ***** attacks. | ||
| 2021.emnlp-main.241 Pre-Trained Models have been widely applied and recently proved vulnerable under ***** backdoor ***** attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. | ||
| weighted | 17 | |
| 2021.acl-long.302 In particular, we find that ***** weighted ***** discourse trees from auxiliary tasks can benefit key NLP downstream applications, compared to nuclearity-centered approaches. | ||
| 2020.acl-main.165 We show this can be achieved by carefully designing multi-head dot-product attention modules for different domains, and eventually taking ***** weighted ***** averages of their parameters by word-level layer-wise domain proportions. | ||
| L10-1110 We treat this as a text classification problem and apply first information extraction (IE) techniques (voting using keywords weight according to their context), then machine learning (ML), and finally a combined approach in which ML has priority over ***** weighted ***** keywords, but the latter can still make up categorizations for services for which ML does not produce enough. | ||
| 2021.calcs-1.13 Our unsupervised models compete well with their supervised counterparts, with their performance reaching within 1-7% (***** weighted ***** F1 scores) when compared to supervised models trained for a two class problem. | ||
| 2019.gwc-1.29 We fit WordNet relations to word embeddings, using 3CosAvg and LRCos, two set-based methods for analogy resolution, and introduce 3CosWeight, a new, ***** weighted ***** variant of 3CosAvg | ||
| pronominal | 17 | |
| P19-1480 The coreference alignment modeling explicitly aligns coreferent mentions in conversation history with corresponding ***** pronominal ***** references in generated questions, which makes generated questions interconnected to conversation history. | ||
| 2020.udw-1.5 The identification of the subject poses a particular problem in Wolof, due to ***** pronominal ***** indices whose status as a pronoun or a ***** pronominal ***** affix is uncertain. | ||
| 1991.iwpt-1.8 The algorithm uses feature matrixes to tell ***** pronominal ***** classes apart and scores to determine the ranking of candidates for antecedenthood, as well as for restricting the behaviour of proforms and anaphors. | ||
| L10-1222 This is very promising because the experiments have been run on both written and spoken data using a classification of the ***** pronominal ***** functions which is much more fine-grained than the classifications used in other studies. | ||
| D19-1294 The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as ***** pronominal ***** anaphora, thus enabling better translations. | ||
| Abstractive summarization | 17 | |
| 2021.eacl-main.224 ***** Abstractive summarization ***** systems generally rely on large collections of document-summary pairs. | ||
| 2021.acl-srw.7 ***** Abstractive summarization ***** is the task of compressing a long document into a coherent short document while retaining salient information. | ||
| D19-1320 ***** Abstractive summarization ***** approaches based on Reinforcement Learning (RL) have recently been proposed to overcome classical likelihood maximization. | ||
| P17-1098 ***** Abstractive summarization ***** aims to generate a shorter version of the document covering all the salient points in a compact and coherent fashion. | ||
| 2021.newsum-1.1 ***** Abstractive summarization ***** models heavily rely on copy mechanisms, such as the pointer network or attention, to achieve good performance, measured by textual overlap with reference summaries | ||
| arithmetic | 17 | |
| 2021.emnlp-main.759 Numerical reasoning based machine reading comprehension is a task that involves reading comprehension along with using ***** arithmetic ***** operations such as addition, subtraction, sorting and counting. | ||
| P19-1283 We first validate our methods on the case of a simple synthetic language for ***** arithmetic ***** expressions with clearly defined syntax and semantics, and show that they exhibit the expected pattern of results. | ||
| W19-4813 We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy ***** arithmetic ***** task which we use as a sanity check, (2) a five-class sentiment prediction of movie reviews, and besides (3) we explore the usefulness of word relevances to build sentence-level representations. | ||
| 2020.fnp-1.33 Bootstrapping the parsing process to detect out- of-vocabulary terms at runtime increases parsing accuracy in addition to producing other benefits to a natural-language-processing pipeline, which translates ***** arithmetic ***** calculations written in English into computer-executable operations. | ||
| 2020.sigdial-1.34 numerical reasoning performance by training the model to predict ***** arithmetic ***** expressions, and 2 | ||
| Code | 17 | |
| 2021.emnlp-main.332 ***** Code ***** summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. | ||
| 2020.semeval-1.103 ***** Code ***** switching is a linguistic phenomenon which may occur within a multilingual setting where speakers share more than one language. | ||
| 2020.findings-emnlp.361 ***** Code ***** retrieval is a key task aiming to match natural and programming languages. | ||
| 2020.findings-emnlp.350 ***** Code ***** comments are vital for software maintenance and comprehension, but many software projects suffer from the lack of meaningful and up-to-date comments in practice. | ||
| 2020.semeval-1.125 ***** Code ***** mixing is a common phenomenon in multilingual societies where people switch from one language to another for various reasons. | ||
| UTFPR | 17 | |
| 2021.semeval-1.78 We describe the ***** UTFPR ***** systems submitted to the Lexical Complexity Prediction shared task of SemEval 2021. | ||
| 2020.semeval-1.140 We describe the ***** UTFPR ***** system for SemEval-2020's Task 7: Assessing Humor in Edited News Headlines | ||
| S19-2140 We present the ***** UTFPR ***** system for the OffensEval shared task of SemEval 2019: A character-to-word-to-sentence compositional RNN model trained exclusively over the training data provided by the organizers. | ||
| W18-6483 We present the ***** UTFPR ***** systems at the WMT 2018 parallel corpus filtering task. | ||
| W18-6224 We introduce the ***** UTFPR ***** system for the Implicit Emotions Shared Task of 2018: A compositional character-to-word recurrent neural network that does not exploit heavy and/or hard-to-obtain resources. | ||
| coarse | 17 | |
| D18-1231 The problem of entity-typing has been studied predominantly as a supervised learning problems, mostly with task-specific annotations (for ***** coarse ***** types) and sometimes with distant supervision (for fine types). | ||
| 2020.emnlp-main.121 Learning to rank helps detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model, while token-level predictions have the potential of further distinguishing between ***** coarse ***** and fine-grained divergences. | ||
| 2021.conll-1.23 In the fine labeling stage, the model expands each ***** coarse ***** label into a final label (such as VP, VP*, VV, VV*). | ||
| W17-4505 We propose a novel ***** coarse *****-to-fine attention model that hierarchically reads a document, using ***** coarse ***** attention to select top-level chunks of text and fine attention to read the words of the chosen chunks. | ||
| W19-4316 Existing approaches to training bilingual word embeddings require either large collections of pre-defined seed lexicons that are expensive to obtain, or parallel sentences that comprise ***** coarse ***** and noisy alignment | ||
| parsed corpora | 17 | |
| L10-1117 Previous work has shown that large scale subcategorisation lexicons could be extracted from ***** parsed corpora ***** with reasonably high precision. | ||
| Q18-1046 (3) Despite being computed from un***** parsed corpora *****, our learned task-specific features beat previous work's interpretable typological features that require ***** parsed corpora ***** or expert categorization of the language. | ||
| 2020.lrec-1.886 Creating this tool and enriching existing ***** parsed corpora ***** of Middle English is part of the project Borrowing of Argument Structure in Contact Situations (BASICS) which seeks to explain to which extent verbs copied from Old French had an impact on the grammar of Middle English. | ||
| L08-1113 In this work, we apply a reduction of the cost by taking profit of the bracketing information in ***** parsed corpora ***** and show machine translation results obtained with a bracketed Europarl corpus, yielding interresting improvements when increasing the number of non-terminal symbols. | ||
| L12-1421 The envisaged application of such a lexicon would be: in assigning ontological labels to syntactically ***** parsed corpora *****, and expanding the lexicon and lexical information in the Bulgarian Resource Grammar. | ||
| learner corpora | 17 | |
| 2020.lrec-1.48 The corpus provides valuable information about patterns of learner errors and can be used as a language resource for a number of research tasks, while its creation is much cheaper and faster than for traditional ***** learner corpora *****. | ||
| L12-1016 We show this method is effective in several different ***** learner corpora *****, with bigram features being particularly useful. | ||
| D19-1316 We have taken the first step by creating ***** learner corpora ***** consisting of approximately 1,900 essays where all preposition errors are manually annotated with feedback comments. | ||
| L08-1481 In the present case, corpora of three languages were to be evaluated and corrected: (1) Polish, a large automatically annotated and manually corrected single-speaker TTS unit-selection corpus in the BOSS Label File (BLF) format, (2) German and (3) English, the second and third being manually annotated multi-speaker story-telling ***** learner corpora ***** in Praat TextGrid format | ||
| N19-1132 To overcome this limitation, we evaluate the performance of several GEC models, including NMT-based (LSTM, CNN, and transformer) and an SMT-based model, against various ***** learner corpora ***** (CoNLL-2013, CoNLL-2014, FCE, JFLEG, ICNALE, and KJ). | ||
| comparative | 17 | |
| 2020.emnlp-main.589 Further, a heuristic-based program, built to exploit these patterns, had ***** comparative ***** performance to that of the neural models. | ||
| C18-1097 To investigate generalisability and to enable state of the art ***** comparative ***** evaluations, we carry out the first reproduction studies of three groups of complementary methods and perform the first large-scale mass evaluation on six different English datasets. | ||
| W17-5220 Our models are experimented on both the SemEval'16 Task 4 dataset and the Stanford Sentiment Treebank and show ***** comparative ***** or better results against the existing state-of-the-art systems. | ||
| 2021.inlg-1.24 This paper describes the shared task in detail, summarises results from each of the reproduction studies submitted, and provides further ***** comparative ***** analysis of the results. | ||
| 2001.mtsummit-eval.4 The evaluation is not ***** comparative *****, which means that we tested a specific MT system, not necessarily representative of other MT systems that can be found on the market | ||
| morphological paradigms | 17 | |
| J18-2005 Tree hierarchies are learned along with the corresponding ***** morphological paradigms ***** simultaneously. | ||
| L16-1410 The system we present learns the inflectional behavior of ***** morphological paradigms ***** from examples and converts the learned paradigms into a finite-state transducer that is able to map inflected forms of previously unseen words into lemmas and corresponding morphosyntactic descriptions. | ||
| 2020.sigmorphon-1.11 We present a model for the unsupervised dis- covery of ***** morphological paradigms *****. | ||
| 2020.acl-main.598 We further introduce a system for the task, which generates ***** morphological paradigms ***** via the following steps: (i) EDIT TREE retrieval, (ii) additional lemma retrieval, (iii) paradigm size discovery, and (iv) inflection generation. | ||
| 2020.lrec-1.483 The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized ***** morphological paradigms ***** for hundreds of diverse world languages. | ||
| multimedia | 17 | |
| L08-1043 Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore not relevant to transcribe spoken documents dealing with successive precise topics, such as long ***** multimedia ***** streams, frequently tacking reportages and debates. | ||
| L06-1444 In this paper, we look into the notion of cross-media decision mechanisms, focussing on ones that work within ***** multimedia ***** documents for a variety of applications, such as the generation of intelligent ***** multimedia ***** presentations and ***** multimedia ***** indexing. | ||
| 2020.acl-main.230 We introduce a new task, MultiMedia Event Extraction, which aims to extract events and their arguments from ***** multimedia ***** documents. | ||
| 2020.lrec-1.528 Most of the studies that address this problem rely only on textual documents while an increasing number of sources are ***** multimedia *****, in particular in the context of social media where messages are often illustrated with images. | ||
| 2016.iwslt-1.4 The search space of the ASR is blown up when ***** multimedia ***** content is encountered, resulting in large delays that compromise real-time requirements | ||
| named entity corpus | 17 | |
| C16-1049 In this paper, we first build a manually annotated ***** named entity corpus ***** of Mongolian. | ||
| 2021.emnlp-main.814 We release these cross-lingual entity pairs along with the massively multilingual tagged ***** named entity corpus ***** as a resource to the NLP community. | ||
| 2020.ccl-1.93 Constructing a well-annotated ***** named entity corpus ***** manually is very time-consuming and labor-intensive. | ||
| L14-1540 In this paper, we propose a novel method to automatically build a ***** named entity corpus ***** based on the DBpedia ontology. | ||
| L10-1143 This paper introduces a new ***** named entity corpus ***** for Dutch. | ||
| multilingual neural | 17 | |
| 2021.wmt-1.86 We prepared state-of-the-art ***** multilingual neural ***** machine translation systems for three languages (i.e. | ||
| 2020.emnlp-main.476 In this study, we revisit the ***** multilingual neural ***** machine translation model that only share modules among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. | ||
| 2017.iwslt-1.15 In this paper, we proposed two strategies which can be applied to a ***** multilingual neural ***** machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus. | ||
| 2020.findings-emnlp.283 We present a probabilistic framework for ***** multilingual neural ***** machine translation that encompasses supervised and unsupervised setups, focusing on unsupervised translation. | ||
| 2020.wmt-1.98 We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatically generated paraphrases produced by PRISM (Thompson and Post, 2020a), a ***** multilingual neural ***** machine translation system. | ||
| definition extraction | 17 | |
| L06-1066 In this paper, we will describe our annotation scheme for definitions and report on two studies: (1) a pilot study that evaluates our ***** definition extraction ***** approach using a German corpus with manually annotated definitions as a gold standard. | ||
| 2020.lrec-1.256 We demonstrate that the use of this dataset for training learning models improves the quality of ***** definition extraction ***** when these models are then used for other definition datasets. | ||
| 2020.semeval-1.93 We explore the performance of Bidirectional Encoder Representations from Transformers (BERT) at ***** definition extraction *****. | ||
| 2020.semeval-1.90 The paper describes our system BERTatDE1 in sentence classification task (subtask 1) and sequence labeling task (subtask 2) in the ***** definition extraction ***** (SemEval-2020 Task 6). | ||
| 2020.semeval-1.96 We propose a joint model to train the tasks of ***** definition extraction ***** and the word level BIO tagging simultaneously. | ||
| answer extraction | 17 | |
| C18-1014 We represent documents as trees, and model an agent that learns to interleave quick navigation through the document tree with more expensive ***** answer extraction *****. | ||
| K18-1042 This architecture benefits from a bilateral attention mechanism which helps the model to focus on a question and the answer sentence at the same time for phrasal ***** answer extraction *****. | ||
| 2021.acl-long.519 We also achieve good results on ***** answer extraction *****, outperforming recent models like REALM and RAG by 3+ points. | ||
| D18-1453 From this study, we observed that (i) the baseline performances for the hard subsets remarkably degrade compared to those of entire datasets, (ii) hard questions require knowledge inference and multiple-sentence reasoning in comparison with easy questions, and (iii) multiple-choice questions tend to require a broader range of reasoning skills than ***** answer extraction ***** and description questions. | ||
| 2021.ranlp-srw.26 Our work aims to leverage a popular approach used for general question answering, ***** answer extraction *****, in order to find answers to temporal questions within a paragraph. | ||
| distantly supervised | 17 | |
| D17-1191 In this paper, we design a novel convolutional neural network (CNN) with residual learning, and investigate its impacts on the task of ***** distantly supervised ***** noisy relation extraction. | ||
| 2021.naacl-main.2 We propose a multi-task, probabilistic approach to facilitate ***** distantly supervised ***** relation extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. | ||
| P17-1166 In ***** distantly supervised ***** scenario, one entity tuple may have multiple relation facts. | ||
| 2021.acl-long.483 Therefore, the performance of ***** distantly supervised ***** RE models is bounded. | ||
| N19-1294 In this work, we propose to regularize ***** distantly supervised ***** models with Compact Latent Space Clustering (CLSC) to bypass this problem yet effectively utilize noisy data. | ||
| syntactic structures | 17 | |
| N19-1018 The parsing strategy is based on the assumption that most ***** syntactic structures ***** can be parsed incrementally and that the set –the memory of the parser– remains reasonably small on average. | ||
| W16-5406 Paratactic ***** syntactic structures ***** are difficult to represent in syntactic dependency tree structures. | ||
| L16-1566 Our higher-order parsing model, gaining thus up to 4 points, establishes the state of the art for parsing French deep ***** syntactic structures *****. | ||
| 2021.wanlp-1.27 Previous work on CEAE has shown the cross-lingual benefits of universal dependency trees in capturing shared ***** syntactic structures ***** of sentences across languages. | ||
| 2020.lrec-1.636 We provide data statistics and demonstrate differences between the new dataset and existing out-of-domain test sets annotated with TIGER ***** syntactic structures *****. | ||
| relevant information | 17 | |
| P17-1172 In this paper, we present an approach of reading text while skipping ir***** relevant information ***** if needed. | ||
| 2021.dash-1.1 Domain-specific conceptual bases use key concepts to capture domain scope and ***** relevant information *****. | ||
| D17-1174 One of the most pressing issues in discontinuous constituency transition-based parsing is that the ***** relevant information ***** for parsing decisions could be located in any part of the stack or the buffer. | ||
| 2021.newsum-1.11 Using keyphrase extraction and semantic role labeling (SRL), we find that SRL captures ***** relevant information ***** without overwhelming the model architecture. | ||
| R17-1047 Extracting the ***** relevant information ***** from these emails would let users track their journeys and important updates on applications installed on their devices to give them a consolidated over view of their itineraries and also save valuable time. | ||
| optical character | 17 | |
| 2020.findings-emnlp.184 The experimental results show that the proposed approach achieves a new state-of-the-art performance on three benchmark datasets, as well as an ***** optical character ***** recognition dataset. | ||
| 2020.emnlp-main.268 To better align social media style texts and images, we propose: (1) a novel Multi-Modality MultiHead Attention (M3H-Att) to capture the intricate cross-media interactions; (2) image wordings, in forms of ***** optical character *****s and image attributes, to bridge the two modalities. | ||
| 2020.coling-main.278 Existing text VQA systems generate an answer by selecting from ***** optical character ***** recognition (OCR) texts or a fixed vocabulary. | ||
| 2020.aacl-srw.9 A corpus of sentences was created after correcting errors in the text scanned through ***** optical character ***** reading (OCR). | ||
| 2021.acl-long.320 The final system integrates ***** optical character ***** recognition (OCR), active-learning-based text classification, and geographic information system visualization in order to effectively extract, query, and visualize this information for any area of interest. | ||
| logic | 17 | |
| W17-7902 Market pressure on translation productivity joined with techno***** logic *****al innovation is likely to fragment and decontextualise translation jobs even more than is cur-rently the case. | ||
| 1993.iwpt-1.4 For the ***** logic *****al language we use the language EL, defined and implemented earlier for computational semantic purposes. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional ***** logic *****, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. | ||
| R19-1156 In this paper, we present a two-level morpho***** logic *****al analyzer for Turkish. | ||
| D17-1305 In the *****logic***** approach to Recognizing Textual Entailment, identifying phrase-to-phrase semantic relations is still an unsolved problem. | ||
| cultural heritage | 17 | |
| 2021.wmt-1.43 The system aims to solve the Subtask 2: Wikipedia ***** cultural heritage ***** articles, which involves translation in four Romance languages: Catalan, Italian, Occitan and Romanian. | ||
| 2021.latechclfl-1.16 Knowledge Graphs for this ***** cultural heritage ***** domain, when being developed with appropriate ontologies and vocabularies, enable to integrate and reconcile this diverse information. | ||
| K18-1024 As a precious part of the human ***** cultural heritage *****, Chinese poetry has influenced people for generations. | ||
| L08-1420 We describe the process of converting plain text ***** cultural heritage ***** data to elements of a domain-specific knowledge base, using general machine learning techniques. | ||
| L08-1035 In the context of the CATCH research program that is currently carried out at a number of large Dutch ***** cultural heritage ***** institutions our ambition is to combine and exchange heterogeneous multimedia annotations between projects and institutions. | ||
| medical text | 17 | |
| L16-1068 In ***** medical text *****, the process of manual chart review (of a patient's medical record) is error-prone due to its complexity. | ||
| L10-1229 Although several studies have focused on processing negation in bio***** medical text *****s, we are not aware of publicly available resources that describe the scope of negation cues in detail. | ||
| 2021.ranlp-srw.30 For the “Diagnosis” section a deep learning text-based encoding into ICD-10 codes is applied using MBG-ClinicalBERT - a fine-tuned ClinicalBERT model for Bulgarian ***** medical text *****. | ||
| W18-2323 In recent years, a surge of interest in how to learn good embeddings and evaluate embedding quality based on English medical text has become increasingly evident; however, a limited number of studies based on Chinese *****medical text*****, particularly Chinese clinical records, were performed. | ||
| N18-1001 We study the problem of named entity recognition (NER) from electronic medical records, which is one of the most fundamental and critical problems for *****medical text***** mining. | ||
| implicit discourse | 17 | |
| P19-1065 We firstly propose a method to automatically extract the ***** implicit discourse ***** relation argument pairs and labels from a dataset of dialogic turns, resulting in a novel corpus of discourse relation pairs; the first of its kind to attempt to identify the discourse relations connecting the dialogic turns in open-domain discourse. | ||
| E17-1027 Inferring ***** implicit discourse ***** relations in natural language text is the most difficult subtask in discourse parsing. | ||
| 2021.unimplicit-1.1 In the current study, we perform ***** implicit discourse ***** relation classification without relying on any labeled implicit relation. | ||
| P19-1058 In the literature, most of the previous studies on English ***** implicit discourse ***** relation recognition only use sentence-level representations, which cannot provide enough semantic information in Chinese due to its unique paratactic characteristics. | ||
| 2020.lrec-1.145 (2) In expectation of knowledge transfer from explicit discourse relations to ***** implicit discourse ***** relations, we add a task named explicit connective prediction at the additional pre-training step. | ||
| speech technology | 17 | |
| W17-4414 NSWs pose a challenge to the proper functioning of text-to-***** speech technology *****, and the solution is to spell them out in such a way that they can be pronounced appropriately. | ||
| L12-1042 Speech-text alignment tools are frequently used in ***** speech technology ***** and research. | ||
| L14-1284 Here, we propose such an evaluation toolbox, drawing ideas from both ***** speech technology ***** and natural language processing. | ||
| L08-1529 Corpus-based methods are increasingly used for *****speech technology***** applications and for the development of theoretical or computer models of spoken languages. | ||
| L10-1234 Language resources are typically defined and created for application in *****speech technology***** contexts, but the documentation of languages which are unlikely ever to be provided with enabling technologies nevertheless plays an important role in defining the heritage of a speech community and in the provision of basic insights into the language oriented components of human cognition. | ||
| tree structure | 17 | |
| 2020.codi-1.8 We propose a neural network based approach to learn the match between pairs of discourse ***** tree structure *****s. | ||
| D18-1509 In this paper, we (1) propose an NMT model that can naturally generate the topology of an arbitrary ***** tree structure ***** on the target side, and (2) experiment with various target ***** tree structure *****s. | ||
| 2021.emnlp-main.317 The aspect and opinion words are expected to be closer along such ***** tree structure ***** compared to the standard dependency parse tree. | ||
| C16-1022 We also induce embeddings to generalize over elementary ***** tree structure *****s and exploit a tree recurrence over the input structure to model long-distance influences between NLG choices. | ||
| W16-5406 Paratactic syntactic structures are difficult to represent in syntactic dependency ***** tree structure *****s. | ||
| low resource language | 17 | |
| 2020.repl4nlp-1.16 We find that better models for ***** low resource language *****s require more efficient pretraining techniques or more data. | ||
| 2020.loresmt-1.5 Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) organized shared tasks of ***** low resource language ***** pair translation using zero-shot NMT. | ||
| P18-5008 We will also present EL methods that work for both name tagging and linking in very ***** low resource language *****s. | ||
| 2020.acl-main.747 We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and ***** low resource language *****s at scale. | ||
| 2020.loresmt-1.9 The corpus preparation is one of the important challenging tasks for the domain of machine translation, especially in *****low resource language***** scenarios. | ||
| poetry | 17 | |
| W19-4727 Here, we provide a large corpus of German ***** poetry ***** which consists of about 75k poems with more than 11 million tokens, with poems ranging from the 16th to early 20th century. | ||
| W16-4006 In part of this work, we construct a corpus of Manyosyu, which is an old Japanese ***** poetry ***** anthology. | ||
| C16-1100 Chinese ***** poetry ***** generation is a very challenging task in natural language processing. | ||
| 2021.emnlp-main.97 Recent text generation research has increasingly focused on open-ended domains such as story and ***** poetry ***** generation. | ||
| W17-7802 The UAIC-RoDia-DepTb is a balanced treebank, containing texts in non-standard language: 2,575 chats sentences, old Romanian texts (a Gospel printed in 1648, a codex of laws printed in 1818, a novel written in 1910), regional popular ***** poetry *****, legal texts, Romanian and foreign fiction, quotations. | ||
| image description | 17 | |
| L16-1489 Such an evaluation process affords various insights into the ***** image description ***** datasets and evaluation metrics, such as the variations of ***** image description *****s within and across datasets and also what the metrics capture. | ||
| 2020.coling-main.280 We propose a way to build an image-specific representation of the geographic context and adapt the caption generation network to produce appropriate geographic names in the ***** image description *****s. | ||
| P17-1175 We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between ***** image description ***** and translation. | ||
| C16-1124 We present a model of visually-grounded language learning based on stacked gated recurrent neural networks which learns to predict visual features given an ***** image description ***** in the form of a sequence of phonemes. | ||
| E17-4001 I present a general model of the human ***** image description ***** process, and propose to study this process using corpus analysis, experiments, and computational modeling. | ||
| experiments | 17 | |
| 2021.acl-long.96 We also carry out multiple ***** experiments ***** to measure how much each augmentation strategy improves the performance of automatic scoring systems. | ||
| C18-1281 Our ***** experiments ***** show how annotators diverge in language annotation tasks due to a range of ineliminable factors. | ||
| 2021.alta-1.26 Our empirical ***** experiments ***** reveal that these modern pretrained language models suffer from high variance, and the ensemble method can improve the model performance. | ||
| 2020.emnlp-main.748 We hope that these architectures and ***** experiments ***** may serve as strong points of comparison for future work. | ||
| L14-1645 Along with the methodology for coping with this diversity in the speech data, we also describe a set of ***** experiments ***** performed in order to investigate the efficiency of different approaches for automatic data pruning. | ||
| person | 17 | |
| E17-1062 This study introduces a statistical model able to generate variations of a proper name by taking into account the ***** person ***** to be mentioned, the discourse context and variation. | ||
| N18-3019 Extensive experimentation over a dataset of 10 domains drawn from data relevant to our commercial ***** person *****al digital assistant shows that our BoE models outperform the baseline models with a statistically significant average margin of 5.06% in absolute F1-score when training with 2000 instances per domain, and achieve an even higher improvement of 12.16% when only 25% of the training data is used. | ||
| 2021.wassa-1.26 We explicitly examine the impact of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and ***** person *****ality detection. | ||
| P18-1205 Chit-chat models are known to have several problems: they lack specificity, do not display a consistent ***** person *****ality and are often not very captivating. | ||
| C18-1156 Also, since the sarcastic nature and form of expression can vary from ***** person ***** to ***** person *****, CASCADE utilizes user embeddings that encode stylometric and ***** person *****ality features of users. | ||
| general purpose | 17 | |
| W19-3601 The result of this work is a ***** general purpose ***** lexicon – IgboSentilex. | ||
| W19-4028 In this context, this paper presents an effort to build a ***** general purpose ***** AMR-annotated corpus for Brazilian Portuguese by translating and adapting AMR English guidelines. | ||
| D19-5504 We present the first results on adapting a ***** general purpose ***** neural GEC system to both the proficiency level and the first language of a writer, using only a few thousand annotated sentences. | ||
| 2020.acl-main.589 We propose Differentiable Window, a new neural module and ***** general purpose ***** component for dynamic window selection. | ||
| L14-1320 Although data sets for ***** general purpose ***** anaphora resolution exist, they are not suitable for dialogue based Intelligent Tutoring Systems. | ||
| electronic medical | 17 | |
| W19-1902 The shift to ***** electronic medical ***** records (EMRs) has engendered research into machine learning and natural language technologies to analyze patient records, and to predict from these clinical outcomes of interest. | ||
| D18-1258 We demonstrate an instance of this methodology in generating a large-scale QA dataset for ***** electronic medical ***** records by leveraging existing expert annotations on clinical notes for various NLP tasks from the community shared i2b2 datasets. | ||
| 2020.lrec-1.561 After annotating around 1100 ***** electronic medical ***** records following the annotation scheme, we demonstrated its feasibility using an NER task. | ||
| 2021.bionlp-1.23 Chinese word segmentation (CWS) and medical concept recognition are two fundamental tasks to process Chinese ***** electronic medical ***** records (EMRs) and play important roles in downstream tasks for understanding Chinese EMRs. | ||
| N18-3001 In this work, we present our work to build a machine learning based scalable system for predicting ICD-10 codes from ***** electronic medical ***** records. | ||
| multilingual language | 17 | |
| 2020.findings-emnlp.364 We compare the two baselines with key configurations and find that: automatic Vietnamese word segmentation improves the parsing results of both baselines; the normalized pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema linking; latent syntactic features extracted from a neural dependency parser for Vietnamese also improve the results; and the monolingual language model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) helps produce higher performances than the recent best ***** multilingual language ***** model XLM-R (Conneau et al., 2020). | ||
| 2021.eacl-main.189 Using a novel layer ablation technique and analyses of the model's internal representations, we show that multilingual BERT, a popular ***** multilingual language ***** model, can be viewed as the stacking of two sub-networks: a multilingual encoder followed by a task-specific language-agnostic predictor. | ||
| 2020.acl-main.558 Recent advances in pre-trained ***** multilingual language ***** models lead to state-of-the-art results on the task of quality estimation (QE) for machine translation. | ||
| 2021.calcs-1.14 The NLP community has witnessed steep progress in a variety of tasks across the realms of monolingual and ***** multilingual language ***** processing recently. | ||
| L06-1227 The focus is on aspects of the W3C i18n Activity which are of benefit for the creation and manipulation of ***** multilingual language ***** resources. | ||
| source code | 17 | |
| N19-1105 The datasets and ***** source code ***** can be obtained from https://github.com/thunlp/Adv-ED. | ||
| D19-1307 The learned reward function and our ***** source code ***** are available at https://github.com/yg211/summary-reward-no-reference. | ||
| 2021.emnlp-main.48 We make the ***** source code ***** and dataset splits accessible. | ||
| D19-1584 The ***** source code ***** can be obtained from https://github.com/thunlp/HMEAE. | ||
| 2020.coling-main.194 Our ***** source code *****s are available online. | ||
| units | 17 | |
| 2008.amta-papers.19 We also build a cascaded translation model that dynamically shifts translation ***** units ***** from phrase level to word and morpheme phrase levels. | ||
| C16-1289 Upon the generated source and target phrase structures, we stack a convolutional neural network to integrate vector representations of linguistic ***** units ***** on the structures into bilingual phrase embeddings. | ||
| P19-1410 Our framework comprises a discourse segmenter to identify the elementary discourse ***** units ***** (EDU) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. | ||
| L06-1451 We focus here on criteria to identify and delimit reasonable subword ***** units *****, to group them into functionally adequate synonymy classes and relate them by two types of lexical relations. | ||
| 2020.acl-main.569 In particular, we cast discourse parsing as a recursive split point ranking task, where a split point is classified to different levels according to its rank and the elementary discourse ***** units ***** (EDUs) associated with it are arranged accordingly. | ||
| predicting | 17 | |
| W19-6129 Finally we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and methods of ***** predicting ***** prosodic prominence from text. | ||
| C16-1121 We present a successful collaboration of word embeddings and co-training to tackle in the most difficult test case of semantic role labeling: ***** predicting ***** out-of-domain and unseen semantic frames. | ||
| 2020.emnlp-main.189 However, they sometimes result in ***** predicting ***** the correct answer text but in a context irrelevant to the given question. | ||
| W18-3028 We show that the notions of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when ***** predicting ***** across languages. | ||
| W19-4819 Here we present a suite of experiments probing whether neural language models trained on linguistic data induce these stack-like data structures and deploy them while incrementally ***** predicting ***** words. | ||
| computational social | 17 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and ***** computational social ***** science. | ||
| P18-1066 The framework could be useful for machine translation applications and research in ***** computational social ***** science. | ||
| 2021.emnlp-main.788 Understanding differences of viewpoints across corpora is a fundamental task for ***** computational social ***** sciences. | ||
| D19-1661 However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within ***** computational social ***** science and digital humanities. | ||
| W18-4512 We present a simple approach to the generation and labeling of extraction patterns for coding political event data, an important task in ***** computational social ***** science. | ||
| speech and language | 17 | |
| N19-1367 There is growing evidence that changes in ***** speech and language ***** may be early markers of dementia, but much of the previous NLP work in this area has been limited by the size of the available datasets. | ||
| L16-1710 It does so by facilitating the cooperation between documentary and theoretical linguistics, and ***** speech and language ***** technologies research and development, in particular for low-resourced and endangered languages. | ||
| W18-5804 Computational Language Documentation attempts to make the most recent research in *****speech and language***** technologies available to linguists working on language preservation and documentation. | ||
| 2020.acl-srw.3 Aphasia is a *****speech and language***** disorder which results from brain damage, often characterized by word retrieval deficit (anomia) resulting in naming errors (paraphasia). | ||
| 2020.lrec-1.814 Although many efforts have been made in the last decade to enhance the *****speech and language***** resources for Romanian, this language is still considered under-resourced. | ||
| student | 17 | |
| W17-5001 We also find that difficulty is mirrored in the amount of variation in ***** student ***** answers, which can be computed before grading. | ||
| R19-1021 The ability to produce high-quality publishable material is critical to academic success but many Post-Graduate ***** student *****s struggle to learn to do so. | ||
| 2021.teachingnlp-1.16 Introducing biomedical informatics (BMI) ***** student *****s to natural language processing (NLP) requires balancing technical depth with practical know-how to address application-focused needs. | ||
| C18-2020 Therefore, we offer an automatic simultaneous interpretation service for ***** student *****s. | ||
| L10-1267 The aim of this study was to assess the retrieval effectiveness of nursing ***** student *****s in the Dutch-speaking part of Belgium. | ||
| phrase translation table | 17 | |
| L12-1393 The proposed method uses a word lattice representation of the pivot-language candidates and word lattice decoding to deal with the ambiguity; the lattice expansion is accomplished by using a pivot-target *****phrase translation table***** to compensate for the incompleteness. | ||
| L10-1482 Then, as a toolkit of a phrase-based SMT (Statistical Machine Translation) model, Moses is applied and Japanese-English translation pairs are obtained in the form of a *****phrase translation table*****. | ||
| W17-5709 The selected phrases are then replaced with tokens during training and post-translated by the *****phrase translation table***** of SMT. | ||
| 2008.amta-papers.3 Two string-to-chunks translation models are proposed: a factored model, which augments phrase-based SMT with layered dependencies, and a joint model, that extends the *****phrase translation table***** with microtags, i.e. per-word projections of chunk labels. | ||
| 2008.amta-papers.14 The approach taken in the proposed technique is based on integrating the *****phrase translation table***** of a state-of-the-art statistical phrase-based machine translation model, and compositional translation generation based on an existing bilingual lexicon for human use. | ||
| episodic logic | 17 | |
| W19-0402 This document describes underspecified logical forms (ULF) for *****Episodic Logic***** (EL), which is an initial form for a semantic representation that balances these needs. | ||
| W19-3306 Abstract Unscoped *****episodic logic*****al form (ULF) is a semantic representation capturing the predicate-argument structure of English within the *****episodic logic***** formalism in relation to the syntactic structure, while leaving scope, word sense, and anaphora unresolved. | ||
| 2021.naloma-1.1 Our schemas are represented with *****Episodic Logic*****, a logical form that closely mirrors natural language. | ||
| 2021.naloma-1.5 We present a method of making natural logic inferences from Unscoped Logical Form of *****Episodic Logic*****. | ||
| 2021.iwcs-1.18 “*****Episodic Logic*****: Unscoped Logical Form” (EL-ULF) is a semantic representation capturing predicate-argument structure as well as more challenging aspects of language within the *****Episodic Logic***** formalism. | ||
| neural tensor network | 17 | |
| C16-1276 We show that adaptive composition model improves standard solution such as *****neural tensor network***** in terms of translation accuracy. | ||
| N18-1047 Although *****neural tensor networks***** (NTNs) have been successful in many NLP tasks, they require a large number of parameters to be estimated, which often leads to overfitting and a long training time. | ||
| E17-2083 In this paper we present a cross-lingual extension of a *****neural tensor network***** model for knowledge base completion. | ||
| C18-1046 In this paper, we propose a novel *****neural Tensor network***** framework with Interactive Attention and Sparse Learning (TIASL) for implicit discourse relation recognition. | ||
| 2021.mtsummit-research.15 We carry out domain classification for computing sentence weights with 1) language model cross entropy difference 2) a convolutional neural network 3) a Recursive *****Neural Tensor Network*****. | ||
| multi-hop reading comprehension | 17 | |
| 2021.emnlp-main.490 How can we generate concise explanations for *****multi-hop Reading Comprehension***** (RC)? | ||
| P19-1260 *****Multi-hop reading comprehension***** (RC) across documents poses new challenge over single-document RC because it requires reasoning over multiple documents to reach the final answer. | ||
| P19-1263 We propose a novel, path-based reasoning approach for the *****multi-hop reading comprehension***** task where a system needs to combine facts from multiple passages to answer a question. | ||
| P19-1416 *****Multi-hop reading comprehension***** (RC) questions are challenging because they require reading and reasoning over multiple paragraphs. | ||
| P19-1261 *****Multi-hop reading comprehension***** requires the model to explore and connect relevant information from multiple sentences/documents in order to answer the question about the context. | ||
| task-oriented | 17 | |
| 2021.naacl-main.124 Existing dialogue corpora and models are typically designed under two disjoint motives: while *****task-oriented***** systems focus on achieving functional goals (e.g., booking hotels), open-domain chatbots aim at making socially engaging conversations. | ||
| D19-3031 We present PolyResponse, a conversational search engine that supports *****task-oriented***** dialogue. | ||
| N18-3004 In *****task-oriented***** dialog, agents need to generate both fluent natural language responses and correct external actions like database queries and updates. | ||
| 2020.sigdial-1.3 Natural language generators (NLGs) for *****task-oriented***** dialogue typically take a meaning representation (MR) as input, and are trained end-to-end with a corpus of MR/utterance pairs, where the MRs cover a specific set of dialogue acts and domain attributes. | ||
| N19-1178 Learning a shared dialog structure from a set of *****task-oriented***** dialogs is an important challenge in computational linguistics. | ||
| knowledge base (KB) | 17 | |
| D17-1307 First-order factoid question answering assumes that the question can be answered by a single fact in a *****knowledge base (KB*****). | ||
| 2021.emnlp-main.296 The key challenge of question answering over knowledge bases (KBQA) is the inconsistency between the natural language questions and the reasoning paths in the *****knowledge base (KB*****). | ||
| 2020.acl-main.760 In traditional approaches to entity linking, linking decisions are based on three sources of information: the similarity of the mention string to an entity's name, the similarity of the context of the document to the entity, and broader information about the *****knowledge base (KB*****). | ||
| 2021.naacl-main.65 In entity linking, mentions of named entities in raw text are disambiguated against a *****knowledge base (KB*****). | ||
| 2020.lrec-1.528 The task of Entity linking, which aims at associating an entity mention with a unique entity in a *****knowledge base (KB*****), is useful for advanced Information Extraction tasks such as relation extraction or event detection. | ||
| semantic role labeling (SRL) | 17 | |
| P17-1044 We introduce a new deep learning model for *****semantic role labeling (SRL*****) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. | ||
| 2021.iwcs-1.20 Active learning has been shown to reduce annotation requirements for numerous natural language processing tasks, including *****semantic role labeling (SRL*****). | ||
| 2021.spnlp-1.8 In this work, we empirically compare span extraction methods for the task of *****semantic role labeling (SRL*****). | ||
| W18-0530 We present a novel rule-based system for automatic generation of factual questions from sentences, using *****semantic role labeling (SRL*****) as the main form of text analysis. | ||
| D19-1538 Recently, *****semantic role labeling (SRL*****) has earned a series of success with even higher performance improvements, which can be mainly attributed to syntactic integration and enhanced word representation. | ||
| Machine Translation | 17 | |
| 1999.mtsummit-1.96 We describe a *****Machine Translation***** framework aimed at the rapid development of large scale robust machine translation systems for assimilation purposes, where the MT system is incorporated as one of the tools in an analyst's workstation. | ||
| L06-1355 This article outlines the evaluation protocol and provides the main results of the French Evaluation Campaign for *****Machine Translation***** Systems, CESTA. | ||
| 2000.amta-papers.11 *****Machine Translation***** evaluation has been more magic and opinion than science. | ||
| 2021.wat-1.6 In this paper, we report the experimental results of *****Machine Translation***** models conducted by a NECTEC team for the translation tasks of WAT-2021. | ||
| N18-5015 *****Machine Translation***** systems are usually evaluated and compared using automated evaluation metrics such as BLEU and METEOR to score the generated translations against human translations. | ||
| mobile | 17 | |
| 2021.eacl-main.250 Recently, there has been a strong interest in developing natural language applications that live on personal devices such as *****mobile***** phones, watches and IoT with the objective to preserve user privacy and have low memory. | ||
| 2014.amta-wptp.16 We present Kanjingo, a *****mobile***** app for post-editing currently running under iOS. | ||
| C16-2039 In this paper, we introduce papago - a translator for *****mobile***** device which is equipped with new features that can provide convenience for users. | ||
| 2021.emnlp-demo.41 Although such application cases often use crowd-sourcing mechanisms to gather a variety of annotators, most real-world users use *****mobile***** devices. | ||
| L08-1386 In this paper we present a usability measure adapted to *****mobile***** services, which is based on the well-known theoretical framework defined in the ISO 9241-11 standard. | ||
| code-mixed | 17 | |
| 2021.ranlp-srw.1 Computational humor generation is one of the hardest tasks in natural language generation, especially in *****code-mixed***** languages. | ||
| 2021.dravidianlangtech-1.47 This paper describes the models submitted by the team MUCS for Offensive Language Identification in Dravidian Languages - EACL 2021 shared task that aims at identifying and classifying *****code-mixed***** texts of three language pairs namely, Kannada-English (Kn-En), Malayalam-English (Ma-En), and Tamil-English (Ta-En) into six predefined categories (5 categories in Ma-En language pair). | ||
| 2021.wnut-1.48 The increasing use of social media sites in countries like India has given rise to large volumes of *****code-mixed***** data. | ||
| W18-6107 Building tools for *****code-mixed***** data is rapidly gaining popularity in the NLP research community as such data is exponentially rising on social media. | ||
| D18-1344 We compare three existing bilingual word embedding approaches, and a novel approach of training skip-grams on synthetic code-mixed text generated through linguistic models of code-mixing, on two tasks - sentiment analysis and POS tagging for *****code-mixed***** text. | ||
| Sequence-to-sequence | 17 | |
| W17-4505 *****Sequence-to-sequence***** models with attention have been successful for a variety of NLP problems, but their speed does not scale well for tasks with long source sequences such as document summarization. | ||
| 2021.naacl-main.435 *****Sequence-to-sequence***** models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data. | ||
| 2020.sigmorphon-1.18 *****Sequence-to-sequence***** models have proven to be highly successful in learning morphological inflection from examples as the series of SIGMORPHON/CoNLL shared tasks have shown. | ||
| D17-1235 *****Sequence-to-sequence***** models have been applied to the conversation response generation problem where the source sequence is the conversation history and the target sequence is the response. | ||
| 2020.acl-main.457 *****Sequence-to-sequence***** models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive. | ||
| e- | 17 | |
| D18-1377 Sentiment analysis has immense implications in *****e-*****commerce through user feedback mining. | ||
| 2021.naacl-industry.35 Advertising on *****e-*****commerce and social media sites deliver ad impressions at web scale on a daily basis driving value to both shoppers and advertisers. | ||
| 2020.aacl-demo.3 Despite the growth of *****e-*****commerce, brick-and-mortar stores are still the preferred destinations for many people. | ||
| N19-1242 Question-answering plays an important role in *****e-*****commerce as it allows potential customers to actively seek crucial information about products or services to help their purchase decision making. | ||
| 2020.ecnlp-1.8 In *****e-*****commerce, recommender systems have become an indispensable part of helping users explore the available inventory. | ||
| Italian | 17 | |
| L06-1307 The paper reports on the results of the exploitation of two *****Italian***** lexicons ( ItalWordNet and SIMPLE - CLIPS ) in an Open - Domain Question Answering application for Italian . | ||
| L10-1492 In this paper we report the first results of an annotation exercise of argument coercion phenomena performed on *****Italian***** texts . | ||
| L14-1447 The goal of this paper is to propose a classification of the syntactic alternations admitted by the most frequent *****Italian***** verbs . | ||
| L16-1148 This paper introduces LexFr , a corpus - based French lexical resource built by adapting the framework LexIt , originally developed to describe the combinatorial potential of *****Italian***** predicates . | ||
| L12-1131 Regularities in position and level of prosodic prominences associated to patterns of Information Structure are identified for some *****Italian***** varieties . | ||
| natural language understanding (NLU) | 17 | |
| 2020.acl-main.769 Several recent studies have shown that strong *****natural language understanding (NLU)***** models are prone to relying on unwanted dataset biases without learning the underlying task, resulting in models that fail to generalize to out-of-domain datasets and are likely to perform poorly in real-world scenarios. | ||
| 2020.coling-main.419 The advent of *****natural language understanding (NLU)***** benchmarks for English, such as GLUE and SuperGLUE allows new NLU models to be evaluated across a diverse set of tasks. | ||
| 2021.nlp4convai-1.22 Entity tags in human-machine dialog are integral to *****natural language understanding (NLU)***** tasks in conversational assistants. | ||
| L16-1501 Annotated in-domain corpora are crucial to the successful development of dialogue systems of automated agents, and in particular for developing *****natural language understanding (NLU)***** components of such systems. | ||
| L10-1631 We present an evaluation framework to enable developers of information seeking, transaction based spoken dialogue systems to compare the robustness of *****natural language understanding (NLU)***** approaches across varying levels of word error rate and contrasting domains. | ||
| multi-task learning | 17 | |
| 2021.naacl-demos.1 We present the first *****multi-task learning***** model named PhoNLP for joint Vietnamese part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing. | ||
| 2020.coling-main.43 We present a *****multi-task learning***** framework to enable the training of one universal incremental dialogue processing model with four tasks of disfluency detection, language modelling, part-of-speech tagging and utterance segmentation in a simple deep recurrent setting. | ||
| D19-1337 Based on the attention-based pointer generator model, we propose to incorporate an auxiliary task of language modeling to help question generation in a hierarchical *****multi-task learning***** structure. | ||
| W19-5049 This paper presents a *****multi-task learning***** approach to natural language inference (NLI) and question entailment (RQE) in the biomedical domain. | ||
| S18-1058 We take a *****multi-task learning***** approach to the shared Task 1 at SemEval-2018. | ||
| second language | 17 | |
| W18-5005 The role of alignment between interlocutors in *****second language***** learning is different to that in fluent conversational dialogue. | ||
| W18-0534 In this paper we present NLI-PT, the first Portuguese dataset compiled for Native Language Identification (NLI), the task of identifying an author's first language based on their *****second language***** writing. | ||
| L14-1463 A system for human machine interaction is presented, that offers *****second language***** learners of Italian the possibility of assessing their competence by performing a map task, namely by guiding a virtual follower through a map with written instructions in natural language. | ||
| 2020.nlptea-1.14 In the process of learning Chinese, *****second language***** learners may have various grammatical errors due to the negative transfer of native language. | ||
| I17-2031 Assessing summaries is a demanding, yet useful task which provides valuable information on language competence, especially for *****second language***** learners. | ||
| grading | 16 | |
| 2021.alta-1.26 In this paper, we investigate the utility of modern pretrained language models for the evidence ***** grading ***** system in the medical literature based on the ALTA 2021 shared task. | ||
| 2020.coling-main.535 Automated essay scoring (AES) is the task of automatically assigning scores to essays as an alternative to ***** grading ***** by human raters. | ||
| W18-3718 In this paper, we report a short answer ***** grading ***** system in Chinese. | ||
| W16-4904 We address the problem of automatic short answer ***** grading *****, evaluating a collection of approaches inspired by recent advances in distributional text representations. | ||
| D19-1628 Empirical evaluation on multi-domain datasets shows that task-specific fine-tuning on the enhanced pre-trained language model achieves superior performance for short answer ***** grading ***** | ||
| XGBoost | 16 | |
| S19-2203 Our system uses the embeddings obtained from Universal Sentence Encoder combined with ***** XGBoost ***** for the classification sub-task A. We also evaluate other combinations of embeddings and off-the-shelf machine learning algorithms to demonstrate the efficacy of the various representations and their combinations. | ||
| W17-5234 In stage2, we use two regression models including linear regression and ***** XGBoost *****. | ||
| S19-2159 We developed an ***** XGBoost ***** based system which uses character and word level n-gram features represented using TF-IDF, count vector based correlation matrix, and predicts if an input news article is a hyperpartisan news article. | ||
| 2020.vardial-1.23 From simple models for regression, such as Support Vector Regression, to deep neural networks, such as Long Short-Term Memory networks and character-level convolutional neural networks, and, finally, to ensemble models based on meta-learners, such as ***** XGBoost *****, our interest is focused on approaching the problem from a few different perspectives, in an attempt to minimize the prediction error. | ||
| 2020.semeval-1.197 This paper also compares the performances of different deep learning model architectures, such as the Bi-LSTM, LSTM, BERT, and ***** XGBoost ***** models, on the detection of news promotion techniques | ||
| Europarl corpus | 16 | |
| P19-1491 In prior work (Cotterell et al., 2018) we attempted to address this question for language modeling, and observed that recurrent neural network language models do not perform equally well over all the high-resource European languages found in the ***** Europarl corpus *****. | ||
| L10-1283 We use the ***** Europarl corpus ***** and therein concentrate on French verbs. | ||
| Q17-1020 Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the ***** Europarl corpus ***** used in previous work. | ||
| L08-1131 We conducted experiments using ***** Europarl corpus ***** to evaluate our approach. | ||
| L14-1716 It includes different document types produced between 2001 and 2012, excluding only the documents that already exist in the ***** Europarl corpus ***** to avoid overlapping | ||
| deletion | 16 | |
| 2020.readi-1.7 We demonstrate that despite our small parallel corpus, our neural models were able to learn essential features of simplified language, such as lexical substitutions, ***** deletion ***** of less relevant words and phrases, and sentence shortening. | ||
| W18-6455 In addition to including the basic edit operations in TER, namely - insertion, ***** deletion *****, substitution and shift, our metric also allows stem matching, optimizable edit costs and better normalization so as to correlate better with human judgement scores. | ||
| 2021.naacl-main.277 In this paper, we propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and ***** deletion *****, and couples them with a neural paraphrasing model to produce varied rewriting styles. | ||
| P19-1271 Tackling hate speech in the standard way of content ***** deletion ***** or user suspension may be charged with censorship and overblocking. | ||
| N18-1097 We evaluate a variety of local explanation approaches using automatic measures based on word ***** deletion ***** | ||
| decode | 16 | |
| C16-1011 Interlingua based Machine Translation (MT) aims to encode multiple languages into a common linguistic representation and then ***** decode ***** sentences in multiple target languages from this representation. | ||
| 2021.emnlp-main.350 Concretely, we transform a sentence into a variety of different semantic or syntactic representations (including AMR, UD, and latent semantic representation), and then ***** decode ***** the sentence back from the semantic representations. | ||
| Q17-1002 Splitting the fMRI data according to human concreteness ratings, we indeed observe that both models significantly ***** decode ***** the most concrete nouns; however, accuracy is significantly greater using the text-based models for the most abstract nouns. | ||
| P19-1418 To keep the model aware of the underlying grammar in target sequences, many constrained ***** decode *****rs were devised in a multi-stage paradigm, which ***** decode ***** to the sketches or abstract syntax trees first, and then ***** decode ***** to target semantic tokens. | ||
| 2020.findings-emnlp.385 The standard neural machine translation model can only ***** decode ***** with the same depth configuration as training | ||
| conditioning | 16 | |
| D18-1113 We study the automatic generation of syntactic paraphrases using four different models for generation: data-to-text generation, text-to-text generation, text reduction and text expansion, We derive training data for each of these tasks from the WebNLG dataset and we show (i) that ***** conditioning ***** generation on syntactic constraints effectively permits the generation of syntactically distinct paraphrases for the same input and (ii) that exploiting different types of input (data, text or data+text) further increases the number of distinct paraphrases that can be generated for a given input. | ||
| 2021.emnlp-main.122 In a case study, we find that after ***** conditioning ***** on non-contextual word embeddings, properties like part-of-speech are accessible at deeper layers of a network than previously thought. | ||
| D18-1325 The model is integrated in the original NMT architecture as another level of abstraction, ***** conditioning ***** on the NMT model's own previous hidden states. | ||
| 2019.iwslt-1.31 We investigate two methods for biasing the output length with a transformer architecture: i) ***** conditioning ***** the output to a given target-source length-ratio class and ii) enriching the transformer positional embedding with length information. | ||
| P17-1122 We also show that ***** conditioning ***** the generation on topic models makes generated responses more relevant to the document content | ||
| Previously | 16 | |
| N18-1015 ***** Previously ***** proposed models for lyrics generation suffer from the inability of capturing the relationship between lyrics and melody partly due to the unavailability of lyrics-melody aligned data. | ||
| L14-1514 ***** Previously *****, we introduced an inter-disciplinary methodology to enable collecting a large amount of recordings under consistent conditions (Aguiar et al., 2013). | ||
| 2020.lrec-1.357 ***** Previously *****, limited work has been done on dependency grammar for Icelandic. | ||
| W16-4825 ***** Previously ***** the HLTCOE has explored the use of compression-inspired language modeling for language and dialect identification, using news, Wikipedia, blog post, and Twitter corpora. | ||
| D19-1112 ***** Previously *****, researchers have proposed to use language model pre-training and multi-task learning to learn robust representations | ||
| simplifying | 16 | |
| W16-4110 In this paper, we present a comparative analysis of statistically predictive syntactic features of complexity and the treatment of these features by humans when ***** simplifying ***** texts. | ||
| D19-1089 Multilingual neural machine translation (NMT), which translates multiple languages using a single model, is of great practical importance due to its advantages in ***** simplifying ***** the training process, reducing online maintenance costs, and enhancing low-resource and zero-shot translation. | ||
| 2020.aacl-demo.1 This article presents the AMesure platform, which aims to assist writers of French administrative texts in ***** simplifying ***** their writing. | ||
| 2020.coling-main.122 Instead, semi-automated approaches can be used that assist a human writer in ***** simplifying ***** text faster and at a higher quality. | ||
| 2020.ldl-1.5 We describe our approach to ***** simplifying ***** the transformation and hosting of terminological resources in the remainder of this paper | ||
| bounds | 16 | |
| D19-1522 We show that TuckER is a fully expressive model, derive sufficient ***** bounds ***** on its embedding dimensionalities and demonstrate that several previously introduced linear models can be viewed as special cases of TuckER. | ||
| E17-1073 We provide a theoretical analysis of this new algorithm in terms of regret ***** bounds *****, and evaluate it on both synthetic data and NLP classification problems, including text classification and sentiment analysis. | ||
| D18-1292 The present work instead applies depth ***** bounds ***** within a chart-based Bayesian PCFG inducer, where bounding can be switched on and off, and then samples trees with or without bounding. | ||
| W03-3020 The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity ***** bounds ***** for more expressive context-free formalisms intended to express free or multiple word-order, such as ID/LP grammars. | ||
| W17-5019 The field of grammatical error correction (GEC) has made tremendous ***** bounds ***** in the last ten years, but new questions and obstacles are revealing themselves | ||
| AL | 16 | |
| 2014.amta-workshop.1 In addition, ***** AL ***** provides better translation quality than ITP for the same levels of user effort. | ||
| P18-1174 Heuristic-based active learning (***** AL *****) methods are limited when the data distribution of the underlying learning problems vary. | ||
| 2021.naacl-main.159 In this paper, we argue that since ***** AL ***** strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. | ||
| 2021.mtsummit-research.2 Various sampling techniques in active learning (***** AL *****) exist to update the neural MT (NMT) model in the interactive-predictive scenario. | ||
| D18-1318 Over the course of one ***** AL ***** run, an agent annotates its dataset exhausting its labeling budget | ||
| Experimentally | 16 | |
| P19-1313 ***** Experimentally *****, we show that our approach achieves state-of-the-art performance on several commonly-used benchmarks. | ||
| 2021.emnlp-main.264 ***** Experimentally *****, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. | ||
| P18-1104 ***** Experimentally *****, we show in our quantitative and qualitative analyses that the proposed models can successfully generate high-quality abstractive conversation responses in accordance with designated emotions. | ||
| 2021.naacl-main.115 ***** Experimentally *****, we set up a testbed based on four tagging tasks and thirteen datasets. | ||
| N18-2019 ***** Experimentally *****, we show that leveraging these two representations can significantly improve the f-score of a strong bidirectional LSTM baseline model by 10.1% | ||
| syllabification | 16 | |
| L06-1179 We describe the way ***** syllabification ***** was performed and explain how we have constructed the data base. | ||
| W17-2207 The presented ***** syllabification ***** method and soundscape analysis offer themselves as cross-disciplinary tools for low-resource languages. | ||
| L14-1279 Lexical statistics of stems and their ***** syllabification ***** are compiled by us from BOUN corpus of 490 million words. | ||
| L08-1437 An entry of RoSyllabiDict, in both formats, contains information about unsyllabified word, its syllabified correspondent, grammatical information and/or type of ***** syllabification *****, if it is the case. | ||
| L10-1150 Finally, performances are evaluated and compared to 3 other French ***** syllabification ***** systems and show significant improvements | ||
| WASSA | 16 | |
| W18-6209 In this paper we present our approach to tackle the Implicit Emotion Shared Task (IEST) organized as part of ***** WASSA ***** 2018 at EMNLP 2018. | ||
| W18-6227 In this paper, we describe our participation in ***** WASSA ***** 2018 | ||
| W17-5214 In this demo, I will talk about experiments to annotate and detect factual arguments that are linked to human needs/motivations from text and in consequence trigger emotion in the media audience and propose a new task for next year's ***** WASSA *****. | ||
| W18-6231 In this paper, we present neural models submitted to Shared Task on Implicit Emotion Recognition, organized as part of ***** WASSA ***** 2018. | ||
| W17-5236 This paper presents the combined LIPN-UAM participation in the ***** WASSA ***** 2017 Shared Task on Emotion Intensity | ||
| Transformer architectures | 16 | |
| P19-2049 We propose some structural changes to allow scheduled sampling to be applied to ***** Transformer architectures *****, via a two-pass decoding strategy. | ||
| 2020.coling-main.203 Experimental results show that S2S-SLS (with either RNN or ***** Transformer architectures *****) consistently outperforms baselines in various settings, especially when we have limited data. | ||
| 2021.calcs-1.12 ***** Transformer architectures ***** obtain the best results, despite not considering Guarani during pre-training, but traditional machine learning models perform close due to the low-resource nature of the problem. | ||
| 2021.maiworkshop-1.10 Recent progress in natural language processing has led to ***** Transformer architectures ***** becoming the predominant model used for natural language tasks. | ||
| 2020.conll-1.52 This paper investigates various ***** Transformer architectures ***** on the WikiReading Information Extraction and Machine Reading Comprehension dataset | ||
| CITATION | 16 | |
| 2021.wmt-1.18 Our news task submissions for English-German (En-De) and English-Russian (En-Ru) are built on top of a baseline transformer-based sequence-to-sequence model (***** CITATION *****). | ||
| 2020.findings-emnlp.67 To address this problem, we make two design choices: first, we focus on OneCommon Corpus (***** CITATION *****), a simple yet challenging common grounding dataset which contains minimal bias by design. | ||
| 2021.eacl-main.156 (***** CITATION *****) argued for using random splits rather than standard splits in NLP experiments. | ||
| 2020.spnlp-1.11 In particular, we start by considering a noisy channel approach (***** CITATION *****) that combines a target-to-source translation model and a language model. | ||
| 2021.adaptnlp-1.17 In this work, we rely on a fast implementation of C-PCFGs to conduct evaluation complementary to that of (***** CITATION *****) | ||
| Bantu | 16 | |
| L10-1167 These approaches are examined in a unique new context: the African, and in particular, the ***** Bantu ***** languages. | ||
| C18-1223 While this has been solved for the widely-used languages, this is still a challenge for the languages in the ***** Bantu ***** language family. | ||
| 2021.bucc-1.5 We demonstrate the proposed pipeline on two under-resourced agglutinating languages: the Dravidian language Malayalam and the ***** Bantu ***** language isiZulu. | ||
| L04-1251 We explain how the basic software tools of computational morphology are used in linguistic processing, more specifically for automatic word form recognition and morphological tagging of the growing stock of electronic text corpora of a ***** Bantu ***** language such as Zulu. | ||
| 2020.rail-1.4 Setswana language is one of the *****Bantu***** languages written disjunctively. | ||
| Switchboard | 16 | |
| L16-1018 We present examples of some of the tags and concepts with stories from ***** Switchboard *****, and some initial statistics of frequencies of the tags. | ||
| N18-1115 The experimental results on public datasets in the dialog problem (Babi dialog Task 6 and Frame), contextual language model (***** Switchboard ***** and Penn Tree Bank) and question answering (Trec QA) show that our novel CARNN-based architectures outperform previous methods. | ||
| P18-4022 We are able to train state-of-the-art models for translation and end-to-end models for speech recognition and show results on WMT 2017 and ***** Switchboard *****. | ||
| C16-1027 Experiments show that our model achieves the state-of-the-art f-score of 86.7% on the commonly used English ***** Switchboard ***** test set | ||
| 2020.coling-main.508 In response to (i) inconclusive results in the literature as to the properties of coreference chains in written versus spoken language, and (ii) a general lack of work on automatic coreference resolution on both spoken language and social media, we undertake a corpus study involving the various genre sections of Ontonotes, the *****Switchboard***** corpus, and a corpus of Twitter conversations. | ||
| Boolean | 16 | |
| S18-2013 A position paper arguing that purely graphical representations for natural language semantics lack a fundamental degree of expressiveness, and cannot deal with even basic ***** Boolean ***** operations like negation or disjunction. | ||
| D19-6103 We propose a new set of syntactic tasks focused on contradiction detection that require specific capacities over linguistic logical forms such as: ***** Boolean ***** coordination, quantifiers, definite description, and counting operators. | ||
| Q15-1027 We combine the advantages of the two views by inducing a mapping from distributional vectors of words (or sentences) into a ***** Boolean ***** structure of the kind in which natural language terms are assumed to denote. | ||
| 2021.emnlp-main.497 This problem is exacerbated when the questions are unanswerable or when the answers are ***** Boolean *****, since the model cannot rely on lexical overlap to make a connection between the answer and supporting evidence. | ||
| 2021.blackboxnlp-1.19 To obtain human-level interpretability, legacy TM employs ***** Boolean ***** input features such as bag-of-words (BOW) | ||
| GATE | 16 | |
| L08-1249 This paper details the basic processing steps for reported speech analysis and reports on performance of an implementation in form of a ***** GATE ***** resource. | ||
| L16-1587 ***** GATE ***** is a widely used open-source solution for text processing with a large user community. | ||
| L10-1563 In this paper, we explain our method which includes using a set of NLP modules developed using ***** GATE ***** (a General Architecture for Text Engineering), as well as a general purpose editing tool that we built to facilitate the IE rule creation process. | ||
| L10-1633 We developed a ***** GATE ***** resource called the OwlExporter that allows to easily map existing NLP analysis pipelines to OWL ontologies, thereby allowing language engineers to create ontology population systems without requiring extensive knowledge of ontology APIs | ||
| R17-1006 We present ***** GATE ***** DictLemmatizer, a multilingual open source lemmatizer for the GATE NLP framework that currently supports English, German, Italian, French, Dutch, and Spanish, and is easily extensible to other languages. | ||
| Stack | 16 | |
| P19-1211 Evaluation on adapting to/from news articles and ***** Stack ***** Exchange posts indicates that the use of these techniques can boost performance for both unsupervised adaptation as well as fine-tuning with limited target data. | ||
| W18-6122 Specifically, we learn definitions of software entities from a large corpus built from the user forum ***** Stack ***** Overflow. | ||
| W19-3519 By conducting a review of existing work, I show how gender should be explored in multiplicity in computational research through clustering techniques, and layout how this is being achieved in a study in progress on gender hostility on ***** Stack ***** Overflow. | ||
| D19-1262 However, in developer forums such as ***** Stack ***** Overflow, questions cover more diverse tasks including table manipulation or performance issues, where a table is not specified. | ||
| W19-3624 In this paper, we propose a novel online topic tracking framework, named IEDL, for tracking the topic changes related to deep learning techniques on ***** Stack ***** Exchange and automatically interpreting each identified topic. | ||
| permutation | 16 | |
| 1998.amta-papers.28 Following [Wu 1997], we assume that the ***** permutation ***** of binary-branching structures is a sufficient reordering mechanism for MT. | ||
| C16-1204 Existing approaches for evaluating word order in machine translation work with metrics computed directly over a ***** permutation ***** of word positions in system output relative to a reference translation. | ||
| D18-1376 Thus, the prediction of Q-value is invariant to the ***** permutation ***** of the comments, which leads to a more consistent agent behavior. | ||
| C18-1321 In order to detect the number of clusters automatically, concepts of variable length solutions and a vast range of ***** permutation ***** operators are introduced in the clustering process | ||
| 2021.emnlp-main.236 Transformer models are ***** permutation ***** equivariant. | ||
| reusable | 16 | |
| L08-1429 We propose a universal approach that makes the conversion tools ***** reusable *****. | ||
| 2005.mtsummit-osmtw.2 The main objective is the construction of an open, ***** reusable ***** and interoperable framework. | ||
| 2020.lrec-1.500 In addition to ***** reusable ***** tips for handling multilingual syntax, we provide a parallel benchmarking data set for further research. | ||
| N19-1063 Meanwhile, research on learning ***** reusable ***** text representations has begun to explore sentence-level texts, with some sentence encoders seeing enthusiastic adoption. | ||
| 2021.acl-demo.4 Meanwhile, our library maintains sufficient modularity and extensibility by properly decomposing the model architecture, inference, and learning process into highly ***** reusable ***** modules, which allows users to easily incorporate new models into our framework | ||
| constructing | 16 | |
| 2021.acl-long.5 Recent studies ***** constructing ***** direct interactions between the claim and each single user response (a comment or a relevant article) to capture evidence have shown remarkable success in interpretable claim verification. | ||
| L06-1497 In this paper, we propose to use FCA as a tool to help ***** constructing ***** an ontology through an existing lexical base. | ||
| 2020.nlpcss-1.5 Finally, we propose a method for predicting users subsequent topic, and by consequence their emotions, through ***** constructing ***** an Emotion Topic Hidden Markov Model, augmenting emotion transition states with topic information. | ||
| 2020.coling-main.439 We do so by turning the TED-Q dataset into a binary classification task, ***** constructing ***** an analogous task from explicit questions we extract from the BookCorpus (Zhu et al., 2015), and fitting a BERT-based classifier alongside models based on different notions of similarity. | ||
| 2021.naacl-main.243 We consider fine-tuning on auxiliary tasks, ***** constructing ***** a new topic classification task, integrating the topic classification objective directly into topic model training, and continued pre-training | ||
| backend | 16 | |
| 2000.iwpt-1.12 Moreover, SOUP is very efficient, which allows for practically instantaneous ***** backend ***** response. | ||
| W17-4212 In the present prototype they are generated in a ***** backend ***** service using clustering methods that operate on the extracted events. | ||
| 2021.germeval-1.13 Our contribution focuses on a feature-engineering approach with a conventional classification ***** backend *****. | ||
| L06-1169 Spoken dialogue systems are common interfaces to ***** backend ***** data in information retrieval domains. | ||
| D19-3015 We provide a web-based interface for annotation visualization and document ranking, with a modular ***** backend ***** to support interoperability with existing annotation tools | ||
| abstractness | 16 | |
| 2020.conll-1.17 BERT's token level knowledge also allows the testing of a type-level hypothesis about lexical ***** abstractness *****, demonstrating the relationship between token-level phenomena and type-level concreteness ratings. | ||
| L16-1193 Our results support the tested hypothesis, namely that agentivity and ***** abstractness ***** influence lexical aspect. | ||
| 2020.readi-1.6 Literature in psycholinguistics and neurosciences has showed that abstract and concrete concepts are perceived differently by our brain, and that the ***** abstractness ***** of a word can cause difficulties in reading. | ||
| W17-1903 Our contribution to this topic are as follows: i) we compare supervised techniques to learn and extend ***** abstractness ***** ratings for huge vocabularies ii) we learn and investigate norms for larger units by propagating ***** abstractness ***** to verb-noun pairs which lead to better metaphor detection iii) we overcome the limitation of learning a single rating per word and show that multi-sense ***** abstractness ***** ratings are potentially useful for metaphor detection. | ||
| W19-0506 At the same time, the studies reveal empirical evidence why contextual ***** abstractness ***** represents a valuable indicator for automatic non-literal language identification | ||
| undesired | 16 | |
| 2021.wnut-1.37 In this paper, we analyze how a controlled amount of desired and ***** undesired ***** text alterations impacts performance of BERT. | ||
| 2021.ranlp-1.108 Labels are provided by a multi-label CamemBERT classifier trained and checked on a manually annotated subset of the corpus, while the tweets are selected to avoid ***** undesired ***** biases. | ||
| 2020.emnlp-demos.12 The system also provides a revision module which enables users to revise ***** undesired ***** sentences or words of lyrics repeatedly. | ||
| 2020.acl-main.110 Recently research has started focusing on avoiding ***** undesired ***** effects that come with content moderation, such as censorship and overblocking, when dealing with hatred online. | ||
| W18-4421 Given the massive information overload on the Web, there is an imperious need to develop intelligent techniques to automatically detect harmful content, which would allow the large-scale social media monitoring and early detection of ***** undesired ***** situations | ||
| recency | 16 | |
| 2020.wosp-1.5 In this paper, we propose a modification of TF-IDF and other term-weighting schemes that weighs the terms based on the ***** recency ***** and the usage in the corpus. | ||
| N18-1124 However, the target-side context is solely based on the sequence model which, in practice, is prone to a ***** recency ***** bias and lacks the ability to capture effectively non-sequential dependencies among words. | ||
| 2021.cmcl-1.20 In particular, a lossy context model that assumes prior context to be affected by predictability and ***** recency ***** captures the distribution of the predicted verb class and error sources best. | ||
| N19-1356 Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a ***** recency ***** bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order. | ||
| N19-1073 The following serial recall effects are generally investigated in studies with humans: word length and frequency, primacy and ***** recency *****, semantic confusion, repetition, and transposition effects | ||
| TensorFlow | 16 | |
| 2021.emnlp-demo.24 The library is fully open source, and compatible with both PyTorch and ***** TensorFlow *****, which allows existing neural network layers to be replaced with or transformed into boxes easily. | ||
| 2021.emnlp-main.160 Experimental results show that our method is 8.2x faster than HuggingFace Tokenizers and 5.1x faster than ***** TensorFlow ***** Text on average for general text tokenization. | ||
| P19-3027 Texar supports both ***** TensorFlow ***** and PyTorch, and is released under Apache License 2.0 at https://www.texar.io. | ||
| D19-3029 Besides, OpenNRE provides various functional RE modules based on both ***** TensorFlow ***** and PyTorch to maintain sufficient modularity and extensibility, making it becomes easy to incorporate new models into the framework. | ||
| C18-1030 This paper presents the results of a reproduction study and analysis of this technique using only openly available datasets (GigaWord, SemCor, OMSTI) and software (***** TensorFlow *****) | ||
| SCAN | 16 | |
| D19-1438 In the ***** SCAN ***** domain, it boosts accuracies from 14.0% to 98.8% in Jump task, and from 92.0% to 99.7% in TurnLeft task. | ||
| 2020.emnlp-main.447 On tasks that require strong compositional generalization such as ***** SCAN ***** and semantic parsing, SeqMix also offers further improvements. | ||
| W18-5407 Here, we take a closer look at ***** SCAN ***** and show that it does not always capture the kind of generalization that it was designed for. | ||
| 2020.findings-emnlp.208 We also present experimental results on German-English translation on the Multi30k dataset, and qualitatively analyse the induced tree structures our model learns for the ***** SCAN ***** tasks and the German-English translation task. | ||
| 2021.ranlp-1.11 We address the compositionality challenge presented by the ***** SCAN ***** benchmark | ||
| gestural | 16 | |
| L10-1245 In this paper, we discuss the theoretical, sociolinguistic, methodological and technical objectives and issues of the French Creagest Project (2007-2012) in setting up, documenting and annotating a large corpus of adult and child French Sign Language (LSF) and of natural ***** gestural ***** language. | ||
| L16-1552 We present a corpus of 44 human-agent verbal and ***** gestural ***** story retellings designed to explore whether humans would ***** gestural *****ly entrain to an embodied intelligent virtual agent. | ||
| L16-1345 In this paper we present a multimodal database of affect bursts, which are very short non-verbal expressions with facial, vocal, and ***** gestural ***** components that are highly synchronized and triggered by an identifiable event. | ||
| L16-1235 As part of a human-robot interaction project, we are interested by ***** gestural ***** modality as one of many ways to communicate. | ||
| L14-1013 The Active Listening Corpus (ALICO) is a multimodal database of spontaneous dyadic conversations with diverse speech and ***** gestural ***** annotations of both dialogue partners. | ||
| offline | 16 | |
| 2021.naacl-industry.5 The proposed finetuning technique based on a small amount of high-quality, annotated data resulted in 26% ***** offline ***** and 33% online performance improvement in Recall@1 over the pretrained model. | ||
| W16-4105 While the online models give us a better understanding of the cognitive correlates of reading with text complexity and language proficiency, modeling of the ***** offline ***** measures can be particularly relevant for incorporating user aspects into readability models. | ||
| 2021.acl-demo.34 We not only released an online platform at the website but also make our evaluation tool an API with MIT Licence at Github and PyPi that allows users to conveniently assess their models ***** offline *****. | ||
| 2020.bionlp-1.16 Research on analyzing reading patterns of dyslectic children has mainly been driven by classifying dyslexia types ***** offline *****. | ||
| 2021.emnlp-main.537 To illustrate this argument, we propose an interpretation test set and conduct a realistic evaluation of SiMT trained on ***** offline ***** translations | ||
| subcorpus | 16 | |
| L14-1014 93%, as measured on the one million ***** subcorpus ***** of the National Corpus of Polish (NCP). | ||
| L16-1561 Apart from the pairwise aligned documents, a fully aligned ***** subcorpus ***** for the six official UN languages is distributed. | ||
| 2020.lrec-1.416 A frequency dictionary provides much sought after information about word frequency statistics, computed for each ***** subcorpus ***** as well as aggregate, disambiguating homographs based on their respective lemmas and morphosyntactic tags. | ||
| L16-1513 Consisting of two subcorpora, i.e. the clinical ***** subcorpus *****, consisting of written texts produced by speakers with various types of language disorders, and the healthy speakers ***** subcorpus *****, as well as by the levels of its annotation, it offers an opportunity for different lines of research. | ||
| L16-1031 Depending on the ***** subcorpus *****, learner texts may contain additional information, such as text genres, topics, grades | ||
| QE | 16 | |
| 2020.emnlp-main.205 In order to make use of different types of human evaluation data for supervised learning, we present a multi-task learning ***** QE ***** model that jointly learns two tasks: score a translation and rank two translations. | ||
| 2020.bionlp-1.2 Our sequence-to-set modeling approach to predict semantic tags, gives to the best of our knowledge, the state-of-the-art for both, an unsupervised query expansion (***** QE *****) task for the TREC CDS 2016 challenge dataset when evaluated on an Okapi BM25–based document retrieval system; and also over the MLTM system baseline baseline (Soleimani and Miller, 2016), for both supervised and semi-supervised multi-label prediction tasks on the del.icio.us and Ohsumed datasets. | ||
| 2021.emnlp-main.267 Experimental results show that our method outperforms previous unsupervised methods on several ***** QE ***** tasks in different language pairs and domains. | ||
| 2021.emnlp-main.799 Achieving accurate automatic word-level ***** QE ***** is very hard, and it is currently not known (i) at what quality threshold ***** QE ***** is actually beginning to be useful for human PE, and (ii), how to best present word-level ***** QE ***** information to translators. | ||
| 2021.wmt-1.67 Thus, in order to be useful, ***** QE ***** systems should be able to detect such errors | ||
| obtain | 16 | |
| L14-1573 This lexicon and language model were used in an Automatic Speech Recognition system for the Luxembourgish language which ***** obtain ***** a 25% WER on the Quaero development data. | ||
| D17-1300 We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which ***** obtain ***** a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. | ||
| D18-1121 Most existing models ***** obtain ***** training data using distant supervision, and inevitably suffer from the problem of noisy labels. | ||
| L08-1503 Standard GMM speaker recognition techniques ***** obtain ***** an overall correct classification rate of 82% | ||
| 2020.restup-1.4 It is not only that manual annotation is expensive, it is also the case that the phenomenon is sparse, causing human annotators having to go through a large number of irrelevant examples in order to ***** obtain ***** some significant data. | ||
| humorous | 16 | |
| N19-1217 A pun is a form of wordplay for an intended ***** humorous ***** or rhetorical effect, where a word suggests two or more meanings by exploiting polysemy (homographic pun) or phonological similarity to another word (heterographic pun). | ||
| D17-1051 We propose a generative language model, based on the theory of incongruity, to model ***** humorous ***** text, which allows us to leverage background text sources, such as Wikipedia entry descriptions, and enables construction of multiple features for identifying ***** humorous ***** reviews. | ||
| 2021.semeval-1.34 0 means the task is not ***** humorous ***** or not offensive, 5 means the task is very ***** humorous ***** or very offensive. | ||
| S17-2005 A pun is a form of wordplay in which a word suggests two or more meanings by exploiting polysemy, homonymy, or phonological similarity to another word, for an intended ***** humorous ***** or rhetorical effect. | ||
| 2021.semeval-1.170 To address these challenges SemEval-2021 introduced a HaHackathon task focusing on detecting and rating ***** humorous ***** and offensive texts | ||
| register | 16 | |
| W19-8706 We use a range of morpho-syntactic features inspired by research in ***** register ***** studies (e.g. Biber, 1995; Neumann, 2013) and translation studies (e.g. Ilisei et al., 2010; Zanettin, 2013; Kunilovskaya and Kutuzov, 2018) to reveal the association between translationese and human translation quality. | ||
| 2020.lrec-1.875 This system enables querying of multiple corpora by certain categories, such as ***** register ***** type and period. | ||
| 2021.eacl-srw.24 We explore cross-lingual transfer of ***** register ***** classification for web documents. | ||
| C16-1072 Previous linguistic research on scientific writing has shown that language use in the scientific domain varies considerably in ***** register ***** and style over time | ||
| 2021.nodalida-main.16 This article studies ***** register ***** classification of documents from the unrestricted web, such as news articles or opinion blogs, in a multilingual setting, exploring both the benefit of training on multiple languages and the capabilities for zero-shot cross-lingual transfer. | ||
| Toponym | 16 | |
| S19-2229 ***** Toponym ***** resolution is an important and challenging task in the neural language processing field, and has wide applications such as emergency response and social media geographical event analysis. | ||
| 2020.coling-main.58 We created a family of ***** Toponym ***** Identification Models based on BERT (TIMBERT), in order to learn in an end-to-end fashion the mapping from an input sentence to the associated sentence labeled with toponyms. | ||
| S19-2228 This article describes the system submitted by the RGCL-WLV team to the SemEval 2019 Task 12: ***** Toponym ***** resolution in scientific papers. | ||
| S19-2230 We focus on Subtask 1: ***** Toponym ***** Detection which is the identification of spans of text for place names mentioned in a document | ||
| W19-2607 ***** Toponym ***** detection in scientific papers is an open task and a key first step in place entity enrichment of documents. | ||
| lingual | 16 | |
| L12-1106 In the future, we are planning to carry out a syntactic annotation of the HunOr corpus, which will further enhance the usability of the corpus in various NLP fields such as transfer-based machine translation or cross ***** lingual ***** information retrieval. | ||
| W19-3622 Fixing the grammatical gender bias results in a positive effect on the quality of the resulting word embeddings, both in mono***** lingual ***** and cross ***** lingual ***** settings. | ||
| S19-2015 We chose a standard neural tagger and we focus on our recursive parsing strategy and on the cross ***** lingual ***** transfer problem to develop a robust model for the French language, using only few training samples | ||
| D19-6121 Historically, g2p systems were transition- or rule-based, making generalization beyond a mono***** lingual ***** (high resource) domain impractical | ||
| 2021.emnlp-main.499 With the widespread adoption of conversational agents and chat platforms, code-switching has become an integral part of written conversations in many multi***** lingual ***** communities worldwide. | ||
| noising | 16 | |
| 2020.wmt-1.83 In our experiment, we implemented a ***** noising ***** module that simulates four types of post-editing errors, and we introduced this module into a Transformer-based multi-source APE model. | ||
| W19-5206 We propose a simpler alternative to ***** noising ***** techniques, consisting of tagging back-translated source sentences with an extra token. | ||
| 2020.acl-main.138 To this end, we formulate the noisy sequence labeling problem, where the input may undergo an unknown ***** noising ***** process and propose two Noise-Aware Training (NAT) objectives that improve robustness of sequence labeling performed on perturbed input: Our data augmentation method trains a neural model using a mixture of clean and noisy samples, whereas our stability training algorithm encourages the model to create a noise-invariant latent representation. | ||
| 2020.acl-main.703 We evaluate a number of ***** noising ***** approaches, finding the best performance by both randomly shuffling the order of sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. | ||
| 2021.naacl-main.434 We explore a two-stage and a gradual schedule, and find that, compared with standard single-stage training, curriculum data augmentation trains faster, improves performance, and remains robust to high amounts of ***** noising ***** from augmentation | ||
| Hebrew | 16 | |
| L14-1572 The final task was to create an ontology for a language with far fewer resources (***** Hebrew *****). | ||
| K18-2021 Our results on ***** Hebrew ***** underscore the importance of CoNLL-UL, a UD-compatible standard for accessing external lexical resources, for enhancing end-to-end UD parsing, in particular for morphologically rich and low-resource languages. | ||
| L10-1112 We introduce a dedicated transcription scheme for the spoken ***** Hebrew ***** data that is aware both of the phonology and of the standard orthography of the language. | ||
| 2020.lrec-1.727 We present a semantic role labeling resource for ***** Hebrew ***** built semi-automatically through annotation projection from English. | ||
| 2020.findings-emnlp.442 We focus on the ***** Hebrew ***** language as a test case due to the unusual regularity of its noun formation | ||
| overlapping | 16 | |
| L16-1344 In a final set of experiments, it presents some facts about ***** overlapping ***** laughs. | ||
| D18-1019 In this work, we propose a novel segmental hypergraph representation to model ***** overlapping ***** entity mentions that are prevalent in many practical datasets. | ||
| 2020.bea-1.5 The CIMA collection, which we make publicly available, is novel in that students are exposed to ***** overlapping ***** grounded concepts between exercises and multiple relevant tutoring responses are collected for the same input. | ||
| D17-1276 We present some theoretical analysis on the differences between our model and a recently proposed model for recognizing ***** overlapping ***** mentions, and discuss the possible implications of the differences. | ||
| 2014.iwslt-papers.12 We show how to extract ***** overlapping ***** phrases offline for hierarchical phrasebased SMT, and how to extract features and tune weights for the new phrases | ||
| capsule | 16 | |
| D18-1350 Capsule networks achieve state of the art on 4 out of 6 datasets, which shows the effectiveness of ***** capsule ***** networks for text classification. | ||
| D19-1074 To the best of our knowledge, this is the first work that ***** capsule ***** networks have been empirically investigated for sequence to sequence problems. | ||
| P19-2045 Our results confirm the hypothesis that ***** capsule ***** networks are especially advantageous for rare events and structurally diverse categories, which we attribute to their ability to combine latent encoded information. | ||
| P19-1150 Obstacles hindering the development of ***** capsule ***** networks for challenging NLP applications include poor scalability to large output spaces and less reliable routing processes. | ||
| N19-1226 In this paper, we introduce an embedding model, named CapsE, exploring a ***** capsule ***** network to model relationship triples (subject, relation, object). | ||
| preferences | 16 | |
| D18-1006 In this paper, we show how the predicted effects of actions in the context of a paragraph can be improved in two ways: (1) by incorporating global, commonsense constraints (e.g., a non-existent entity cannot be destroyed), and (2) by biasing reading with ***** preferences ***** from large-scale corpora (e.g., trees rarely move). | ||
| D18-1445 The merit of preference-based interactive summarisation is that ***** preferences ***** are easier for users to provide than reference summaries. | ||
| 2021.cmcl-1.4 We test two variants of the MPT-based model on experimental data from English and Turkish and demonstrate that our method can provide deeper insight into the processes underlying participants' answering behavior and their interpretation ***** preferences ***** than an analysis based on raw percentages. | ||
| D18-1394 Gender was associated with ***** preferences ***** for different continuous sentiment trajectories. | ||
| P18-2112 In addition, the learned aspect-aware representations discover those aspects that users are more inclined to discuss and bias the generated text toward their personalized aspect ***** preferences ***** | ||
| Active | 16 | |
| L10-1443 The data reveals that ***** Active ***** Learning keeps its competitive advantage over random sampling in both scenarios though the difference is less marked for the time metric than for the token metric. | ||
| L10-1554 Our case study on 1,000 manually annotated instances of the German verb ”“drohen”” (threaten) shows that the best performance is not obtained when training on the full data set, but by carefully selecting new training instances with regard to their informativeness for the learning process (***** Active ***** Learning) | ||
| D18-1165 ***** Active ***** learning identifies data points to label that are expected to be the most useful in improving a supervised model. | ||
| 2021.iwcs-1.20 ***** Active ***** learning has been shown to reduce annotation requirements for numerous natural language processing tasks, including semantic role labeling (SRL). | ||
| D17-1063 ***** Active ***** learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. | ||
| recipe | 16 | |
| 2020.lrec-1.527 In this paper, we provide a dataset that gives visual grounding annotations to ***** recipe ***** flow graphs. | ||
| P18-1221 Experimental results show that our model achieves 52.2% better perplexity in ***** recipe ***** generation and 22.06% on code generation than state-of-the-art language models. | ||
| W16-4603 The recent growth in ***** recipe ***** sharing websites and food blogs has resulted in numerous ***** recipe ***** texts being available for diverse foods in various languages. | ||
| D19-1613 Existing approaches to ***** recipe ***** generation are unable to create ***** recipe *****s for users with culinary preferences but incomplete knowledge of ingredients in specific dishes | ||
| 2021.eacl-srw.20 However, there is a large number of new online ***** recipe *****s generated daily with a large number of users reviews, with recommendations to improve the ***** recipe ***** flavor and ideas to modify them. | ||
| partial | 16 | |
| Q15-1039 In this paper, we combine ideas from imitation learning and agenda-based parsing to train a semantic parser that searches ***** partial ***** logical forms in a more strategic order. | ||
| N19-1384 Our empirical evaluation points out that our ***** partial ***** translations can be used in combination with back-translation to further improve NMT models. | ||
| 2020.emnlp-main.26 We investigate how they behave under incremental interfaces, when ***** partial ***** output must be provided based on ***** partial ***** input seen up to a certain time step, which may happen in interactive systems. | ||
| 1999.mtsummit-1.74 It describes how units of translation are defined, ***** partial ***** translation is derived and composed into a whole. | ||
| P18-2052 We empirically investigate learning from ***** partial ***** feedback in neural machine translation (NMT), when ***** partial ***** feedback is collected by asking users to highlight a correct chunk of a translation | ||
| unsupervised lexical | 16 | |
| 2020.lrec-1.881 The tool was already used to build a manually annotated corpus with semantic frames and its arguments for task 2 of SemEval 2019 regarding ***** unsupervised lexical ***** frame induction (QasemiZadeh et al., 2019). | ||
| L12-1174 We conclude with an example on how SMALLWorlds may be used: ***** unsupervised lexical ***** learning from phonetic transcription. | ||
| 2020.semeval-1.23 We present a system for the task of ***** unsupervised lexical ***** change detection: given a target word and two corpora spanning different periods of time, automatically detects whether the word has lost or gained senses from one corpus to another. | ||
| L16-1102 This benchmark enables the evaluation of parser robustness as well as text normalization methods, including normalization as machine translation and ***** unsupervised lexical ***** normalization, directly on syntactic trees. | ||
| S18-2016 We present a method for ***** unsupervised lexical ***** frame acquisition at the syntax–semantics interface. | ||
| discourse structures | 16 | |
| 2021.acl-long.499 We conjecture that this is because of the difficulty for the decoder to capture the high-level semantics and ***** discourse structures ***** in the context beyond token-level co-occurrence. | ||
| 2020.lrec-1.149 We characterize the performance of various classification algorithms on this dataset and perform ablation studies to understand the nature of the linguistic models suitable for capturing the nuances of the embedded ***** discourse structures ***** in the presented corpus. | ||
| L12-1362 We focus on the comparison of two particular text types, interviews and popular science texts, as instances of spoken and written texts since they display quite different ***** discourse structures *****. | ||
| 2021.eacl-main.218 We apply richly contextualized deep representation learning pre-trained on biomedical domain corpus to the analysis of scientific ***** discourse structures ***** and the extraction of “evidence fragments” (i.e., the text in the results section describing data presented in a specified subfigure) from a set of biomedical experimental research articles. | ||
| W17-4810 As unaligned ***** discourse structures ***** may also result in the loss of discourse information in the MT training data, we hope to deliver information in support of discourse-aware machine translation (MT). | ||
| google | 16 | |
| 2020.lrec-1.685 Authors: Mohamed Abdellatif and Ahmed Elgammal Gitlab URL: https://gitlab.com/abdollatif/lrec_app Commit hash: 3f20b2ddb96d8c865e5f56f5566edf371214785f Tag name: Splits2 Dataset file md5: 5aee3dac5e48d1ac3d279083212734c9 Dataset URL: https://drive.***** google *****.com/file/d/1cv5HuQhgFVizupFI40dzreemS2gMM498/view?usp=sharing | ||
| L10-1013 ScriptTranscriber is available as part of the nltk contrib source tree at http://code.***** google *****.com/p/nltk/. | ||
| 2021.acl-long.549 The dataset is publicly available at https://github.com/***** google *****-research-datasets/timedial. | ||
| 2021.eacl-demos.27 We demonstrate our proposed method as a chrome plugin on ***** google ***** search. | ||
| D17-3005 Few examples include language modeling, question answering, visual question answering, and dialogue systems. For updated information and material, please refer to our tutorial website: https://sites.***** google *****.com/view/mann-emnlp2017/. | ||
| multilingual natural language | 16 | |
| 2021.acl-demo.8 With more than 7000 languages worldwide, ***** multilingual natural language ***** processing (NLP) is essential both from an academic and commercial perspective. | ||
| L10-1019 Parallel corpora are indispensable resources for a variety of ***** multilingual natural language ***** processing tasks. | ||
| K19-1020 Not only is it a key task in crosslingual natural language understanding (XLU), it is also particularly useful for identifying parallel resources for training and evaluating downstream ***** multilingual natural language ***** processing (NLP) applications, such as machine translation. | ||
| C16-1142 This can be beneficial for a series of ***** multilingual natural language ***** processing (NLP) tasks. | ||
| L06-1044 We describe SProUTomat, a tool for daily building, testing and evaluating a complex general-purpose ***** multilingual natural language ***** text processor including its linguistic resources (lingware). | ||
| automatic classification | 16 | |
| W17-0813 The resource will consist of a lexicon that describes constructions that trigger causality as well as the participants of the causal event, and will be augmented by a corpus with annotated instances for each entry, that can be used as training data to develop a system for ***** automatic classification ***** of causal relations. | ||
| C16-1051 In this work, we propose an ***** automatic classification ***** tool to predict the matching category for a given product title and description. | ||
| W16-3915 We test our ***** automatic classification ***** system with four categories: politics, economy, sports and the medical field. | ||
| 2020.smm4h-1.31 Task 1 targets the ***** automatic classification ***** of tweets that mention medication. | ||
| U18-1012 We present methods for the ***** automatic classification ***** of patent applications using an annotated dataset provided by the organizers of the ALTA 2018 shared task - Classifying Patent Applications. | ||
| typed feature structures | 16 | |
| 1995.iwpt-1.33 We emphasize the notion of abstract ***** typed feature structures ***** (AFSs) that encode the essential information of TFSs and define unification over AFSs rather than over TFSs. | ||
| 1993.eamt-1.16 The approach advocated is corpus-based, computationally supported, and aimed at the construction of parallel monolingual dictionary fragments which can be linked to form translation dictionaries without many problems. The parallelism of the monolingual fragments is achieved through the use of a shared inventory of descriptive devices, one common representation formalism (***** typed feature structures *****) for linguistic information from all levels, as well as a working methodology inspired by onomasiology: treating all elements of a given lexical semantic field consistently with common descriptive devices at the same time. It is claimed that such monolingual dictionaries are particularly easy to relate in a machine translation application. | ||
| L10-1538 This is the reason why we propose a preliminary formal annotation model, represented with ***** typed feature structures *****. | ||
| 1991.iwpt-1.17 Using ***** typed feature structures ***** with multiple inheritance for our linguistic representations and definite attribute-value logic clauses to express constraints, we will develop the bare essentials required for an implementation of a parser and generator for the Head-driven Phrase Structure Grammar (HPSG) formalism of Pollard and Sag (1987). | ||
| 2020.isa-1.3 The paper presents an annotation schema with the following characteristics: it is formally compact; it systematically and compositionally expands into full-fledged analytic representations, exploiting simple algorithms of ***** typed feature structures *****; its representation of various dimensions of semantic content is systematically integrated with morpho-syntactic and lexical representation; it is integrated with a `deep' parsing grammar. | ||
| instructions | 16 | |
| D19-1218 We build a game environment to study this scenario, and learn to map user ***** instructions ***** to system actions. | ||
| 2020.lrec-1.702 Do these edits improve ***** instructions ***** only in terms of style and correctness, or do they provide clarifications necessary to follow the ***** instructions ***** and to accomplish the goal? | ||
| 2020.coling-main.3 However, some ingredients occur frequently within the ***** instructions ***** while most occur rarely. | ||
| 2020.acl-main.644 This “point anywhere” approach leads to more linguistically complex ***** instructions *****, as shown in our analyses. | ||
| L12-1094 We introduce a generic semantic representation of procedures for analysing ***** instructions *****, using which natural language techniques are applied to automatically extract structured procedures from ***** instructions *****. | ||
| semantic processing | 16 | |
| 2020.wildre-1.11 There are three goals of SATS - to make manuscript summaries, to enrich the ***** semantic processing ***** of Sanskrit, and to improve the information retrieval systems in the language. | ||
| L12-1542 Together with the ontology, the WordNet mappings provide a extremely rich and powerful basis for ***** semantic processing ***** of text in any domain. | ||
| W18-1304 This paper describes the first version of an open-source semantic parser that creates graphical representations of sentences to be used for further ***** semantic processing *****, e.g. | ||
| W03-3021 We observe that the MM does not always hold for tree-bank models, and that optimizing weak metrics is not interesting for ***** semantic processing *****. | ||
| D17-1113 Research in computational semantics is increasingly guided by our understanding of human ***** semantic processing *****. | ||
| biomedical translation shared | 16 | |
| 2020.wmt-1.95 This paper describes the machine translation systems developed by the University of Sheffield (UoS) team for the ***** biomedical translation shared ***** task of WMT20. | ||
| W19-5422 This paper describes the machine translation systems developed by the Barcelona Supercomputing (BSC) team for the ***** biomedical translation shared ***** task of WMT19. | ||
| 2020.wmt-1.93 This paper describes Huawei's submissions to the WMT20 ***** biomedical translation shared ***** task. | ||
| W19-5420 This paper describes Huawei's neural machine translation systems for the WMT 2019 ***** biomedical translation shared ***** task. | ||
| W18-6448 This paper describes the machine translation systems developed by the Universidade Federal do Rio Grande do Sul (UFRGS) team for the ***** biomedical translation shared ***** task. | ||
| recognition systems | 16 | |
| L04-1263 These databases were used to train and test speech ***** recognition systems ***** applied in a multilingual telephone-based prototype hotel booking system. | ||
| 2012.iwslt-papers.15 From oracle experiments, we show an upper bound of translation quality if we had human-generated segmentation and punctuation on the output stream of speech ***** recognition systems *****. | ||
| P19-2010 We will focus on an intelligibility analysis based on automatic speech ***** recognition systems ***** trained on these three languages. | ||
| 2021.cl-1.5 Abstract Named entity ***** recognition systems ***** achieve remarkable performance on domains such as English news. | ||
| W16-4712 The main purpose of the dictionary is to improve the recall of phenotype concept ***** recognition systems *****. | ||
| collocation extraction | 16 | |
| L10-1251 Our web services take for example corpora of several million words as an input on which they perform preprocessing, such as tokenisation, tagging, lemmatisation and parsing, and corpus exploration, such as ***** collocation extraction ***** and corpus comparison. | ||
| W19-5112 The study is based on the electronic dictionary DWDS (Klein and Geyken, 2010) and uses the ***** collocation extraction ***** tool Wortprofil (Geyken et al., 2009). | ||
| 2020.mwe-1.13 Firstly, the ***** collocation extraction ***** module was evaluated by a corpus with manually annotated collocations. | ||
| W17-1703 This paper presents a new strategy for multilingual ***** collocation extraction ***** which takes advantage of parallel corpora to learn bilingual word-embeddings. | ||
| L12-1470 We approach ***** collocation extraction ***** as a classification problem where the task is to classify a given n-gram as either a collocation (positive) or a non-collocation (negative). | ||
| neural coreference resolution | 16 | |
| 2021.naacl-main.125 External syntactic and semantic information has been largely ignored by existing ***** neural coreference resolution ***** models. | ||
| D19-5727 We have employed an existing span-based state-of-the-art ***** neural coreference resolution ***** system as a baseline system. | ||
| W18-2324 In this paper, we investigate the utility of the state-of-the-art general domain ***** neural coreference resolution ***** system on biomedical texts. | ||
| 2021.codi-sharedtask.6 We present an effective system adapted from the end-to-end ***** neural coreference resolution ***** model, targeting on the task of anaphora resolution in dialogues. | ||
| W19-2806 To begin with, we build the first ***** neural coreference resolution ***** system for Basque, training it with the relatively small EPEC-KORREF corpus (45,000 words). | ||
| neural generation | 16 | |
| D19-1197 Since the paired data now is no longer enough to train a ***** neural generation ***** model, we consider leveraging the large scale of unpaired data that are much easier to obtain, and propose response generation with both paired and unpaired data. | ||
| D18-1356 This work proposes a ***** neural generation ***** system using a hidden semi-markov model (HSMM) decoder, which learns latent, discrete templates jointly with learning to generate. | ||
| W19-2308 This work focuses on designing a symbolic intermediate representation to be used in multi-stage ***** neural generation ***** with the intention of reducing the frequency of failed outputs. | ||
| D19-1055 Recent all-in-one style ***** neural generation ***** models have made impressive progress, yet they often produce outputs that are incoherent and unfaithful to the input. | ||
| W18-5019 To date, work on task-oriented ***** neural generation ***** has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. | ||
| sentence structure | 16 | |
| 2021.isa-1.3 Literary texts feature a rich variety in expressing quantification, including a broad range of lexemes to express quantifiers and complex ***** sentence structure *****s to express the restrictor and the nuclear scope of a quantification. | ||
| 2021.ranlp-1.175 Traditional methods only consider the sequence information of a sentence while ignoring the rich ***** sentence structure ***** and word relationship information. | ||
| 2021.naacl-main.171 positive to negative), which enable controllability at a high level but do not offer fine-grained control involving ***** sentence structure *****, emphasis, and content of the sentence. | ||
| L14-1538 FSTs/FSA are also used for analysing corpora in order to retrieve recursive ***** sentence structure *****s, in which combinatorial and semantic constraints identify properties and denote relationship. | ||
| 1963.earlymt-1.21 A system for automatically producing a ***** sentence structure ***** diagram for each analysis of a given sentence has been added to the program of the multiple-path syntactic analyzer. | ||
| short answer | 16 | |
| 2020.coling-main.76 Automatic content scoring systems are widely used on ***** short answer ***** tasks to save human effort. | ||
| 2020.lrec-1.321 In this paper, we introduce AR-ASAG, an Arabic Dataset for automatic ***** short answer ***** grading. | ||
| W17-5017 The inputs to neural essay scoring models – ngrams and embeddings – are arguably well-suited to evaluate content in ***** short answer ***** scoring tasks. | ||
| Q19-1026 An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a ***** short answer ***** (one or more entities) if present on the page, or marks null if no long/***** short answer ***** is present. | ||
| 2020.winlp-1.8 So far different research works have been conducted to achieve ***** short answer ***** questions. | ||
| partial parsing | 16 | |
| L06-1343 In this paper, we describe the second release of a suite of language analysers, developed over the last five years, called wraetlic, which includes tools for several ***** partial parsing ***** tasks, both for English and Spanish. | ||
| L08-1026 In this paper we propose a ***** partial parsing ***** model which achieves robust parsing with a large HPSG grammar. | ||
| L10-1380 This ***** partial parsing ***** is based on a preprocessing stage of the spoken data that consists in reformatting and tagging utterances that break the syntactic structure of the text, such as disfluencies. | ||
| W18-5409 A recent model instead used ***** partial parsing ***** as an auxiliary task in sequential neural network architectures to inject syntactic information. | ||
| L10-1577 In this paper, we present the results of an experiment with utilizing a stochastic morphosyntactic tagger as a pre-processing module of a rule-based chunker and partial parser for Croatian in order to raise its overall chunking and ***** partial parsing ***** accuracy on Croatian texts. | ||
| language comprehension | 16 | |
| 2021.emnlp-main.74 While its implications on language production have been well explored, the hypothesis potentially makes predictions about ***** language comprehension ***** and linguistic acceptability as well. | ||
| W19-3410 Therefore, script knowledge is a central component to ***** language comprehension *****. | ||
| W19-2906 Processing difficulty in online ***** language comprehension ***** has been explained in terms of surprisal and entropy reduction. | ||
| D19-1594 This finding is consistent with the idea that humans adapt syntactic expectations to particular genres during ***** language comprehension ***** (Kaan and Chun, 2018; Branigan and Pickering, 2017). | ||
| 2021.cmcl-1.9 Eye movement data during reading is a useful source of information for understanding ***** language comprehension ***** processes. | ||
| deep learning based | 16 | |
| W18-6247 Our inference results indicate the feasibility of using ***** deep learning based ***** verbal content representation in inferring hirability scores from online conversational video resumes. | ||
| 2021.emnlp-demo.25 The task of lexical normalisation aims to standardise such corpora, but currently lacks suitable tools to acquire high-quality annotated data to support ***** deep learning based ***** approaches. | ||
| N19-1185 Results of extensive experiments on single and multi-target stance detection datasets show that our proposed method achieves substantial improvement over the current state-of-the-art ***** deep learning based ***** methods. | ||
| C18-1181 Recently, many ***** deep learning based ***** methods have been proposed for the task. | ||
| C16-1135 In this paper, we propose a novel approach for AQP known as - “Deep Feature Fusion Network (DFFN)” which combines the advantages of both hand-crafted features and ***** deep learning based ***** systems. | ||
| video captioning | 16 | |
| 2020.findings-emnlp.98 In the proposed study, we make the first attempt to train the ***** video captioning ***** model on labeled data and unlabeled data jointly, in a semi-supervised learning manner. | ||
| 2021.naacl-main.193 Comprehensive experiments on three video-and-language tasks (text-to-video retrieval, ***** video captioning *****, and video question answering) across five datasets demonstrate that our approach outperforms previous state-of-the-art methods. | ||
| 2020.emnlp-main.61 Observable changes such as movements, manipulations, and transformations of the objects in the scene, are reflected in conventional ***** video captioning *****. | ||
| 2020.aacl-main.48 First, we construct and release a new dense ***** video captioning ***** dataset, Video Timeline Tags (ViTT), featuring a variety of instructional videos together with time-stamped annotations. | ||
| D17-1103 Sequence-to-sequence models have shown promising improvements on the temporal task of ***** video captioning *****, but they optimize word-level cross-entropy loss during training. | ||
| implicit emotion | 16 | |
| W18-6230 This paper describes an approach to solve ***** implicit emotion ***** classification with the use of pre-trained word embedding models to train multiple neural networks. | ||
| W18-6250 We describe UBC-NLP contribution to IEST-2018, focused at learning ***** implicit emotion ***** in Twitter data. | ||
| W18-6235 We present BrainT, a multi-class, averaged perceptron tested on ***** implicit emotion ***** prediction of tweets. | ||
| 2020.socialnlp-1.6 We further evaluate our pipeline quantitatively in an automated and an annotation study based on Tweets and find, indeed, that simultaneous adjustments of content and emotion are conflicting objectives: as we show in a qualitative analysis motivated by Scherer's emotion component model, this is particularly the case for ***** implicit emotion ***** expressions based on cognitive appraisal or descriptions of bodily reactions. | ||
| 2020.lrec-1.203 The corpus deals with both explicit and ***** implicit emotion *****s with more emphasis being placed on the implicit ones. | ||
| statistical translation | 16 | |
| 2008.iwslt-evaluation.18 For the pivot task, we combined the translations generated by a pivot based ***** statistical translation ***** model and a statistical transfer translation model (firstly, translating from Chinese to English, and then from English to Spanish). | ||
| L08-1470 The corpus can be used for different tasks like automatic ***** statistical translation ***** and automatic sign language recognition and it allows the specific modeling of spatial references in signing space. | ||
| 2007.iwslt-1.14 The MIT-LL/AFRL MT system implements a standard phrase-based, ***** statistical translation ***** model. | ||
| L10-1319 To address this problem, we have previously proposed a method that aimed to fill the gap between automatically transcribed text and correctly transcribed text by using a ***** statistical translation ***** technique. | ||
| 2014.amta-researchers.23 Data selection is a common technique for adapting ***** statistical translation ***** models for a specific domain, which has been shown to both improve translation quality and to reduce model size. | ||
| text annotation | 16 | |
| C16-2028 TextPro-AL is a web-based application integrating four components: a machine learning based NLP pipeline, an annotation editor for task definition and ***** text annotation *****s, an incremental re-training procedure based on active learning selection from a large pool of unannotated data, and a graphical visualization of the learning status of the system. | ||
| L10-1374 Although there are a few publicly available tools which support distributed collaborative ***** text annotation *****, most of them have complex user interfaces and require a significant amount of involvement from the annotators/contributors as well as the project developers and administrators. | ||
| 2021.dash-1.6 An open-source and ready-to-use implementation based on the ***** text annotation ***** platform is made available. | ||
| 2021.naacl-demos.5 In this paper, we introduce FITAnnotator, a generic web-based tool for efficient ***** text annotation *****. | ||
| C16-1168 In modern ***** text annotation ***** projects, crowdsourced annotations are often aggregated using item response models or by majority vote. | ||
| plain text | 16 | |
| L16-1062 The problem is that this information often comes in an unstructured format, such as ***** plain text *****. | ||
| L14-1504 This framework provides a simple interface to end users via which they can deploy one or more NLPCURATOR instances on EC2, upload ***** plain text ***** documents, specify a set of Text Analytics tools (NLP annotations) to apply, and process and store or download the processed data. | ||
| 2020.coling-main.5 Different from ***** plain text ***** passages in Web documents, Web tables and lists have inherent structures, which carry semantic correlations among various elements in tables and lists. | ||
| W18-0913 Thus, the metaphor shared task is aimed to extract metaphors from ***** plain text *****s at word level. | ||
| S18-1078 Providing such an added info, gives more insights to the ***** plain text *****, arising to hidden interpretation within the text. | ||
| survey | 16 | |
| 2020.sigdial-1.29 A total of 20 papers from the last two years are ***** survey *****ed to analyze three types of evaluation protocols: automated, static, and interactive. | ||
| L14-1186 Information about imageability of words can be obtained from the MRC Psycholinguistic Database (MRCPD) for English words and Léxico Informatizado del Español Programa (LEXESP) for Spanish words, which is a collection of human ratings obtained in a series of controlled ***** survey *****s. | ||
| 2020.nl4xai-1.5 We ***** survey ***** recent papers that integrate traditional NLG submodules in neural approaches and analyse their explainability. | ||
| R19-1089 Based on over two hundred participants, the ***** survey ***** results confirm earlier observations, that successful reproducibility requires more than having access to code and data. | ||
| 2020.coling-main.603 Motivated by the latest advances, in this ***** survey ***** we review neural unsupervised domain adaptation techniques which do not require labeled target domain data. | ||
| intelligence | 16 | |
| 2020.lrec-1.833 Source code and documentation are available at https://github.com/machine-***** intelligence *****-laboratory/TopicNet | ||
| D19-1215 A long-term goal of artificial ***** intelligence ***** is to have an agent execute commands communicated through natural language. | ||
| W19-0419 Learning to follow human instructions is a long-pursued goal in artificial ***** intelligence *****. | ||
| P19-1159 While the study of bias in artificial ***** intelligence ***** is not new, methods to mitigate gender bias in NLP are relatively nascent. | ||
| L08-1392 We discuss the problems encountered in the implementation of each approach in the context of the literature, and propose that a test based on the Turing test for machine ***** intelligence ***** offers a way forward in the evaluation of the subjective notion of text quality. | ||
| relationships | 16 | |
| 2020.lrec-1.270 We run a study on a subset of those ***** relationships ***** in order to analyse the viability of our approach. | ||
| P18-1003 While we may similarly expect that co-occurrence statistics can be used to capture rich information about the ***** relationships ***** between different words, existing approaches for modeling such ***** relationships ***** are based on manipulating pre-trained word vectors. | ||
| W19-5910 To support this argument, the research presented in this paper is structured into three stages: (i) analyzing variable dependencies in dialogue data; (ii) applying an energy-based methodology to model dialogue state tracking as a structured prediction task; and (iii) evaluating the impact of inter-slot ***** relationships ***** on model performance. | ||
| N18-2106 We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between characters and their social ***** relationships *****. | ||
| 2020.emnlp-main.187 By inferring typological features and language phylogenies, we observe that our representations embed typology and strengthen correlations with language ***** relationships *****. | ||
| acquisition | 16 | |
| L14-1638 Such knowledge resources can be derived from automatic parses of raw corpora, but unfortunately parsing still has not achieved a high enough performance for precise knowledge ***** acquisition *****. | ||
| L12-1156 In this paper, we present the ***** acquisition ***** and labeling processes of the EDECAN-SPORTS corpus, which is a corpus that is oriented to the development of multimodal dialog systems acquired in Spanish and Catalan. | ||
| Q13-1026 In the context of language ***** acquisition *****, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). | ||
| 2020.acl-main.684 This is an interesting example of pragmatic language ***** acquisition ***** without any linguistic annotation. | ||
| 2021.cmcl-1.24 We evaluate learning using a series of tasks inspired by methods commonly used in laboratory studies of language ***** acquisition *****. | ||
| web corpus | 16 | |
| L16-1358 We describe the resulting lexical resource containing several dozens of MWEs in four dialects and we propose a method for constructing a ***** web corpus ***** as a support for crowdsourcing examples of MWE occurrences. | ||
| L16-1056 We describe the infrastructure we developed to iterate over the ***** web corpus ***** for extracting the hypernymy relations and store them effectively into a large database. | ||
| L14-1068 DerivBase.hr groups 100k lemmas from ***** web corpus ***** hrWaC into 56k clusters of derivationally related lemmas, so-called derivational families. | ||
| 2021.acl-short.24 In this exploratory analysis, we delve deeper into the Common Crawl, a colossal ***** web corpus ***** that is extensively used for training language models. | ||
| 2020.wac-1.1 In this paper we discuss some of the current challenges in ***** web corpus ***** building that we faced in recent years when expanding the corpora in Sketch Engine. | ||
| transformer language | 16 | |
| N19-1112 To shed light on the linguistic knowledge they capture, we study the representations produced by several recent pretrained contextualizers (variants of ELMo, the OpenAI ***** transformer language ***** model, and BERT) with a suite of sixteen diverse probing tasks. | ||
| 2021.emnlp-main.133 Empirically, we document norm growth in the training of ***** transformer language ***** models, including T5 during its pretraining. | ||
| 2020.findings-emnlp.414 The success of pretrained ***** transformer language ***** models (LMs) in natural language processing has led to a wide range of pretraining setups. | ||
| 2021.emnlp-main.751 KnowMAN strongly improves results compared to direct weakly supervised learning with a pre-trained ***** transformer language ***** model and a feature-based baseline. | ||
| 2021.wmt-1.18 Specifically, we use a combination of 1) checkpoint averaging 2) model scaling 3) data augmentation with backtranslation and knowledge distillation from right-to-left factorized models 4) finetuning on test sets from previous years 5) model ensembling 6) shallow fusion decoding with ***** transformer language ***** models and 7) noisy channel re-ranking. | ||
| scores | 16 | |
| 2021.wmt-1.89 Our submissions (Tencent AI Lab Machine Translation, TMT) in German/French/Spanish⇒English are ranked 1st respectively according to the official evaluation results in terms of BLEU ***** scores *****. | ||
| W18-1704 Additional tests, which take advantage of the fact that the length of compressions can be modulated, still improve ROUGE ***** scores ***** with shorter output sentences. | ||
| U18-1010 Our model effectively weights each worker's ***** scores ***** based on the inferred precision of the worker, and is much more reliable than the mean of either the raw ***** scores ***** or the standardised ***** scores *****. | ||
| L10-1336 We define six ***** scores ***** to filter, based on translation redundancy and FrameNet structure. | ||
| 2020.figlang-1.21 Our best model achieves F1 ***** scores ***** of 73.0% on VUA ALLPOS, 77.1% on VUA VERB, 70.3% on TOEFL ALLPOS and 71.9% on TOEFL VERB. | ||
| including | 16 | |
| 2020.wnut-1.39 This paper presents our teamwork on WNUT 2020 shared task-1: wet lab entity extract, that we conducted studies in several models, ***** including ***** a BiLSTM CRF model and a Bert case model which can be used to complete wet lab entity extraction. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, ***** including ***** sentiment analysis, text classification, and Word Sense Disambiguation. | ||
| 2020.acl-main.541 Further experimental results using various multimodal synthesis techniques highlight the challenge presented by our dataset, ***** including ***** non-local constraints and multi-modal inputs. | ||
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur machine translation system which consists of verbal suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary ***** including ***** about 20,000 words. | ||
| C16-1320 As an additional objective, we discuss two novel use cases ***** including ***** automatically extracting links to public datasets from the proceedings, which would further accelerate the advancement in digital libraries. | ||
| binary classification | 16 | |
| 2020.trac-1.4 The contribution of this paper is the design of ***** binary classification ***** and regression-based approaches aiming to predict whether a comment is toxic or not. | ||
| 2021.semeval-1.98 We solve the problem as a ***** binary classification ***** problem and also experiment with data augmentation and adversarial training techniques. | ||
| L14-1491 Both regression - in order to predict the exact heart rate value - and a ***** binary classification ***** setting for high and low heart rate classes are investigated. | ||
| R19-1020 We treat this task as ***** binary classification ***** (alignment/non-alignment). | ||
| S19-2125 Our system achieves 80 - 90% accuracy for the ***** binary classification ***** problems (offensive vs not offensive and targeted vs untargeted) and 63% accuracy for trinary classification (group vs individual vs other). | ||
| simple | 16 | |
| K18-1001 However, the ***** simple ***** graphical model structure belies the often complex non-local constraints between output labels. | ||
| 2020.lrec-1.855 The corpus database is distributed to permit fast indexing, and provides a ***** simple ***** web front-end with corpus linguistics methods for sub-corpus comparison and retrieval. | ||
| 2020.emnlp-main.283 In this paper, we propose a ***** simple ***** method to provide annotations for most unambiguous words in a large corpus. | ||
| P19-1010 Combined with a decoder copy mechanism, this approach provides a conceptually ***** simple ***** mechanism to generate logical forms with entities. | ||
| 2020.clinicalnlp-1.19 In addition, we apply temperature scaling, a ***** simple ***** but efficient model calibration method, to produce more reliable predictions. | ||
| data mining | 16 | |
| 2020.peoples-1.11 We provide an easy-to-use and open source python library for predicting emojis with BERTmoticon so that the model can easily be applied to other ***** data mining ***** tasks. | ||
| L08-1510 The CallSurf project gathers a number of academic and industrial partners covering the complete platform, from automatic transcription to information retrieval and ***** data mining *****. | ||
| 2020.wnut-1.52 However, the discussions on the topic of infectious diseases that are informative in nature also span various topics such as news, politics and humor which makes the ***** data mining ***** challenging. | ||
| L12-1605 Automatically segmenting and classifying clinical free text into sections is an important first step to automatic information retrieval, information extraction and ***** data mining ***** tasks, as it helps to ground the significance of the text within. | ||
| L14-1264 We present a methodology to analyze the linguistic evolution of scientific registers with *****data mining***** techniques, comparing the insights gained from shallow vs. linguistic features. | ||
| positive | 16 | |
| 2020.semeval-1.159 To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for ***** positive *****, negative and neutral category predictions. | ||
| 2020.coling-main.357 This is possible because existing alternation datasets contain ***** positive *****, but no negative instances and are not comprehensive. | ||
| W19-3015 Speech samples were obtained from healthy controls and patients with a diagnosis of schizophrenia or schizoaffective disorder and different severity of ***** positive ***** formal thought disorder. | ||
| 2004.amta-papers.27 This paper describes our experience in deploying this system and the (***** positive *****) customer response to the availability of machine translated articles, as well as other uses of MSR-MT either planned or underway at Microsoft. | ||
| W19-3502 Interactions among users on social network platforms are usually ***** positive *****, constructive and insightful. | ||
| confidence | 16 | |
| L04-1250 We present a supervised method for training a sentence level ***** confidence ***** measure on translation output using a human-annotated corpus. | ||
| 2021.naacl-main.68 Extensive experiments on two benchmark datasets show that BEUrRE consistently outperforms baselines on ***** confidence ***** prediction and fact ranking due to its probabilistic calibration and ability to capture high-order dependencies among facts. | ||
| 2011.iwslt-papers.10 Finally, ***** confidence ***** scores provide a new accuracy-based feature to score phrase pairs. | ||
| 2014.iwslt-papers.3 In the past, this task has been treated separately in ASR or MT contexts and we propose here a joint estimation of word ***** confidence ***** for a spoken language translation (SLT) task involving both ASR and MT. | ||
| P18-2047 We offer a simple and effective method to seek a better balance between model ***** confidence ***** and length preference for Neural Machine Translation (NMT). | ||
| large annotated | 16 | |
| C16-1095 In the absence of ***** large annotated ***** corpora, parallel corpora, treebanks, bilingual lexica, etc., we found the following to be effective: exploiting distributional regularities in monolingual data, projecting information across closely related languages, and utilizing human linguist judgments. | ||
| D19-1100 Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess ***** large annotated ***** corpora for named entity recognition. | ||
| L08-1022 Modern statistical parsers are trained on ***** large annotated ***** corpora (treebanks). | ||
| 2021.naacl-main.127 The lack of ***** large annotated ***** training data causes poor performance especially in relation labeling. | ||
| 2020.lrec-1.82 To apply machine learning approaches to the development of the modules, we created ***** large annotated ***** datasets of 280,467 question-response pairs and 38,868 voluntary utterances. | ||
| difficult | 16 | |
| W17-5001 We also find that ***** difficult *****y is mirrored in the amount of variation in student answers, which can be computed before grading. | ||
| C16-1121 We present a successful collaboration of word embeddings and co-training to tackle in the most ***** difficult ***** test case of semantic role labeling: predicting out-of-domain and unseen semantic frames. | ||
| 2021.eval4nlp-1.1 To address this shortcoming, we introduce the notion of differential evaluation which effectively defines a pragmatic partition of instances into gradually more ***** difficult ***** bins by leveraging the predictions made by a set of systems. | ||
| W16-3809 One ***** difficult *****y encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. | ||
| L08-1479 However recognition of proper nouns is commonly considered as a ***** difficult ***** task. | ||
| Deep learning | 16 | |
| 2021.acl-short.10 *****Deep learning***** algorithms have shown promising results in visual question answering (VQA) tasks, but a more careful look reveals that they often do not understand the rich signal they are being fed with. | ||
| P18-2120 *****Deep learning***** approaches for sentiment classification do not fully exploit sentiment linguistic knowledge. | ||
| I17-1065 *****Deep learning***** models have recently been applied successfully in natural language processing, especially sentiment analysis. | ||
| 2020.emnlp-main.12 *****Deep learning***** models for linguistic tasks require large training datasets, which are expensive to create. | ||
| C16-1212 *****Deep learning***** techniques are increasingly popular in the textual entailment task, overcoming the fragility of traditional discrete models with hard alignments and logics. | ||
| Neural network | 16 | |
| C18-1161 *****Neural network***** approaches to Named-Entity Recognition reduce the need for carefully hand-crafted features. | ||
| W19-1917 *****Neural network***** models have shown promise in the temporal relation extraction task. | ||
| W18-5450 *****Neural network***** methods are experiencing wide adoption in NLP, thanks to their empirical performance on many tasks. | ||
| W19-4810 *****Neural network***** models have been very successful in natural language inference, with the best models reaching 90% accuracy in some benchmarks. | ||
| W19-4823 *****Neural network***** architectures have been augmented with differentiable stacks in order to introduce a bias toward learning hierarchy-sensitive regularities. | ||
| Multilingual Offensive Language | 16 | |
| 2020.semeval-1.255 This paper describes the Duluth systems that participated in SemEval2020 Task 12, *****Multilingual Offensive Language***** Identification in Social Media (OffensEval2020). | ||
| 2020.semeval-1.280 The paper presents a system developed for the SemEval-2020 competition Task 12 (OffensEval-2): *****Multilingual Offensive Language***** Identification in Social Media. | ||
| 2020.semeval-1.279 This paper presents our hierarchical multi-task learning (HMTL) and multi-task learning (MTL) approaches for improving the text encoder in Sub-tasks A, B, and C of *****Multilingual Offensive Language***** Identification in Social Media (SemEval-2020 Task 12). | ||
| 2020.semeval-1.281 This paper describes a system (pin_cod_) built for SemEval 2020 Task 12: OffensEval: *****Multilingual Offensive Language***** Identification in Social Media (Zampieri et al., 2020). | ||
| 2020.semeval-1.300 This article describes the system submitted to SemEval-2020 Task 12 OffensEval 2: *****Multilingual Offensive Language***** Recognition in Social Media. | ||
| transition-based | 16 | |
| N19-1076 We propose a novel *****transition-based***** algorithm that straightforwardly parses sentences from left to right by building n attachments, with n being the length of the input sentence. | ||
| 2020.ccl-1.76 In Chinese dependency parsing, the joint model of word segmentation, POS tagging and dependency parsing has become the mainstream framework because it can eliminate error propagation and share knowledge, where the *****transition-based***** model with feature templates maintains the best performance. | ||
| D19-1277 Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: *****transition-based***** parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. | ||
| 2020.lrec-1.642 We investigate a *****transition-based***** parser that uses Eukalyptus, a function-tagged constituent treebank for Swedish which includes discontinuous constituents. | ||
| K19-1023 We present a new method for *****transition-based***** parsing where a solution is a pair made of a dependency tree and a derivation graph describing the construction of the former. | ||
| SemEval-2020 Task | 16 | |
| 2020.semeval-1.285 In this paper, we present our approaches and results for *****SemEval-2020 Task***** 12, Multilingual Offensive Language Identification in Social Media (OffensEval 2020). | ||
| 2020.semeval-1.245 The article describes a fast solution to propaganda detection at *****SemEval-2020 Task***** 11, based on feature adjustment. | ||
| 2020.semeval-1.215 This paper describes the model we apply in the *****SemEval-2020 Task***** 10. | ||
| 2020.semeval-1.106 This paper describes our system that was designed for Humor evaluation within the *****SemEval-2020 Task***** 7. | ||
| 2020.semeval-1.267 This paper presents the approach of Team KAFK for the English edition of *****SemEval-2020 Task***** 12. | ||
| pre- | 16 | |
| P17-1031 Automated processing of historical texts often relies on *****pre-*****normalization to modern word forms. | ||
| 2021.acl-long.55 Our analysis also clearly illustrates the benefits of *****pre-*****training. | ||
| 2020.lrec-1.365 However, existing word analogy datasets have tended to be handcrafted, involving permutations of hundreds of words with only dozens of *****pre-*****defined relations, mostly morphological relations and named entities. | ||
| P19-3002 Many annotation tools have been developed, covering a wide variety of tasks and providing features like user management, *****pre-*****processing, and automatic labeling. | ||
| E17-2099 We explore combinations of linguistically motivated approaches to address these problems in English-to-German SMT and show that they are complementary to one another, but also that the popular verbal *****pre-*****ordering can cause problems on the morphological and lexical level. | ||
| educational | 16 | |
| W17-5908 Spelling errors occur frequently in *****educational***** settings, but their influence on automatic scoring is largely unknown. | ||
| 2021.bea-1.18 We present a new task in *****educational***** NLP, recommend the best interventions to help special needs education professionals to work with students with different disabilities. | ||
| 2021.rocling-1.51 This paper presents the ROCLING 2021 shared task on dimensional sentiment analysis for *****educational***** texts which seeks to identify a real-value sentiment score of self-evaluation comments written by Chinese students in the both valence and arousal dimensions. | ||
| 2021.rocling-1.45 Sentiment analysis has become a popular research issue in recent years, especially on *****educational***** texts which is an important problem. | ||
| 2020.aacl-srw.17 Automated Essay Scoring (AES) is a process that aims to alleviate the workload of graders and improve the feedback cycle in *****educational***** systems. | ||
| Neural Machine | 16 | |
| 2020.acl-main.144 This paper explores data augmentation methods for training *****Neural Machine***** Translation to make use of similar translations, in a comparable way a human translator employs fuzzy matches. | ||
| 2021.ranlp-1.4 In this paper, we present a novel approach for domain adaptation in Neural Machine Translation which aims to improve the translation quality over a new domain. Adapting new domains is a highly challenging task for *****Neural Machine***** Translation on limited data, it becomes even more difficult for technical domains such as Chemistry and Artificial Intelligence due to specific terminology, etc. | ||
| E17-3017 We present Nematus, a toolkit for *****Neural Machine***** Translation. | ||
| 2020.lrec-1.463 We present a comparative evaluation of casing methods for *****Neural Machine***** Translation, to help establish an optimal pre- and post-processing methodology. | ||
| D18-1040 *****Neural Machine***** Translation has achieved state-of-the-art performance for several language pairs using a combination of parallel and synthetic data. | ||
| Spanish | 16 | |
| L14-1218 In this paper we present the results of an ongoing experiment of bootstrapping a Treebank for Catalan by using a Dependency Parser trained with *****Spanish***** sentences. | ||
| 2016.lilt-14.4 Our annotation model captures four basic modal meanings and their subtypes, on the one hand, and provides a fine-grained characterisation of the syntactic realisations of those meanings in English and *****Spanish*****, on the other. | ||
| 2018.gwc-1.3 We present some strategies for improving the Spanish version of WordNet, part of the MCR, selecting new lemmas for the *****Spanish***** synsets by translating the lemmas of the corresponding English synsets. | ||
| 2003.mtsummit-papers.40 The goal of the AMETRA project is to make a computer-assisted translation tool from the *****Spanish***** language to the Basque language under the memory-based translation framework. | ||
| S18-1053 Task 1 in the International Workshop SemEval 2018, Affect in Tweets, introduces five subtasks (El-reg, El-oc, V-reg, V-oc, and E-c) to detect the intensity of emotions in English, Arabic, and *****Spanish***** tweets. | ||
| self- | 16 | |
| 2021.reinact-1.1 The next generation of conversational AI systems need to: (1) process language incrementally, token-by-token to be more responsive and enable handling of conversational phenomena such as pauses, restarts and *****self-*****corrections; (2) reason incrementally allowing meaning to be established beyond what is said; (3) be transparent and controllable, allowing designers as well as the system itself to easily establish reasons for particular behaviour and tailor to particular user groups, or domains. | ||
| N19-1127 Neural networks equipped with *****self-*****attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies. | ||
| R19-1028 In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into *****self-*****attention on both source and target sides, dependency-based self-attention. | ||
| 2021.sigdial-1.7 We propose a novel on-device neural sequence labeling model which uses embedding-free projections and character information to construct compact word representations to learn a sequence model using a combination of bidirectional LSTM with *****self-*****attention and CRF. | ||
| 2021.naacl-main.16 Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via *****self-*****supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. | ||
| procedural | 16 | |
| L14-1594 In this paper, we present our attempt at annotating *****procedural***** texts with a flow graph as a representation of understanding. | ||
| W19-3647 Towards *****procedural***** fidelity in the processing of African English speech corpora, this work demonstrates how the adaptation of machine-assisted segmentation of phonemes and automatic extraction of acoustic values can significantly speed up the processing of naturalistic data and make the vocalic analysis of the varieties less impressionistic. | ||
| D18-1006 Comprehending *****procedural***** text, e.g., a paragraph describing photosynthesis, requires modeling actions and the state changes they produce, so that questions about entities at different timepoints can be answered. | ||
| N19-1412 Understanding *****procedural***** language requires reasoning about both hierarchical and temporal relations between events. | ||
| W19-2609 Understanding *****procedural***** text requires tracking entities, actions and effects as the narrative unfolds. | ||
| Multidimensional | 15 | |
| W19-8715 The analyses are based on publicly available NMT post editing data annotated for errors in three language pairs (EN-DE, EN-LV, EN-HR) with the ***** Multidimensional ***** Quality Metrics (MQM). | ||
| 2020.emnlp-main.213 To showcase our framework, we train three models with different types of human judgements: Direct Assessments, Human-mediated Translation Edit Rate and ***** Multidimensional ***** Quality Metric. | ||
| 2020.eamt-1.14 To this end, we develop an error taxonomy compliant with the ***** Multidimensional ***** Quality Metrics (MQM) framework that is customised to the relevant phenomena of this translation direction. | ||
| 2021.wmt-1.111 With this year's focus on ***** Multidimensional ***** Quality Metric (MQM) as the ground-truth human assessment, our aim was to steer COMET towards higher correlations with MQM. | ||
| 2021.wmt-1.73 Contrary to previous years' editions, this year we acquired our own human ratings based on expert-based human evaluation via ***** Multidimensional ***** Quality Metrics (MQM). | ||
| homonymy | 15 | |
| L16-1447 In the current version, BECL also provides information on nouns whose senses occur in more than one class allowing a closer look on polysemy and ***** homonymy ***** with regard to countability. | ||
| 2021.gwc-1.4 Experimental evaluation shows that our approach sets a new state of the art for ***** homonymy ***** detection. | ||
| 2021.gwc-1.34 WordNet lacks any explicit description of polysemy or ***** homonymy *****, but as a network of linked senses it may be used to compute semantic distances between word senses. | ||
| 2021.gwc-1.7 The algorithms not only ruled out most cases of ***** homonymy ***** but also were efficacious in distinguishing between closer and indirect semantic relatedness. | ||
| 2020.cogalex-1.16 Our work investigates whether recent advances in NLP, specifically contextualized word embeddings, capture human-like distinctions between English word senses, such as polysemy and ***** homonymy *****. | ||
| tokenizer | 15 | |
| L12-1474 These keywords were extracted using NLP tools such as ***** tokenizer *****, sentence boundary detection and part-of-speech tagging applied to the text extracted from the original PDF papers (currently 22,500). | ||
| L14-1326 In addition, we present open source tools for automatic analysis of Persian containing a text normalizer, a sentence segmenter and ***** tokenizer *****, a part-of-speech tagger, and a parser. | ||
| W18-2502 Users may apply them without awareness of their surprising omissions (e.g. “hasn't” but not “hadn't”) and inclusions (“computer”), or their incompatibility with a particular ***** tokenizer *****. | ||
| L06-1092 The revised ***** tokenizer ***** increases the coverage of the grammar in terms of full parses from 68.3% to 73.4% on sentences 8,001 through 10,000 of the TiGer Corpus. | ||
| 2021.wnut-1.45 We tested our ***** tokenizer ***** by performing classification tasks on Korean user-generated movie reviews and hate speech datasets, and the Korean Named Entity Recognition dataset. | ||
| distilling | 15 | |
| 2020.coling-main.489 Given a set of KBs, our proposed approach KD-MKB, learns KB embeddings by mutually and jointly ***** distilling ***** knowledge within a dynamic teacher-student setting. | ||
| C16-1035 First, we propose a novel unsupervised paragraph embedding method, named the essence vector (EV) model, which aims at not only ***** distilling ***** the most representative information from a paragraph but also excluding the general background information to produce a more informative low-dimensional vector representation for the paragraph. | ||
| 2021.acl-long.162 Our preliminary study as well as the recent success in pre-training suggests that transferring parameters are more effective in ***** distilling ***** knowledge. | ||
| 2020.iwpt-1.2 When ***** distilling ***** to 20% of the original model's trainable parameters, we only observe an average decrease of ∼1 point for both UAS and LAS across a number of diverse Universal Dependency treebanks while being 2.30x (1.19x) faster than the baseline model on CPU (GPU) at inference time. | ||
| 2021.sustainlp-1.3 Finally, we demonstrate that ***** distilling ***** a larger model (BERT Large) results in the strongest distilled model that performs best both on the source language as well as target languages in zero-shot settings. | ||
| relying | 15 | |
| L16-1226 We then present our implementation of an active learning scenario for person annotation in video, ***** relying ***** on the CAMOMILE server; during a dry run experiment, the manual annotation of 716 speech segments was thus propagated to 3504 labeled tracks. | ||
| 2021.emnlp-main.206 As far as we know, existing neural-based ED models make decisions ***** relying ***** entirely on the contextual semantic features of each word in the inputted text, which we find is easy to be confused by the varied contexts in the test stage. | ||
| P19-1036 Our experiments on 5 standard corpora show that the proposed method increases F1-score over ***** relying ***** solely on human expertise and can also be on par with simple supervised approaches. | ||
| 2021.emnlp-main.109 In this work, we demonstrate that it is possible to turn MLMs into effective lexical and sentence encoders even without any additional data, ***** relying ***** simply on self-supervision. | ||
| N19-1240 We introduce a neural model which integrates and reasons ***** relying ***** on information spread within documents and across multiple documents. | ||
| Thereby | 15 | |
| 2021.acl-long.367 ***** Thereby *****, they cannot perform well on targets and opinions which contain multiple words. | ||
| C16-1172 ***** Thereby *****, we use either only the output of the phrase-based machine translation (PBMT) system or a combination of the PBMT output and the source sentence. | ||
| C18-1140 ***** Thereby *****, the meta-embedding space is enforced to capture complementary information in different source embeddings via a coherent common embedding space. | ||
| P19-2020 ***** Thereby *****, diverse corrected sentences is obtained from a single erroneous sentence. | ||
| L16-1022 ***** Thereby *****, it relies on the distributional hypothesis implying that similar words have similar contexts. | ||
| hashing | 15 | |
| P17-2063 We evaluate feature ***** hashing ***** for language identification (LID), a method not previously used for this task. | ||
| D18-1524 Finally we describe a semantic ***** hashing ***** layer that allows our model to learn generic binary codes for sentences. | ||
| 2020.findings-emnlp.233 In this paper, we present a simple but effective unsupervised neural generative semantic ***** hashing ***** method with a focus on few-bits ***** hashing *****. | ||
| P18-1190 In this paper, we present an end-to-end Neural Architecture for Semantic Hashing (NASH), where the binary ***** hashing ***** codes are treated as Bernoulli latent variables. | ||
| D19-6120 Representations of the instances of source and target datasets are learned, retrieval of relevant source instances is performed using soft-attention mechanism and locality sensitive ***** hashing ***** and then augmented into the model during training on the target dataset. | ||
| inverse | 15 | |
| 2020.wosp-1.5 However, the current term weighting formula (TF-IDF, for instance), weighs terms only based on term frequency and ***** inverse ***** document frequency irrespective of other important factors. | ||
| 2021.semspace-1.6 In particular, we propose and motivate a new logical negation using matrix ***** inverse *****. | ||
| 2020.lrec-1.22 Finally, we perform the same experiments on Korean Hangul, a non-alphabetic writing system, where we find the opposite results: slower responses as a function of denser neighborhoods, and a negative effect of ***** inverse ***** feature weighting. | ||
| 2010.amta-papers.25 In this paper, we extend the HPB model with maximum entropy based bracketing transduction grammar (BTG), which provides content-dependent combination of neighboring phrases in two ways: serial or ***** inverse *****. | ||
| P19-1397 In order to utilise the decoder after learning, we present two types of decoding functions whose ***** inverse ***** can be easily derived without expensive ***** inverse ***** calculation. | ||
| cognition | 15 | |
| 2020.lrec-1.286 Various research works have dealt with the comprehensibility of textual, audio, or audiovisual documents, and showed that factors related to text (e.g. linguistic complexity), sound (e.g. speech intelligibility), image (e.g. presence of visual context), or even to ***** cognition ***** and emotion can play a major role in the ability of humans to understand the semantic and pragmatic contents of a given document. | ||
| 2021.emnlp-main.151 Internet search affects people's ***** cognition ***** of the world, so mitigating biases in search results and learning fair models is imperative for social good. | ||
| S18-2010 We can conclude that the use of human references as ground truth for cross-language word embeddings is not proper unless one does not understand how do native speakers process semantics in their ***** cognition *****. | ||
| P19-1506 This finding is compatible with the theory of situated ***** cognition *****: language is inseparable from its physical context. | ||
| D17-1048 The predicted reading time is then used to build a ***** cognition ***** based attention (CBA) layer for neural sentiment analysis. | ||
| abstracting | 15 | |
| L10-1099 A BLARK as normally presented in the literature arguably reflects a modern standard language, which is topic- and genre-neutral, thus ***** abstracting ***** away from all kinds of language variation. | ||
| 2001.jeptalnrecital-poster.3 By comparing full text sentences used in ***** abstracting ***** with correspond-ing sentences in abstract, the study found such units to include metadiscourse phrases, parenthetical texts, redundant units inserted for emphasis, or are repetitions. | ||
| 2001.jeptalnrecital-long.12 The paper also discusses some prerequisites and difficulties anticipated for ***** abstracting ***** systems. | ||
| L14-1011 This research describes the challenges therein, including the development of new annotation practices that walk the line between ***** abstracting ***** away from language-particular syntactic facts to explore deeper semantics, and maintaining the connection between semantics and syntactic structures that has proven to be very valuable for PropBank as a corpus of training data for Natural Language Processing applications. | ||
| L10-1130 By ***** abstracting ***** away to a common Roman transliteration scheme in the respective transliterators, our system can be enabled to handle both languages in parallel. | ||
| Namely | 15 | |
| C18-1202 ***** Namely *****, they are, directly or indirectly, based on the counts of distinct word types, and spelling errors undesirably increase the number of distinct words. | ||
| E17-1009 ***** Namely *****, we present an unsupervised, knowledge-free WSID approach, which is interpretable at three levels: word sense inventory, sense feature representations, and disambiguation procedure. | ||
| N18-1186 ***** Namely *****, we utilize the queries' memory, the responses' memory, and their unified memory, following the time sequence of the conversation session. | ||
| 2006.amta-papers.19 ***** Namely ***** that unlike with humans, MT systems perform more poorly at both level zero and one than at level two and three. | ||
| W17-3538 ***** Namely *****, linguistic description of complex phenomena constitutes a mature research line. | ||
| SQuAD dataset | 15 | |
| P17-1018 We conduct extensive experiments on the ***** SQuAD dataset *****. | ||
| 2020.acl-main.413 Training a QA model on this data gives a relative improvement over a previous unsupervised model in F1 score on the ***** SQuAD dataset ***** by about 14%, and 20% when the answer is a named entity, achieving state-of-the-art performance on SQuAD for unsupervised QA. | ||
| 2020.findings-emnlp.145 On the ***** SQuAD dataset *****, our proposed method achieves 70.14% F1 score with supervision from 26 explanations, comparable to plain supervised learning using 1,100 labeled instances, yielding a 12x speed up. | ||
| P18-1156 Indeed, we observe that state-of-the-art neural RC models which have achieved near human performance on the ***** SQuAD dataset *****, even when coupled with traditional NLP techniques to address the challenges presented in DuoRC exhibit very poor performance (F1 score of 37.42% on DuoRC v/s 86% on ***** SQuAD dataset *****). | ||
| 2021.mrqa-1.9 Many QA models, such as those for the ***** SQuAD dataset *****, are trained and tested on a subset of Wikipedia articles which encode their own biases and also reproduce real-world inequality. | ||
| locality | 15 | |
| W89-0235 These structures specify extended domains of ***** locality ***** (as compared to CFGs) over which constraints can be stated. | ||
| N19-1109 By relaxing the strong constraint of ***** locality *****, our method is able to capture both the local and non-local co-occurrences. | ||
| 2020.acl-main.181 We find that subjectivity, information ***** locality *****, and information gain are all strong predictors, with some evidence for a two-factor account, where subjectivity and information gain reflect a factor involving semantics, and information ***** locality ***** reflects collocational preferences. | ||
| C16-1315 Our experimental results show that the proposed sequential interleaving method based on ***** locality ***** sensitive hashing (LSH) technology is efficient in boosting the comparison speed among probability distributions, and the proposed framework can generate meaningful labels to interpret topics, including new emerging topics. | ||
| 2020.emnlp-main.610 Our findings show the promise of syntactic dependency trees in encoding semantic role relations within their syntactic domain of ***** locality *****, and point to potential further integration of syntactic methods into semantic role labeling in the future | ||
| compounding | 15 | |
| L12-1628 A phrase is called sense stable if the senses of all the words ***** compounding ***** it do not change their sense irrespective of the context which could be added to its left or to its right. | ||
| 1995.iwpt-1.24 The algorithm can be applied to the morphological analysis of any language whose morphology is fully captured by a single (and possibly very large) finite state transducer, regardless of the word formation processes (such as agglutination or productive ***** compounding *****) and morphographemic phenomena involved. | ||
| D18-1091 However, if a query resembles a well-formed question, a natural language processing pipeline is able to perform more accurate interpretation, thus reducing downstream ***** compounding ***** errors. | ||
| L10-1591 This paper proposes statistical analysis methods for improvement of terminology entry ***** compounding *****. | ||
| L08-1261 For *****compounding***** languages, a great part of the topical semantics is transported via nominal compounds. | ||
| Shapley | 15 | |
| 2021.naacl-main.402 In this paper, we develop , an efficient source valuation framework for quantifying the usefulness of the sources (e.g., ) in transfer learning based on the ***** Shapley ***** value method. | ||
| 2021.naacl-main.223 We apply power indices from cooperative game theory, including the ***** Shapley ***** value and Banzhaf index, that measure the relative importance of individual team members in accomplishing a joint task. | ||
| 2021.emnlp-main.452 Ablation studies show that both latent optimization and the use of ***** Shapley ***** values improve success rate and the quality of the generated counterfactuals. | ||
| 2021.acl-long.283 We then denoise the weakly labeled data using the ***** Shapley ***** algorithm. | ||
| 2021.acl-short.8 We formally prove that — save for the degenerate case — attention weights and leave-one-out values cannot be ***** Shapley ***** Values | ||
| utilizing | 15 | |
| D18-1385 When implemented, machine learning methods ***** utilizing ***** such features achieved the state-of-the-art rumor verification results. | ||
| L10-1577 In this paper, we present the results of an experiment with ***** utilizing ***** a stochastic morphosyntactic tagger as a pre-processing module of a rule-based chunker and partial parser for Croatian in order to raise its overall chunking and partial parsing accuracy on Croatian texts. | ||
| 2021.semeval-1.25 We tackle this problem ***** utilizing ***** a combination of a state-of-the-art pre-trained language model (CharacterBERT) and a traditional bag-of-words technique. | ||
| 2020.emnlp-main.336 This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations and then ***** utilizing ***** a multi-view decoder to incorporate different views to generate dialogue summaries. | ||
| W17-6003 In this paper, we hypothesize that in addition to semantic information, sub-character components may also carry emotional information, and that ***** utilizing ***** it should improve performance on sentiment analysis tasks | ||
| Wikinews | 15 | |
| W19-8901 The task measures the performance of multilingual headline generation systems using the Wikipedia and ***** Wikinews ***** articles in multiple languages. | ||
| K19-1049 In addition, it can retrieve candidates extremely fast, and generalizes well to a new dataset derived from ***** Wikinews *****. | ||
| 2020.emnlp-main.432 We present the construction of a corpus of 500 ***** Wikinews ***** articles annotated with temporal dependency graphs (TDGs) that can be used to train systems to understand temporal relations in text. | ||
| L16-1307 Only ***** Wikinews ***** documents are manually annotated and can be used for evaluation, while the others can be used for unsupervised learning. | ||
| 2020.lrec-1.257 WN-Salience is built on top of ***** Wikinews *****, a Wikimedia project whose mission is to present reliable news articles | ||
| exploit | 15 | |
| 2021.acl-long.363 To address the challenge that free-text relations are ambiguous, previous methods ***** exploit ***** neighbor entities and relations for additional context. | ||
| L16-1140 The experiments presented here ***** exploit ***** the properties of the Apertium RDF Graph, principally cycle density and nodes' degree, to automatically generate new translation relations between words, and therefore to enrich existing bilingual dictionaries with new entries. | ||
| 2011.iwslt-evaluation.5 The rules we used ***** exploit ***** punctuation and spacing in the input utterances, and we use these positions to delimit our segments. | ||
| W17-4306 The learning and inference models ***** exploit ***** the structure of the unified graph as well as the global first order domain constraints beyond the data to predict the semantics which forms a structured meaning representation of the spatial context. | ||
| P19-1459 We analyze the nature of these cues and demonstrate that a range of models all ***** exploit ***** them | ||
| copying | 15 | |
| N19-1360 We design an LSTM-based encoder-decoder architecture that models context dependency through ***** copying ***** mechanisms and multiple levels of attention over inputs and previous outputs. | ||
| W18-5807 Most finite-state implementations of computational morphology cannot adequately capture the productivity of unbounded ***** copying ***** in reduplication, nor can they adequately capture bounded ***** copying *****. | ||
| W18-3027 We employ an attention-based model enriched with a ***** copying ***** mechanism to ensure faithful regeneration of the input sequence, while enabling interleaved generation of argument role labels. | ||
| P19-1533 Moreover, because this ***** copying ***** is label-agnostic, we can achieve impressive performance in zero-shot sequence-labeling tasks. | ||
| 2021.emnlp-main.336 The *****copying***** mechanism has had considerable success in abstractive summarization, facilitating models to directly copy words from the input text to the output summary. | ||
| maximal | 15 | |
| L08-1105 To derive ***** maximal ***** benefit from the semantic information provided by these resources, the MASC will also include manually-validated shallow parses and named entities, which will enable linking WordNet senses and FrameNet frames within the same sentences into more complex semantic structures and, because named entities will often be the role fillers of FrameNet frames, enrich the semantic and pragmatic information derivable from the sub-corpus. | ||
| L12-1362 In this work, we therefore aim at defining specific parameters that classify differences in genres of spoken and written texts such as the preferred segmentation strategy, the ***** maximal ***** allowed distance in or the length and size of coreference chains as well as the correlation of structural and syntactic features of coreferring expressions. | ||
| D18-1446 It exploits the ***** maximal ***** marginal relevance method to select representative sentences from multi-document input, and leverages an abstractive encoder-decoder model to fuse disparate sentences to an abstractive summary. | ||
| 2020.findings-emnlp.266 As a measure of robustness, we adopt the notion of the ***** maximal ***** safe radius for a given input text, which is the minimum distance in the embedding space to the decision boundary. | ||
| D18-1124 Our shift-reduce based system then learns to construct the forest structure in a bottom-up manner through an action sequence whose ***** maximal ***** length is guaranteed to be three times of the sentence length | ||
| essays | 15 | |
| W19-4510 In recent years, argumentation mining, which automatically extracts the structure of argumentation from unstructured documents such as ***** essays ***** and debates, is gaining attention. | ||
| W18-0605 Teams aimed to predict mental health outcomes from ***** essays ***** written by 11-year-olds about what they believed their lives would be like at age 25. | ||
| W17-5024 While most of our kernels are based on character p-grams (also known as n-grams) extracted from ***** essays ***** or speech transcripts, we also use a kernel based on i-vectors, a low-dimensional representation of audio recordings, provided by the shared task organizers. | ||
| W19-4404 Based on texts written by students for the official school-leaving state examination (Abitur), we show that teachers successfully assign higher language performance grades to ***** essays ***** with higher task-appropriate language complexity and properly separate this from content scores. | ||
| 2020.aacl-main.86 To demonstrate the efficacy of this multi-task learning based approach to automatic essay grading, we collect gaze behaviour for 48 ***** essays ***** across 4 essay sets, and learn gaze behaviour for the rest of the ***** essays *****, numbering over 7000 ***** essays ***** | ||
| Akkadian | 15 | |
| W19-1421 The goal was to identify dialects of Swiss German in GDI and Sumerian and ***** Akkadian ***** in CLI. | ||
| W19-1420 The Cuneiform Language Identification task in VarDial 2019 addresses the problem of identifying seven languages and dialects written in cuneiform; Sumerian and six dialects of ***** Akkadian ***** language: Old Babylonian, Middle Babylonian Peripheral, Standard Babylonian, Neo-Babylonian, Late Babylonian, and Neo-Assyrian. | ||
| L16-1642 To our best knowledge, this is the first study of this kind applied to either the ***** Akkadian ***** language or the cuneiform writing system. | ||
| 2020.lrec-1.479 In this paper we describe a general finite-state based morphological model for Babylonian, a southern dialect of the ***** Akkadian ***** language, that can achieve a coverage up to 97.3% and recall up to 93.7% on lemmatization and POS-tagging task on token level from a transcribed input. | ||
| 2020.lrec-1.433 The phonological transcription provides a linguistically appealing form to represent ***** Akkadian *****, because the transcription is normalized according to the grammatical description of a given dialect and explicitly shows the ***** Akkadian ***** renderings for Sumerian logograms | ||
| substitution | 15 | |
| W19-0423 We show that powerful contextualized word representations, which give high performance in several semantics-related tasks, deal less well with the subtle in-context similarity relationships needed for ***** substitution *****. | ||
| 2021.acl-long.385 Considering that only relying on the same position ***** substitution ***** cannot handle the variable-length correction cases, various operations such as ***** substitution *****, deletion, insertion, and local paraphrasing are required jointly. | ||
| 2020.findings-emnlp.16 To find a modification solution, we use beam search constrained by heuristic rules, and we leverage a BERT masked language model for generating ***** substitution ***** words compatible with the context. | ||
| 2006.amta-papers.3 Decoding requires a very large target-language-only corpus, and while ***** substitution ***** in target can be performed using that same corpus, ***** substitution ***** in source requires a separate (and smaller) source monolingual corpus. | ||
| D19-5552 Lexical ***** substitution ***** ranks ***** substitution ***** candidates from the viewpoint of paraphrasability for a target word in a given sentence | ||
| approximating | 15 | |
| 2021.wnut-1.1 Text simplification is the process of splitting and rephrasing a sentence to a sequence of sentences making it easier to read and understand while preserving the content and ***** approximating ***** the original meaning. | ||
| 2020.emnlp-main.447 We connect this approach to existing techniques such as SwitchOut and word dropout, and show that these techniques are all essentially ***** approximating ***** variants of a single objective. | ||
| 2021.emnlp-main.229 They argue that probing should be seen as ***** approximating ***** a mutual information. | ||
| 2021.emnlp-main.753 In this work, we argue that research efforts should be directed towards ***** approximating ***** the true output of the attention sub-layer, which includes the value vectors. | ||
| C16-1330 We show that volume and provenance are indeed important, but that ***** approximating ***** the perfect balancing of the selected training data leads to an improvement of 21 points and exceeds state-of-the-art systems by 14 points while using only simple features | ||
| Transformer encoder | 15 | |
| 2021.alta-1.22 We present a study on reducing sub-word overlap by scaling the vocabulary size in a ***** Transformer encoder ***** model while pretraining with multiple domains. | ||
| P19-1298 Experiment results show superiorities of lattice-based encoders in word-level and subword-level representations over conventional ***** Transformer encoder *****. | ||
| P19-3015 A salient feature is that NeuralClassifier currently provides a variety of text encoders, such as FastText, TextCNN, TextRNN, RCNN, VDCNN, DPCNN, DRNN, AttentiveConvNet and ***** Transformer encoder *****, etc. | ||
| 2020.loresmt-1.11 Finally, we analyze performance differences between the LSTM and ***** Transformer encoder *****s when using a Transformer decoder and find that the ***** Transformer encoder ***** is better able to handle insertions and substitutions when transliterating. | ||
| 2020.acl-main.250 We show that much efficient light BERT models can be obtained by reducing algorithmically chosen correct architecture design dimensions rather than reducing the number of ***** Transformer encoder ***** layers | ||
| MARCO | 15 | |
| 2020.aacl-main.55 The models are evaluated on the MS ***** MARCO ***** | ||
| 2020.findings-emnlp.63 Experimental results on the MS ***** MARCO ***** passage ranking task show that our ranking approach is superior to strong encoder-only models. | ||
| P18-1157 We show that this simple trick improves robustness and achieves results competitive to the state-of-the-art on the Stanford Question Answering Dataset (SQuAD), the Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension Dataset (MS ***** MARCO *****). | ||
| 2021.sustainlp-1.8 Applied to the MS ***** MARCO ***** passage and document ranking tasks, we are able to achieve the same level of effectiveness, but with up to an 18× increase in efficiency. | ||
| D17-1090 We evaluate our question generation method for the answer sentence selection task on three benchmark datasets, including SQuAD, MS ***** MARCO *****, and WikiQA | ||
| RESTful | 15 | |
| 2021.naacl-demos.12 The tool also integrates a ***** RESTful ***** API that enables integration into other software systems, including an API for machine learning integration. | ||
| 2020.acl-demos.6 Our service provides both a user-friendly interface, available at http://syntagnet.org/, and a ***** RESTful ***** endpoint to query the system programmatically (accessible at http://api.syntagnet.org/). | ||
| W18-5203 The functionality of ArguminSci is accessible via three interfaces: as a command line tool, via a ***** RESTful ***** application programming interface, and as a web application. | ||
| 2021.emnlp-demo.16 Moreover, our ***** RESTful ***** APIs enable easy integration of SPRING in downstream applications where AMR structures are needed. | ||
| 2020.acl-demos.26 A user interface for local development, remote webpage access, and a ***** RESTful ***** API are provided to make it simple for users to build their own demos | ||
| trilingual | 15 | |
| W19-3614 In this paper, we present an effort to generate a joint Urdu, Roman Urdu and English ***** trilingual ***** lexicon using automated methods. | ||
| W17-7908 We apply these methods on a ***** trilingual ***** dictionary in Fula, English and French. | ||
| L08-1314 We aim to characterize the comparability of corpora, we address this issue in the ***** trilingual ***** context through the distinction of expert and non expert documents. | ||
| L06-1146 Those two spoken languages become language barriers for deaf people and our ***** trilingual ***** dictionary will remove the barrier. | ||
| L10-1152 This article presents a new freely available *****trilingual***** corpus (Catalan, Spanish, English) that contains large portions of the Wikipedia and has been automatically enriched with linguistic information. | ||
| Likert | 15 | |
| W17-2905 As a first step, we crowd-sourced the scoring of a predefined set of topics on a ***** Likert ***** scale from non-controversial to controversial. | ||
| 2020.winlp-1.33 Through our systematic study with 40 crowdsourced workers in each task, we find that using continuous scales achieves more consistent ratings than ***** Likert ***** scale or ranking-based experiment design. | ||
| 2021.semeval-1.11 This article describes a system to predict the complexity of words for the Lexical Complexity Prediction (LCP) shared task hosted at SemEval 2021 (Task 1) with a new annotated English dataset with a ***** Likert ***** scale. | ||
| W19-8648 Rating and ***** Likert ***** scales are widely used in evaluation experiments to measure the quality of Natural Language Generation (NLG) systems. | ||
| 2020.lantern-1.4 Evaluations of image description systems are typically domain-general: generated descriptions for the held-out test images are either compared to a set of reference descriptions (using automated metrics), or rated by human judges on one or more *****Likert***** scales (for fluency, overall quality, and other quality criteria). | ||
| arbitrary | 15 | |
| W18-0506 Given a history of errors made by learners of a second language, the task is to predict errors that they are likely to make at ***** arbitrary ***** points in the future. | ||
| D18-1017 Besides, since ***** arbitrary ***** character can provide important cues when predicting entity type, we exploit self-attention to explicitly capture long range dependencies between two tokens. | ||
| 2020.emnlp-main.23 We show our classifiers are valuable for a variety of applications, like controlling for gender bias in generative models, detecting gender bias in ***** arbitrary ***** text, and classifying text as offensive based on its genderedness. | ||
| 2021.conll-1.32 The recurrent neural network (RNN) language model is a powerful tool for learning ***** arbitrary ***** sequential dependencies in language data. | ||
| 2021.emnlp-demo.14 On the way towards general Visual Question Answering (VQA) systems that are able to answer ***** arbitrary ***** questions, the need arises for evaluation beyond single-metric leaderboards for specific datasets | ||
| digitization | 15 | |
| 2020.ai4hi-1.4 A wealth of information is buried in art-historic archives which can be extracted via ***** digitization ***** and analysis. | ||
| 2021.latechclfl-1.5 Due to the amount of data created by this ***** digitization ***** process, the design of tools that enable the analysis and management of data and metadata has become a relevant topic. | ||
| 2021.bionlp-1.17 To keep pace with the increased generation and ***** digitization ***** of documents, automated methods that can improve search, discovery and mining of the vast body of literature are essential. | ||
| 2020.ai4hi-1.1 Cultural institutions such as galleries, libraries, archives and museums continue to make commitments to large scale ***** digitization ***** of collections. | ||
| 2020.lrec-1.122 Only some few of these specialized newspapers have been digitized up until now, but they are usually not well curated in terms of ***** digitization ***** quality, data formatting, completeness, redundancy (de-duplication), supply of metadata, and, hence, searchability | ||
| metonymic | 15 | |
| L16-1731 After detecting these expressions, they are interpreted as ***** metonymic ***** understanding words by using associative information. | ||
| 2020.isa-1.6 Semantic Type labelling is not only well-suited to annotate verbal polysemy, but also ***** metonymic ***** shifts in verb argument combinations, which in Generative Lexicon | ||
| L10-1574 Computational resources can facilitate our study in this field in an effective way by helping codify, translate and handle particular cases of polysemy, but also guiding in metaphorical and ***** metonymic ***** sense recognition, supported by the ontological classification of the lexical semantic entities. | ||
| 2020.coling-main.602 In this work, we carry out two experiments in order to assess the ability of BERT to capture the meaning shift associated with *****metonymic***** expressions. | ||
| P17-1115 Named entities are frequently used in a *****metonymic***** manner. | ||
| MWP | 15 | |
| 2021.emnlp-main.484 In this paper, we develop a novel ***** MWP ***** generation approach that leverages i) pre-trained language models and a context keyword selection model to improve the language quality of generated ***** MWP *****s and ii) an equation consistency constraint for math equations to improve the mathematical validity of the generated ***** MWP *****s. | ||
| 2021.emnlp-main.348 Experiments on an educational gold-standard set and a large-scale generated ***** MWP ***** set show that our approach is superior on the ***** MWP ***** generation task, and it outperforms the SOTA models in terms of both automatic evaluation metrics, i.e., BLEU-4, ROUGE-L, Self-BLEU, and human evaluation metrics, i.e., equation relevance, topic relevance, and language coherence. | ||
| 2021.emnlp-main.272 Secondly, the hierarchical reasoning encoder is presented for seamlessly integrating the word-level and sentence-level reasoning to bridge the entity and context domain on ***** MWP *****. | ||
| 2020.acl-main.92 Each ***** MWP ***** is annotated with its problem type and grade level (for indicating the level of difficulty). | ||
| 2021.naacl-main.168 To this end, we show that ***** MWP ***** solvers that do not have access to the question asked in the ***** MWP ***** can still solve a large fraction of ***** MWP *****s | ||
| translational | 15 | |
| L10-1604 We show that ILP is particularly well suited for this task in which the data can only be expressed by (***** translational ***** and syntactic) relations. | ||
| L14-1391 By examining the rates and patterns of occurrence across four genres in the NTU Multilingual Corpus, a resource may be created to aid machine translation or, going further, predict Chinese ***** translational ***** trends in any given genre. | ||
| 2020.findings-emnlp.158 Our work provides greater understanding of knowledge transfer for researchers, practitioners, and government agencies interested in encouraging ***** translational ***** research. | ||
| I17-1004 We answer the question of whether attention is only capable of modelling ***** translational ***** equivalence or whether it captures more information. | ||
| 2021.mtsummit-research.21 Word alignments identify *****translational***** correspondences between words in a parallel sentence pair and are used, for example, to train statistical machine translation, learn bilingual dictionaries, or to perform quality estimation. | ||
| orthographical | 15 | |
| L10-1408 An ***** orthographical ***** transcription is available for every utterance. | ||
| S19-1003 In this paper, we argue that, despite the successes of this assumption, it is incomplete: in addition to its context, ***** orthographical ***** or morphological aspects of words can offer clues about their meaning. | ||
| L06-1001 It is supplied, in seven separate interval tiers, with an ***** orthographical ***** transcription, detailed part-of-speech tags, simplified part-of-speech tags, a phonological transcription, a broad phonetic transcription, the pitch relation between each stressed and post-tonic syllable, the phrasal intonation, and an empty tier for comments. | ||
| L08-1319 The ***** orthographical ***** complexities of Chinese, Japanese, Korean (CJK) and Arabic pose a special challenge to developers of NLP applications. | ||
| 2020.coling-main.295 This work presents a method of word sense clustering that differentiates homonyms and merges homophones, taking Japanese as an example, where ***** orthographical ***** variation causes problems for language processing | ||
| cascade | 15 | |
| 2020.wanlp-1.16 The system is learned to predict all the annotation levels in ***** cascade *****, starting from Arabizi input. | ||
| 2021.iwslt-1.6 For offline speech translation, our best end-to-end model achieves 7.9 BLEU improvements over the benchmark on the MuST-C test set and is even approaching the results of a strong ***** cascade ***** solution. | ||
| D17-1293 In this paper, we propose a novel ***** cascade ***** model, which can capture both the latent semantics and latent similarity by modeling MOOC data. | ||
| L10-1466 We used and modified a first ***** cascade ***** to recognize named entities | ||
| 2021.acl-long.224 Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional *****cascade***** solutions. | ||
| misinformation | 15 | |
| D19-5004 Further to the efforts of reducing exposure to ***** misinformation ***** on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. | ||
| 2020.coling-main.147 Traditional fact checking by experts or crowds is increasingly difficult to keep pace with the volume of newly created ***** misinformation ***** in the Web. | ||
| 2021.nlp4if-1.18 This paper describes the TOKOFOU system, an ensemble model for ***** misinformation ***** detection tasks based on six different transformer-based pre-trained encoders, implemented in the context of the COVID-19 Infodemic Shared Task for English. | ||
| 2020.nlpcovid19-acl.9 With the goal of combating ***** misinformation *****, we designed and built Jennifer–a chatbot maintained by a global group of volunteers. | ||
| 2021.eacl-main.201 Growing concern with online ***** misinformation ***** has encouraged NLP research on fact verification | ||
| Backdoor | 15 | |
| 2021.acl-long.431 Recent researches have shown that large natural language processing (NLP) models are vulnerable to a kind of security threat called the ***** Backdoor ***** Attack. | ||
| 2021.emnlp-main.752 ***** Backdoor ***** attacks are a kind of emergent training-time threat to deep neural networks (DNNs). | ||
| 2021.acl-long.37 ***** Backdoor ***** attacks are a kind of insidious security threat against machine learning models. | ||
| 2021.emnlp-main.659 *****Backdoor***** attacks, which maliciously control a well-trained model's outputs of the instances with specific triggers, are recently shown to be serious threats to the safety of reusing deep neural networks (DNNs). | ||
| 2021.naacl-main.165 Recent studies have revealed a security threat to natural language processing (NLP) models, called the *****Backdoor***** Attack. | ||
| pre | 15 | |
| 2021.wnut-1.47 We show that a character-based model trained on only 99k sentences of NArabizi and fine-tuned on a small treebank of this language leads to performance close to those obtained with the same architecture ***** pre *****- trained on large multilingual and monolingual models. | ||
| S19-2232 We implemented a deep-affix based LSTM-CRF NER model for task 1, which utilizes only character, word, ***** pre *****- fix and suffix information for the identification of geolocation entities. | ||
| 2021.semeval-1.172 It ***** pre *****sents ***** pre ***** and post processing techniques, variable threshold learning, meta learning and Ensemble approach to solve various sub-tasks that were part of the challenge. | ||
| 1998.amta-papers.31 This paper ***** pre *****sents our analysis of NTWs and uses these results to argue that in addition to lexicon enhancement, MT systems could benefit from more sophisticated ***** pre *****- and postprocessing of real-world documents in order to weed out such NTWs | ||
| 2020.findings-emnlp.302 We introduce two effective models for duration prediction, which incorporate external knowledge by reading temporal-related news sentences (time-aware *****pre*****-training). | ||
| filter | 15 | |
| R17-1093 The sampling is applied in order to determine the strength of morphological relationships between words, ***** filter ***** out accidental similarities and reduce the set of rules necessary to explain the data. | ||
| 2021.emnlp-main.282 F3 connects models with our ***** filter ***** mechanism to ***** filter ***** out the last model's unchanged fix to the next. | ||
| 2014.iwslt-evaluation.16 Compared with our system used in last year, we added additional subsystems based on deep neural network modeling on ***** filter ***** bank feature and convolutional deep neural network modeling on ***** filter ***** bank feature with tonal features. | ||
| D19-1253 We also propose two empirically effective strategies, a data ***** filter ***** and mixing mini-batch training, to properly use the QG-generated data for QA. | ||
| 2020.findings-emnlp.127 In this method, a context ***** filter ***** and a knowledge ***** filter ***** are first built, which derive knowledge-aware context representations and context-aware knowledge representations respectively by global and bidirectional attention | ||
| microblogging | 15 | |
| 2021.semeval-1.135 The upsurge of prolific blogging and ***** microblogging ***** platforms enabled the abusers to spread negativity and threats greater than ever. | ||
| L16-1696 A significant portion of data generated on blogging and ***** microblogging ***** websites is non-credible as shown in many recent studies. | ||
| L10-1263 Because ***** microblogging ***** has appeared relatively recently, there are a few research works that were devoted to this topic. | ||
| R19-1046 Every day, the emotion and opinion of different people across the world are reflected in the form of short messages using *****microblogging***** platforms. | ||
| W19-1301 Social media sites like Facebook, Twitter, and other *****microblogging***** forums have emerged as a platform for people to express their opinions and views on different issues and events. | ||
| geometric | 15 | |
| W19-8668 However, words such as spatial relations (e.g. next to and under) are not directly referring to ***** geometric ***** arrangements of pixels but to complex ***** geometric ***** and conceptual representations. | ||
| P19-1655 The actual grounding can connect language to the environment through multiple modalities, e.g. “stop at the door” might ground into visual objects, while “turn right” might rely only on the ***** geometric ***** structure of a route. | ||
| 2020.acl-main.276 We propose a novel manifold based ***** geometric ***** approach for learning unsupervised alignment of word embeddings between the source and the target languages. | ||
| 2020.findings-emnlp.280 Afterwards, we introduce a Counterfactual Generation to convert the gender information of words, so the original and the modified embeddings can produce a gender-neutralized word embedding after ***** geometric ***** alignment regularization, without loss of semantic information. | ||
| P18-1012 Despite this popularity and effectiveness of KG embeddings in various tasks (e.g., link prediction), ***** geometric ***** understanding of such embeddings (i.e., arrangement of entity and relation vectors in vector space) is unexplored – we fill this gap in the paper | ||
| multiparty | 15 | |
| L14-1641 The corpus is targeted and designed towards the development of a dialogue system platform to explore verbal and nonverbal tutoring strategies in ***** multiparty ***** spoken interactions. | ||
| L06-1503 We present an annotation scheme for emotionally relevant behavior at the speaker contribution level in ***** multiparty ***** conversation. | ||
| L16-1037 The system is designed to simulate ***** multiparty ***** conversation, expecting implicit learning and enhancement of predictability of learners' utterance through an alignment similar to “interactive alignment”, which is observed in human-human conversation. | ||
| L14-1062 This paper presents the first release of the KiezDeutsch Korpus (KiDKo), a new language resource with ***** multiparty ***** spoken dialogues of Kiezdeutsch, a newly emerging language variety spoken by adolescents from multiethnic urban areas in Germany | ||
| S18-1007 Character identification is a task of entity linking that finds the global entity of each personal mention in ***** multiparty ***** dialogue. | ||
| passages | 15 | |
| D19-1599 To tackle this issue, we propose a multi-passage BERT model to globally normalize answer scores across all ***** passages ***** of the same question, and this change enables our QA model find better answers by utilizing more ***** passages *****. | ||
| 2020.sustainlp-1.9 To reduce this cost, we propose the use of adaptive computation to control the computational budget allocated for the ***** passages ***** to be read. | ||
| N18-1195 Using a case study, we show that variation in oral reading rate across ***** passages ***** for professional narrators is consistent across readers and much of it can be explained using features of the texts being read. | ||
| P17-1123 We study automatic question generation for sentences from text ***** passages ***** in reading comprehension. | ||
| 2020.latechclfl-1.7 The Vectorian works like a search engine, i.e. a Shakespeare phrase can be entered as a query, the underlying collection of fiction books is then searched for the phrase and the ***** passages ***** that are likely to contain the phrase, either verbatim or as a paraphrase, are presented in a ranked results list | ||
| gated recurrent | 15 | |
| S18-1147 The evaluated models include convolutional neural network, long-short term memory network, ***** gated recurrent ***** unit and recurrent convolutional neural network. | ||
| Q19-1008 Stacking long short-term memory (LSTM) cells or ***** gated recurrent ***** units (GRUs) as part of a recurrent neural network (RNN) has become a standard approach to solving a number of tasks ranging from language modeling to text summarization. | ||
| 2021.teachingnlp-1.9 The lecture departs from count-based statistical methods and spans up to ***** gated recurrent ***** networks and attention, which is ubiquitous in today's NLP. | ||
| D18-1237 In detail, our method has two novel aspects: (1) an advanced memory-augmented architecture and (2) an expanded ***** gated recurrent ***** unit with dense connections that mitigate potential information distortion occurring in the memory. | ||
| W18-1104 Using a hybrid supervision method that exploits first person emotion seeds, we show how we can acquire promising results with a deep ***** gated recurrent ***** neural network | ||
| specialized | 15 | |
| S18-1127 Ultimately, the CNN model class proved most performant, so we ***** specialized ***** to this model for our final submissions. | ||
| 2018.gwc-1.50 We describe preliminary work in the creation of the first ***** specialized ***** vocabulary to be integrated into the Open Multilingual Wordnet (OMW). | ||
| L16-1505 In addition, ***** specialized ***** knowledge is used for processing medical roots and affixes, ontological relations and concept mapping, and for generating lay variants of terms according to the patient's non-expert discourse. | ||
| 2021.acl-long.125 We firstly design a topic-augmented language model (LM) with an additional layer ***** specialized ***** for topic detection. | ||
| L06-1485 Additionally, we present some specific annotation and research tasks for which NOMOS has been ***** specialized ***** and used, including topic segmentation and decision-point annotation of meetings. | ||
| setting | 15 | |
| 2020.lrec-1.74 We highlight how thinking aloud affects interpretation of dialogue acts in our ***** setting ***** and how to best capture that information. | ||
| 2005.mtsummit-wpt.9 The workflow consists of the stage for ***** setting ***** lexical goals and the semi- automatic terminology construction stage. | ||
| 2021.inlg-1.11 Because existing datasets do not have such alignments of data in multiple modalities, this ***** setting ***** has not been explored in depth. | ||
| N18-2084 We show that such embeddings can be surprisingly effective in some cases – providing gains of up to 20 BLEU points in the most favorable ***** setting *****. | ||
| 2019.iwslt-1.26 We study here a related ***** setting *****, multi-domain adaptation, where the number of domains is potentially large and adapting separately to each domain would waste training resources. | ||
| semantic structures | 15 | |
| 2021.emnlp-main.641 The case studies verify that the generated mind-maps better reveal the underlying ***** semantic structures ***** of the document. | ||
| L16-1574 This paper introduces a toolkit used for the purpose of detecting replacements of different grammatical and ***** semantic structures ***** in ongoing text production logged as a chronological series of computer interaction events (so-called keystroke logs). | ||
| D19-1030 The identification of complex ***** semantic structures ***** such as events and entity relations, already a challenging Information Extraction task, is doubly difficult from sources written in under-resourced and under-annotated languages. | ||
| 2021.deelio-1.1 Through visualization, we demonstrate the hierarchical ***** semantic structures ***** captured by the transformer factors, e.g., word-level polysemy disambiguation, sentence-level pattern formation, and long-range dependency. | ||
| 2021.naacl-main.125 In this paper, we present a heterogeneous graph-based model to incorporate syntactic and ***** semantic structures ***** of sentences. | ||
| strategies | 15 | |
| 2021.hackashop-1.19 In the 2021 Embeddia Hackathon, we implemented one novel, normative theory-based evaluation metric, “activation”, and use it to compare two recommendation ***** strategies ***** of New York Times comments, one based on user likes and another on editor picks. | ||
| 2001.mtsummit-teach.7 It is a common misconception to say that machine translation programs translate word-for-word, but real systems follow ***** strategies ***** which are much more complex. | ||
| D19-6203 Unfortunately, the models in the literature tend to employ different ***** strategies ***** to perform pooling for RE, leading to the challenge to determine the best pooling mechanism for this problem, especially in the biomedical domain. | ||
| L10-1351 We describe an experimental Wizard-of-Oz setup for the integration of emotional ***** strategies ***** into spoken dialogue management. | ||
| L10-1626 Another point is the different ***** strategies ***** of compression used according to the length of the sentence. | ||
| formal semantics | 15 | |
| 1993.iwpt-1.13 Advantages of the bunch concept are illustrated by using it in descriptions of a ***** formal semantics ***** for context-free grammars and of functional parsing algorithms. | ||
| L08-1398 We consider an ontology to be a semiotic object and we identify three main types of semiotic ontology evaluation levels: the structural level, assessing the ontology syntax and ***** formal semantics *****; the functional level, assessing the ontology cognitive semantics and; the usability-related level, assessing the ontology pragmatics. | ||
| W19-8616 A prominent strand of work in ***** formal semantics ***** investigates the ways in which human languages quantify over the elements of a set, as when we say “All A are B ”, “All except two A are B ”, “Only a few of the A are B ” and so on. | ||
| 2020.pam-1.8 We present a ***** formal semantics ***** (a version of Type Theory with Records) which places classifiers of perceptual information at the core of semantics. | ||
| L10-1270 Ontology-based semantic annotation aims at putting fragments of a text in correspondence with proper elements of an ontology such that the ***** formal semantics ***** encoded by the ontology can be exploited to represent text interpretation. | ||
| discussion forums | 15 | |
| Q15-1006 Online ***** discussion forums ***** and community question-answering websites provide one of the primary avenues for online users to share information. | ||
| 2021.wassa-1.4 We show the effectiveness and interpretability of our approach by achieving state-of-the-art results on datasets from social networking platforms, online ***** discussion forums *****, and political dialogues. | ||
| 2020.argmining-1.11 Annotators were asked to annotate spans of argumentation in 9 threads from two ***** discussion forums *****. | ||
| D19-1291 We propose a computational model for argument mining in online persuasive ***** discussion forums ***** that brings together the micro-level (argument as product) and macro-level (argument as process) models of argumentation. | ||
| D19-1675 Controversial claims are abundant in online media and ***** discussion forums *****. | ||
| agents | 15 | |
| 2020.coling-main.96 We introduce Situated Interactive MultiModal Conversations (SIMMC) as a new direction aimed at training ***** agents ***** that take multimodal actions grounded in a co-evolving multimodal input context in addition to the dialog history. | ||
| I17-3015 The proposed framework provides a pioneering example of on-demand knowledge validation in dialog environment to address such needs in AI ***** agents *****/chatbots. | ||
| 2020.challengehml-1.7 To this end, understanding passenger intents from spoken interactions and vehicle vision systems is an important building block for developing contextual and visually grounded conversational ***** agents ***** for AV. | ||
| 2020.ecomnlp-1.4 We propose a novel way of conversational recommendation, where instead of asking questions to the user to acquire their preferences; the recommender tracks their conversation with other people, including customer support ***** agents ***** (CSA), and joins the conversation only when it is time to introduce a recommendation. | ||
| L14-1668 We have made them accessible on the Web both for humans (via a Web interface) and software ***** agents ***** (with a SPARQL endpoint). | ||
| span identification | 15 | |
| 2021.semeval-1.65 The competition is composed of five subtasks that build on top of each other: (1) quantity ***** span identification *****, (2) unit extraction from the identified quantities and their value modifier classification, (3) ***** span identification ***** for measured entities and measured properties, (4) qualifier ***** span identification *****, and (5) relation extraction between the identified quantities, measured entities, measured properties, and qualifiers. | ||
| 2021.semeval-1.177 This paper presents our system for the Quantity ***** span identification *****, Unit of measurement identification and Value modifier classification subtasks of the MeasEval 2021 task. | ||
| 2021.semeval-1.151 As for subtask 2, we propose a system that consolidates a ***** span identification ***** model and a multi-label classification model based on pre-trained BERT. | ||
| 2021.semeval-1.124 We evaluated several pre-trained language models using various ensemble techniques for toxic ***** span identification ***** and achieved sizable improvements over our baseline fine-tuned BERT models. | ||
| 2020.semeval-1.196 We participate in both the ***** span identification ***** and technique classification subtasks and report on experiments using different BERT-based models along with handcrafted features. | ||
| discourse parser | 15 | |
| 2020.coling-main.16 Experiments show that our framework using sentiment-related discourse augmentations for sentiment prediction enhances the overall performance for long documents, even beyond previous approaches using well-established ***** discourse parser *****s trained on human annotated data. | ||
| P19-1410 Our framework comprises a discourse segmenter to identify the elementary discourse units (EDU) in a text, and a ***** discourse parser ***** that constructs a discourse tree in a top-down fashion. | ||
| 2020.codi-1.17 We present preliminary results on investigating the benefits of coreference resolution features for neural RST discourse parsing by considering different levels of coupling of the ***** discourse parser ***** with the coreference resolver. | ||
| D17-1136 We evaluate all these parsers with the standard Parseval procedure to provide a more accurate picture of the actual RST ***** discourse parser *****s performance in standard evaluation settings. | ||
| 2021.ranlp-1.143 We successfully applied state-of-the-art ***** discourse parser *****s and machine learning models to reconstruct argument graphs with the identified and classified discourse units as nodes and relations between them as edges. | ||
| conference | 15 | |
| P19-1038 Nowadays, firm CEOs communicate information not only verbally through press releases and financial reports, but also nonverbally through investor meetings and earnings ***** conference ***** calls. | ||
| 2020.acl-main.560 In this paper we look at the relation between the types of languages, resources, and their representation in NLP ***** conference *****s to understand the trajectory that different languages have followed over time. | ||
| 1997.iwpt-1.3 Providing machines with the ability to interpret, generate, and support interaction with multimedia artifacts (e.g., documents, broadcasts, hypermedia) will be a valuable facility for a number of key applications such as videotele***** conference ***** archiving, custom on-line news, and briefing assistants. | ||
| L16-1004 Some further analysis is proposed, with findings to be presented at the ***** conference *****. | ||
| 1991.mtsummit-papers.18 The overall system performance on a corpus of ***** conference ***** registration conversations is 87%. | ||
| high accuracy | 15 | |
| 2021.emnlp-main.705 Models achieving ***** high accuracy ***** during training perform poorly on the evaluation set, with a large gap between human performance. | ||
| D18-1009 Empirical results demonstrate that while humans can solve the resulting inference problems with ***** high accuracy ***** (88%), various competitive models struggle on our task. | ||
| 2020.aacl-srw.19 In this work, we introduce a GRU-based architecture called GRUBERT that learns to map the different BERT hidden layers to fused embeddings with the aim of achieving ***** high accuracy ***** on the Twitter sentiment analysis task. | ||
| N18-2015 Previous work has shown that ignoring the training set and training a model on the validation set can achieve ***** high accuracy ***** on this task due to stylistic differences between the story endings in the training set and validation and test sets. | ||
| L12-1468 Our experiments with the dataset have allowed us to reach very ***** high accuracy ***** in different phases of query analysis, especially when adopting machine learning methods. | ||
| dialog generation | 15 | |
| 2020.coling-main.362 To conquer these limitations, we propose a Dual Dynamic Memory Network (DDMN) for multi-turn ***** dialog generation *****, which maintains two core components: dialog memory manager and KB memory manager. | ||
| 2021.acl-demo.28 We pre-train a cross-lingual generation model ProphetNet-Multi, a Chinese generation model ProphetNet-Zh, two open-domain ***** dialog generation ***** models ProphetNet-Dialog-En and ProphetNet-Dialog-Zh. | ||
| W18-5001 This algorithm can learn a cross-domain embedding space that models the semantics of dialog responses which in turn, enables a neural ***** dialog generation ***** model to generalize to new domains. | ||
| 2020.emnlp-main.736 Despite the success of existing referenced metrics (e.g., BLEU and MoverScore), they correlate poorly with human judgments for open-ended text generation including story or ***** dialog generation ***** because of the notorious one-to-many issue: there are many plausible outputs for the same input, which may differ substantially in literal or semantics from the limited number of given references. | ||
| D19-1189 Experimental results demonstrate that our proposed model has significant advantages over the baselines in both the evaluation of ***** dialog generation ***** and recommendation. | ||
| linguistic typology | 15 | |
| 2020.emnlp-main.187 Sparse language vectors from ***** linguistic typology ***** databases and learned embeddings from tasks like multilingual machine translation have been investigated in isolation, without analysing how they could benefit from each other's language characterisation. | ||
| N18-1004 The field of ***** linguistic typology ***** seeks to answer these questions and, thereby, divine the mechanisms that underlie human language. | ||
| 2020.emnlp-main.180 It also allows for an easy but effective integration of existing ***** linguistic typology ***** features into the parsing network. | ||
| 2020.emnlp-main.71 Our work highlights the utility of deep contextualized models in ***** linguistic typology *****. | ||
| K18-2026 We jointly train models when two languages are similar according to ***** linguistic typology ***** and then ensemble the models using a simple re-parse algorithm. | ||
| natural language data | 15 | |
| 2021.naacl-tutorials.6 In this tutorial, we present a portion of unique industry experience in efficient ***** natural language data ***** annotation via crowdsourcing shared by both leading researchers and engineers from Yandex. | ||
| 2020.coling-main.129 Since ***** natural language data *****sets have nested dependencies of bounded depth, this may help explain why they perform well in modeling hierarchical dependencies in ***** natural language data ***** despite prior works indicating poor generalization performance on Dyck languages. | ||
| 2020.emnlp-main.643 However, vocal cues in the speech of company executives present an underexplored rich source of ***** natural language data ***** for estimating financial risk. | ||
| E17-1058 Our approach can be understood as a ***** natural language data *****base, in that questions about KB entities are answered by attending to textual or database evidence. | ||
| L12-1028 To bridge the cognitive and affective gap between word-level ***** natural language data ***** and the concept-level sentiments conveyed by them, affective common sense knowledge is needed. | ||
| participants | 15 | |
| L16-1741 Previously, a seniors' speech corpus named S-JNAS was developed, but the average age of the ***** participants ***** was 67.6 years, while the target age for nursing home care is around 75 years old, much higher than that of the S-JNAS samples. | ||
| W17-3105 Our results indicate that, overall, research ***** participants ***** were enthusiastic about the possibility of using social media (in conjunction with automated Natural Language Processing algorithms) for mood tracking under the supervision of a mental health practitioner. | ||
| R19-1089 Based on over two hundred ***** participants *****, the survey results confirm earlier observations, that successful reproducibility requires more than having access to code and data. | ||
| 2020.cogalex-1.16 We collect data from a behavioral, web-based experiment, in which ***** participants ***** provide judgments of the relatedness of multiple WordNet senses of a word in a two-dimensional spatial arrangement task. | ||
| W19-5305 Further, we get the best BLEU scores in the directions of English to Gujarati and Lithuanian to English (28.2 and 36.3 respectively) among all the ***** participants *****. | ||
| learning systems | 15 | |
| R19-1002 We also investigate various machine ***** learning systems ***** and features and evaluate their performance on the newly generated dataset. | ||
| 2020.coling-main.235 We propose a novel Bi-directional Cognitive Knowledge Framework (BCKF) for reading comprehension from the perspective of complementary ***** learning systems ***** theory. | ||
| 2020.bea-1.8 In this paper, we show how a deep-learning based system can outperform feature-based machine ***** learning systems *****, as well as a string kernel system in scoring essay traits. | ||
| K17-3001 The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their ***** learning systems ***** on the same data sets. | ||
| 2021.acl-short.59 Out-of-domain evaluation across 12 genres shows nearly 15-20% degradation for both deterministic and deep ***** learning systems *****, indicating a lack of generalizability or covert overfitting in existing coreference resolution models. | ||
| deep convolutional neural | 15 | |
| W17-5031 We present a very simple model for text quality assessment based on a ***** deep convolutional neural ***** network, where the only supervision required is one corpus of user-generated text of varying quality, and one contrasting text corpus of consistently high quality. | ||
| Q17-1002 The second is visually grounded, using ***** deep convolutional neural ***** networks trained on Google Images. | ||
| D19-1114 In addition, we introduce residual connections to the ***** deep convolutional neural ***** network component of the model. | ||
| P17-1052 This paper proposes a low-complexity word-level ***** deep convolutional neural ***** network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. | ||
| 2020.clinicalnlp-1.24 In this study, we proposed a novel multi-channel ***** deep convolutional neural ***** network architecture, namely Quest-CNN, for the purpose of separating real questions that expect an answer (information or help) about an issue from sentences that are not questions, as well as from questions referring to an issue mentioned in a nearby sentence (e.g., can you clarify this? | ||
| sentence similarity | 15 | |
| L16-1452 To our knowledge, this is the first study that investigates using dependency tree based ***** sentence similarity ***** for multi-document summarization. | ||
| 2020.ngt-1.6 Existing manually annotated paraphrase datasets for Russian are limited to small-sized ParaPhraser corpus and ParaPlag which are suitable for a set of NLP tasks, such as paraphrase and plagiarism detection, ***** sentence similarity ***** and relatedness estimation, etc. | ||
| P18-1142 Our results show the effectiveness of this method for both machine translation and cross-lingual ***** sentence similarity *****, demonstrating the importance of syntactic structure compatibility for boosting cross-lingual transfer in NLP. | ||
| D17-1029 Finally, the proposed architecture is applied to various sentence composition models, which achieves substantial performance gains over baseline models on ***** sentence similarity ***** task. | ||
| 2021.emnlp-main.612 Our meaning embedding allows efficient cross-lingual ***** sentence similarity ***** estimation by simple cosine similarity calculation. | ||
| contrastive learning | 15 | |
| 2021.acl-short.29 Based on Vision-and-Language BERT, we train UMIC to discriminate negative captions via ***** contrastive learning *****. | ||
| 2020.emnlp-main.265 Therefore, we introduce a novel self-supervised ***** contrastive learning ***** mechanism to learn the relationship between original samples, factual samples and counterfactual samples. | ||
| 2021.argmining-1.19 One component employs ***** contrastive learning ***** via a siamese neural network for matching arguments to key points; the other is a graph-based extractive summarization model for generating key points. | ||
| 2021.wmt-1.121 In this paper, we propose CorefCL, a novel data augmentation and ***** contrastive learning ***** scheme based on coreference between the source and contextual sentences. | ||
| 2021.emnlp-main.204 In this paper, we introduce a novel approach based on ***** contrastive learning ***** that learns better representations by exploiting relation label information. | ||
| neural language generation | 15 | |
| W19-3405 However, typical ***** neural language generation ***** approaches to event-to-sentence can ignore the event details and produce grammatically-correct but semantically-unrelated sentences. | ||
| 2021.sigdial-1.43 Different lines of work in ***** neural language generation ***** investigated decoding methods for generating more diverse utterances, or increasing the informativity through pragmatic reasoning. | ||
| N19-1377 We consider ***** neural language generation ***** under a novel problem setting: generating the words of a sentence according to the order of their first appearance in its lexicalized PCFG parse tree, in a depth-first, left-to-right manner. | ||
| P19-1256 Recent ***** neural language generation ***** systems often hallucinate contents (i.e., producing irrelevant or contradicted facts), especially when trained on loosely corresponding pairs of the input structure and text. | ||
| W17-4914 We focus on a specific aspect of ***** neural language generation *****: its ability to reproduce authorial writing styles. | ||
| imitation learning | 15 | |
| P18-1174 We introduce a method that learns an AL “policy” using “***** imitation learning *****” (IL). | ||
| 2021.eacl-main.233 Next, we perform a coupled scheduled sampling to effectively mitigate the exposure bias when learning both policies jointly with ***** imitation learning *****. | ||
| N18-1187 Applying reinforcement learning with user feedback after the ***** imitation learning ***** stage further improves the agent's capability in successfully completing a task. | ||
| D19-1619 Although both reinforcement learning (RL) and ***** imitation learning ***** (IL) have been widely used to alleviate the bias, the lack of direct comparison leads to only a partial image on their benefits. | ||
| P19-1125 In this paper, we propose an ***** imitation learning ***** framework for non-autoregressive machine translation, which still enjoys the fast translation speed but gives comparable translation performance compared to its auto-regressive counterpart. | ||
| discriminative training | 15 | |
| 2021.emnlp-main.561 Recent works have shown that ***** discriminative training ***** results in models that exploit these underlying biases to achieve a better held-out performance, without learning the right way to reason. | ||
| K17-1011 Pairwise ranking methods are the most widely used ***** discriminative training ***** approaches for structure prediction problems in natural language processing (NLP). | ||
| 2012.amta-papers.14 In particular, large-margin structured prediction methods for ***** discriminative training ***** of feature weights, such as the structured perceptron or MIRA, have started to match or exceed the performance of existing methods such as MERT. | ||
| 2013.iwslt-evaluation.18 The system is constructed by applying several techniques, notably, subspace Gaussian mixture models, speaker adaptation, ***** discriminative training *****, system combination and SOUL, a neural network language model. | ||
| D19-1427 This setup lends itself to a ***** discriminative training ***** approach, which we demonstrate to work better than generative language modeling. | ||
| endangered languages | 15 | |
| 2021.sigtyp-1.12 For many low-resource and ***** endangered languages *****, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. | ||
| 2020.rail-1.1 The ǂKhomani San, Hugh Brody Collection features the voices and history of indigenous hunter gatherer descendants in three ***** endangered languages ***** namely, N|uu, Kora and Khoekhoe as well as a regional dialect of Afrikaans. | ||
| 2020.emnlp-main.478 We create a benchmark dataset of transcriptions for scanned books in three critically ***** endangered languages ***** and present a systematic analysis of how general-purpose OCR tools are not robust to the data-scarce setting of ***** endangered languages *****. | ||
| L12-1521 The RELISH project promotes language-oriented research by addressing a two-pronged problem: (1) the lack of harmonization between digital standards for lexical information in Europe and America, and (2) the lack of interoperability among existing lexicons of ***** endangered languages *****, in particular those created with the Shoebox/Toolbox lexicon building software. | ||
| L10-1234 This is particularly true of ***** endangered languages *****. | ||
| automatically identifying | 15 | |
| 2020.nuse-1.3 Building on prior work, we demonstrate an improved approach to ***** automatically identifying ***** the discourse function of paragraphs in news articles. | ||
| W17-5007 Native Language Identification (NLI) is the task of ***** automatically identifying ***** the native language (L1) of an individual based on their language production in a learned language. | ||
| R17-1037 In the context of investigative journalism, we address the problem of ***** automatically identifying ***** which claims in a given document are most worthy and should be prioritized for fact-checking. | ||
| P19-2013 The paper sets out a research plan for the first steps at ***** automatically identifying ***** and predicting consensus in a corpus of German language debates on hydraulic fracking. | ||
| W18-5210 This paper goes a step further by addressing the task of ***** automatically identifying ***** reasoning patterns of arguments using predefined templates, which is called argument template (AT) instantiation. | ||
| sign language corpus | 15 | |
| 2020.lrec-1.737 Our results will be implemented as a suggestion system for ***** sign language corpus ***** annotation. | ||
| L16-1526 For publishing ***** sign language corpus ***** data on the web, anonymization is crucial even if it is impossible to hide the visual appearance of the signers: In a small community, even vague references to third persons may be enough to identify those persons. | ||
| L08-1471 This paper discusses the design, recording and preprocessing of a Czech ***** sign language corpus *****. | ||
| L14-1472 This paper introduces the RWTH-PHOENIX-Weather 2014, a video-based, large vocabulary, German ***** sign language corpus ***** which has been extended over the last two years, tripling the size of the original corpus. | ||
| L08-1470 We therefore present the ATIS ***** sign language corpus ***** that is based on the domain of air travel information. | ||
| word prediction | 15 | |
| Q15-1031 We define training and evaluation paradigms for the task of surface ***** word prediction *****, and report results on subsets of 7 languages. | ||
| 2021.acl-long.514 Finally, in a study where users had to dwell for a second on each key, sentence abbreviated input was competitive with a conventional keyboard with ***** word prediction *****s. | ||
| 2021.emnlp-main.557 This paper studies the effect of using six different number encoders on the task of masked ***** word prediction ***** (MWP), as a proxy for evaluating literacy. | ||
| 2020.conll-1.49 Here we evaluate several state-of-the-art language models for their match to human next-***** word prediction *****s and to reading time behavior from eye movements. | ||
| 2021.cmcl-1.25 Our results indicate that the transformer models are better at capturing semantic knowledge relating to lexical concepts, both during ***** word prediction ***** and when retention is required. | ||
| continuous speech | 15 | |
| L06-1085 It will be used to improve the performance of large vocabulary ***** continuous speech ***** recogniser for non-native speakers. | ||
| L06-1014 In this paper building statistical language models for Persian language using a corpus and incorporating them in Persian ***** continuous speech ***** recognition (CSR) system are described. | ||
| L16-1738 The TYPALOC corpus is constituted of a selection of 28 dysarthric patients (three different pathologies) and of 12 healthy control speakers recorded while reading the same text and in a more natural ***** continuous speech ***** condition. | ||
| L08-1506 Many researches including large vocabulary ***** continuous speech ***** recognition and extraction of important sentences against lecture contents are necessary in order to realize the above system. | ||
| P17-1047 Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the *****continuous speech***** signal and grounding them to semantically relevant image regions. | ||
| noun phrase | 15 | |
| L12-1614 ***** noun phrase *****s, verb phrases, adjectival phrases, etc.) | ||
| D19-1036 Open Information Extraction (OpenIE) methods are effective at extracting (***** noun phrase *****, relation phrase, ***** noun phrase *****) triples from text, e.g., (Barack Obama, took birth in, Honolulu). | ||
| P18-1009 This formulation allows us to use a new type of distant supervision at large scale: head words, which indicate the type of the ***** noun phrase *****s they appear in. | ||
| L16-1555 Verbs and ***** noun phrase *****s are annotated with event and participant types, respectively. | ||
| 2000.iwpt-1.38 This paper describes a rule-based method for partial parsing, particularly for *****noun phrase***** recognition, which has been used in the development of a noun phrase recognizer for Modern Greek. | ||
| manual annotation | 15 | |
| L06-1278 We describe a corpus of TV interviews and the ***** manual annotation *****s that have been defined. | ||
| 2021.naacl-main.159 Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for ***** manual annotation ***** based on their estimated utility in training the given model. | ||
| 2020.coling-main.357 Beyond the prediction task, we use the classifier results as a source for a ***** manual annotation ***** step in order to identify new, unseen instances of each alternation. | ||
| W18-4924 Our results show that synthetic methods can be effective at significantly reducing parsing errors for a target domain without having to invest large resources on ***** manual annotation *****; and the combination of manual and synthetic methods is our best domain-independent performer. | ||
| L10-1160 The ***** manual annotation ***** of speech material is performed with the LombardSpeechLabel tool developed at the University of Maribor. | ||
| coordination | 15 | |
| 1998.amta-papers.15 Such problems include ambiguous attachment of participles, ambiguous scope in ***** coordination *****, and ambiguous attachment of the agent phrase for double passives. | ||
| 2021.law-1.9 Similarly to the converter for English (Schuster and Manning, 2016), we develop a rule-based system for deriving enhanced dependencies from the basic layer, covering three linguistic phenomena: relative clauses, ***** coordination *****, and raising/control. | ||
| L10-1206 This study examines the relationship between two kinds of semantic spaces ― i.e., spaces based on term frequency (tf) and word cooccurrence frequency (co) ― and four semantic relations ― i.e., synonymy, ***** coordination *****, superordination, and collocation ― by comparing, for each semantic relation, the performance of two semantic spaces in predicting word association. | ||
| 2021.cmcl-1.3 CCG has well-defined incremental parsing algorithms, surface compositional semantics, and can explain long-range dependencies as well as complicated cases of ***** coordination *****. | ||
| N19-1343 We propose a simple and accurate model for ***** coordination ***** boundary identification. | ||
| biomedical text mining | 15 | |
| E17-1109 Entity extraction is one of the fundamental components for ***** biomedical text mining *****. | ||
| D19-5701 These results indicate that there is a real interest in promoting ***** biomedical text mining ***** efforts beyond English. | ||
| L12-1492 In this publication, we announce the release of the final CALBC corpora which include the silver standard corpus in different versions and several gold standard corpora for the further usage of the ***** biomedical text mining ***** community. | ||
| D19-5729 BioNLP Open Shared Tasks (BioNLP-OST) is an international competition organized to facilitate development and sharing of computational tasks of ***** biomedical text mining ***** and solutions to them. | ||
| 2020.acl-main.335 Biomedical named entities often play important roles in many *****biomedical text mining***** tools. | ||
| opinions | 15 | |
| 2021.eacl-main.229 Since the number of reviews for each target can be prohibitively large, neural network-based methods follow a two-stage approach where an extractive step first pre-selects a subset of salient ***** opinions ***** and an abstractive step creates the summary while conditioning on the extracted subset. | ||
| D19-1342 If a real-world sentiment classification system ignores the existence of conflict ***** opinions ***** when it is designed, it will incorrectly mix conflict ***** opinions ***** into other sentiment polarity categories in action. | ||
| 2020.lrec-1.611 In this paper we present the Vaccination Corpus, a corpus of texts related to the online vaccination debate that has been annotated with three layers of information about perspectives: attribution, claims and ***** opinions *****. | ||
| C16-2027 We present an opinion retrieval system that retrieves subjective and query-relevant tweets from Twitter, which is a useful source of obtaining real-time ***** opinions *****. | ||
| 2021.eacl-main.170 In this study, we design a directed syntactic dependency graph based on a dependency tree to establish a path from the target to candidate ***** opinions *****. | ||
| posts | 15 | |
| 2020.wnut-1.52 Increasing usage of social media presents new non-traditional avenues for monitoring disease outbreaks, virus transmissions and disease progressions through user ***** posts ***** describing test results or disease symptoms. | ||
| W18-5912 Our system ranks 2nd in the second shared task: Automatic classification of ***** posts ***** describing medication intake. | ||
| P18-1185 In this paper, we explore the task of name tagging in multimodal social media ***** posts *****. | ||
| W19-3403 We develop an unsupervised pipeline to extract schemas and apply our method to Reddit ***** posts ***** to detect schematic structures that are characteristic of different subreddits. | ||
| 2020.semeval-1.212 Our results for identifying offensive ***** posts ***** (Task A) yielded satisfactory accuracy of 0.92 for English, 0.81 for Danish, 0.84 for Turkish, 0.85 for Greek, and 0.89 for Arabic. | ||
| neural attention | 15 | |
| D18-1069 To alleviate this problem, this paper proposes a hybrid ***** neural attention ***** model which combines self and cross attention mechanism to locate salient part from textual context and interaction between users. | ||
| C16-1027 Our model firstly encode the source sentence with a bidirectional Long Short-Term Memory (BI-LSTM) and then use the ***** neural attention ***** as a pointer to select an ordered sub sequence of the input as the output. | ||
| 2021.naacl-main.225 Where past work has used word-level alignments, we focus on spans; borrowing ideas from phrase-based machine translation, we align subtrees in semantic parses to spans of input sentences, and encourage ***** neural attention ***** mechanisms to mimic these alignments. | ||
| 2020.acl-main.641 The ***** neural attention ***** model has achieved great success in data-to-text generation tasks. | ||
| E17-1035 We show that our methods achieve significant improvement over a baseline ***** neural attention ***** model and our results are also competitive against state-of-the-art systems that do not use extra linguistic resources. | ||
| children | 15 | |
| 2020.winlp-1.6 As a result, SIMPLEX-PB 2.0 features much more reliable and numerous candidate substitutions to complex words, as well as word complexity rankings produced by a group of underprivileged ***** children *****. | ||
| 2020.coling-main.547 We create MIND-CA, a new corpus of 11,311 question-answer pairs in English from 1,066 ***** children ***** aged from 7 to 14. | ||
| 2020.cmcl-1.6 The results largely supported the hypothesis: Co-occurrence-based similarity was a strong predictor of ***** children *****'s associative behavior even controlling for other possible predictors such as phonological similarity, word frequency, and word length. | ||
| 2020.bea-1.6 In this paper we employ a novel approach to advancing our understanding of the development of writing in English and German ***** children ***** across school grades using classification tasks. | ||
| 2020.lrec-1.857 As a byproduct of our study, we create two new datasets comprised of spelling errors generated by ***** children ***** from hand-written essays and web search inquiries, which we make available to the research community. | ||
| graph neural | 15 | |
| 2021.emnlp-main.11 Recent studies have leveraged ***** graph neural ***** networks to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual sentence embedding. | ||
| 2020.coling-main.260 Integrating the proposed method with two ***** graph neural ***** network-based semantic parsers together with BERT representations demonstrates substantial gains in parsing accuracy on the challenging Spider dataset. | ||
| 2020.acl-main.280 To distinguish confusing charges, we propose a novel ***** graph neural ***** network, GDL, to automatically learn subtle differences between confusing law articles, and also design a novel attention mechanism that fully exploits the learned differences to attentively extract effective discriminative features from fact descriptions. | ||
| D19-1498 We thus propose an edge-oriented ***** graph neural ***** model for document-level relation extraction. | ||
| 2021.emnlp-main.278 We then proposed a novel graph-aware definition generation model Graphex that integrates transformer with ***** graph neural ***** network. | ||
| text representation | 15 | |
| 2020.sdp-1.11 While most previous approaches represent context using solely text surrounding the citation, we propose enhancing con***** text representation ***** with global information. | ||
| 2021.acl-long.299 Topic models have been widely used to learn ***** text representation *****s and gain insight into document corpora. | ||
| P18-1030 Bi-directional LSTMs are a powerful tool for ***** text representation *****. | ||
| 2020.clinicalnlp-1.8 However, recent advanced neural architectures with flat convolutions or multi-channel feature concatenation ignore the sequential causal constraint within a text sequence and may not learn meaningful clinical ***** text representation *****s, especially for lengthy clinical notes with long-term sequential dependency. | ||
| C18-1317 In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained con***** text representation *****. | ||
| linguistic processing | 15 | |
| L12-1406 The annotation tool is implemented as a component of the Ellogon language engineering platform, exploiting its extensive annotation engine, its cross-platform abilities and its ***** linguistic processing ***** components, if such a need arises. | ||
| L12-1512 This paper extracts a standard set of ***** linguistic processing ***** functionalities and tries to classify them formally. | ||
| 2000.iwpt-1.8 Range Concatenation Languages are closed both under intersection and complementation and these closure properties may allow to consider novel ways to describe some ***** linguistic processing *****s. | ||
| L06-1117 The aim has been to construct a summarizer that can be quickly assembled, with the use of only a very few basic language tools, for languages that lack large amounts of structured or annotated data or advanced tools for ***** linguistic processing *****. | ||
| W19-4402 An ablation study of the various types of linguistic features suggested that information from all levels of ***** linguistic processing ***** contributes to predicting item difficulty, with features related to semantic ambiguity and the psycholinguistic properties of words having a slightly higher importance. | ||
| massively multilingual | 15 | |
| 2021.mtsummit-research.24 While interesting, fully unsupervised settings are unrealistic; small amounts of bilingual data are usually available due to the existence of ***** massively multilingual ***** parallel corpora, and/or linguists can create small amounts of parallel data. | ||
| 2020.coling-tutorials.3 In particular, we will focus on the following topics: modeling parameter sharing for multi-way models, ***** massively multilingual ***** models, training protocols, language divergence, transfer learning, zero-shot/zero-resource learning, pivoting, multilingual pre-training and multi-source translation. | ||
| 2021.mtsummit-research.6 We provide several neural MT benchmarks and compare them to the performance of popular pre-trained (***** massively multilingual *****) MT models both for the heterogeneous test set and its subdomains. | ||
| 2021.emnlp-main.814 We release these cross-lingual entity pairs along with the ***** massively multilingual ***** tagged named entity corpus as a resource to the NLP community. | ||
| D18-1103 Experiments demonstrate that ***** massively multilingual ***** models, even without any explicit adaptation, are surprisingly effective, achieving BLEU scores of up to 15.5 with no data from the LRL, and that the proposed similar-language regularization method improves over other adaptation methods by 1.7 BLEU points average over 4 LRL settings. | ||
| offensive content | 15 | |
| 2021.dravidianlangtech-1.32 In this shared task on Offensive Language Identification in Dravidian Languages, in the First Workshop of Speech and Language Technologies for Dravidian Languages in EACL 2021, the aim is to identify ***** offensive content ***** from code mixed Dravidian Languages Kannada, Malayalam, and Tamil. | ||
| N19-1144 However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of ***** offensive content *****, e.g., hate speech, cyberbulling, or cyber-aggression. | ||
| 2021.naacl-demos.17 The interest in ***** offensive content ***** identification in social media has grown substantially in recent years. | ||
| 2021.dravidianlangtech-1.38 Social media often acts as breeding grounds for different forms of ***** offensive content *****. | ||
| 2021.dravidianlangtech-1.27 The theme of this shared task is the detection of ***** offensive content ***** in social media. | ||
| sentence boundary | 15 | |
| 2020.findings-emnlp.409 The goal of Document-level Relation Extraction (DRE) is to recognize the relations between entity mentions that can span beyond ***** sentence boundary *****. | ||
| 2020.autosimtrans-1.1 In this paper, we propose a novel method for ***** sentence boundary ***** detection that takes it as a multi-class classification task under the end-to-end pre-training framework. | ||
| L16-1348 This paper describes a method to perform ***** sentence boundary ***** detection and alignment simultaneously, which significantly improves the alignment accuracy on languages like Chinese with uncertain sentence boundaries. | ||
| 2020.autosimtrans-1.6 We present a sentence length based method and a ***** sentence boundary ***** detection model based method for the streaming input segmentation. | ||
| C16-1028 The paper applies a deep recurrent neural network to the task of ***** sentence boundary ***** detection in Sanskrit, an important, yet underresourced ancient Indian language. | ||
| construction | 15 | |
| P18-1020 We describe a novel method for efficiently eliciting scalar annotations for dataset ***** construction ***** and system quality estimation by human judgments. | ||
| 2005.mtsummit-wpt.9 The workflow consists of the stage for setting lexical goals and the semi- automatic terminology ***** construction ***** stage. | ||
| C16-1078 The system was evaluated using standard IR metrics on the new benchmark, and we saw that lexical-semantical rerankers improve significantly over a purely surface-oriented system, but must be carefully tailored for each individual ***** construction *****. | ||
| L12-1590 This paper addresses theoretical and practical issues experienced in the ***** construction ***** of Turkish National Corpus (TNC). | ||
| 1963.earlymt-1.20 The paper will investigate a few major *****construction***** types in several related European languages: relative clauses, attributive phrases, and certain instances of coordinate conjunction involving these constructions. | ||
| sequence modeling | 15 | |
| 2020.wmt-1.52 Our baseline system is byte-pair encoding based transformer model trained with the Fairseq ***** sequence modeling ***** toolkit. | ||
| W18-6219 Self-attention networks have been shown to be effective for ***** sequence modeling ***** tasks, while having no recurrence or convolutions. | ||
| P19-1354 Due to the lack of recurrence structure such as recurrent neural networks (RNN), SAN is ascribed to be weak at learning positional information of words for ***** sequence modeling *****. | ||
| C16-1329 On the other hand, applying two-dimensional (2D) pooling operation over the two dimensions may sample more meaningful features for ***** sequence modeling ***** tasks. | ||
| N19-4009 fairseq is an open-source *****sequence modeling***** toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. | ||
| domain specific | 15 | |
| L14-1084 The proposed method is based on an original method that is not ***** domain specific *****. | ||
| 2020.parlaclarin-1.8 To overcome this issue, future research will be directed at using ***** domain specific ***** language models in combination with off-the-shelf acoustic models. | ||
| W18-2319 Intrinsic evaluation results tell that drug name embeddings created with a ***** domain specific ***** document corpus outperformed the previously published versions that derived from a very large general text corpus. | ||
| 2021.ranlp-1.53 We utilized ***** domain specific ***** feature reduction techniques to implement the most accurate models to date for predicting book success, with our best model achieving an average accuracy of 94.0%. | ||
| C16-2022 INDREX-MM simplifies these tasks for the user with powerful SQL extensions for gathering statistical semantics, for executing open information extraction and for integrating relation candidates with *****domain specific***** data. | ||
| table | 15 | |
| W18-4106 Sentences with presuppositions are often treated as uninterpre***** table ***** or unvalued (neither true nor false) if their presuppositions are not satisfied. | ||
| 2021.acl-long.320 Open pit mines left many regions worldwide inhospi***** table ***** or uninhabi***** table *****. | ||
| D19-5817 Our study suggests that while current metrics may be sui***** table ***** for existing QA datasets, they limit the complexity of QA datasets that can be created. | ||
| 2021.emnlp-main.208 Specifically, we first generate a ***** table ***** feature for each relation. | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine (TM) is an interpre***** table ***** pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. | ||
| neural dependency | 15 | |
| 2020.findings-emnlp.364 We compare the two baselines with key configurations and find that: automatic Vietnamese word segmentation improves the parsing results of both baselines; the normalized pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema linking; latent syntactic features extracted from a ***** neural dependency ***** parser for Vietnamese also improve the results; and the monolingual language model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) helps produce higher performances than the recent best multilingual language model XLM-R (Conneau et al., 2020). | ||
| D17-1173 Very recently, some studies on ***** neural dependency ***** parsers have shown advantage over the traditional ones on a wide variety of languages. | ||
| P19-1562 In this paper, we investigate the aspect of structured output modeling for the state-of-the-art graph-based ***** neural dependency ***** parser (Dozat and Manning, 2017). | ||
| 2020.aacl-main.12 We empirically show that our approaches match the accuracy of very recent state-of-the-art second-order graph-based ***** neural dependency ***** parsers and have significantly faster speed in both training and testing. | ||
| W17-6318 To improve grammatical function labelling for German, we augment the labelling component of a *****neural dependency***** parser with a decision history. | ||
| vision and language | 15 | |
| C16-1264 Thus, ***** vision and language ***** provide complementary information that, properly combined, can potentially yield more complete concept representations. | ||
| W18-6514 We also showcase the shortcomings of current ***** vision and language ***** models by performing an error analysis on our system's output. | ||
| 2021.mmsr-1.4 We investigate the reasoning ability of pretrained ***** vision and language ***** (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image. | ||
| 2021.emnlp-main.733 A key solution to temporal sentence grounding (TSG) exists in how to learn effective alignment between ***** vision and language ***** features extracted from an untrimmed video and a sentence description. | ||
| 2020.coling-main.220 Automatically describing videos in natural language is an ambitious problem, which could bridge our understanding of ***** vision and language *****. | ||
| multilingual word | 15 | |
| 2020.ccl-1.75 Recent advances of ***** multilingual word ***** representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings. | ||
| E17-1113 Most current approaches in phylogenetic linguistics require as input ***** multilingual word ***** lists partitioned into sets of etymologically related words (cognates). | ||
| K17-3006 Thanks to ***** multilingual word ***** embeddings and one hot encodings for languages, our system can use both monolingual and multi-source training. | ||
| 2021.iwpt-1.9 In our experiments, we used ***** multilingual word ***** embeddings and a total of 11 Universal Dependencies treebanks drawn from three high-resource languages (English, French, Norwegian) and three low-resource languages (Bambara, Wolof and Yoruba). | ||
| 2020.emnlp-main.240 When distant languages are involved, the proposed approach shows robust behavior and outperforms existing unsupervised ***** multilingual word ***** embedding approaches. | ||
| group | 15 | |
| L06-1322 From our experimental results, we found that the correspondence between a ***** group ***** of adjectives and their category name was more suitable in our method than in the EDR lexicon. | ||
| Q14-1014 We introduce a method for automatically segmenting a corpus into chunks such that many uncertain labels are ***** group *****ed into the same chunk, while human supervision can be omitted altogether for other segments. | ||
| W19-3018 For the behavioral model approach, we model each user's behaviour and thoughts with four ***** group *****s of features: posting behaviour, sentiment, motivation, and content of the user's posting. | ||
| L12-1634 Our goal is to simplify the classification by ***** group *****ing the frames into genera―explainable clusters that may be used as experimental parameters. | ||
| D19-1384 We demonstrate that complex linguistic behavior observed in natural language can be reproduced in this simple setting: i) the outcome of contact between communities is a function of inter- and intra-***** group ***** connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. | ||
| supervised training | 15 | |
| Q13-1016 We further introduce a weakly ***** supervised training ***** procedure that estimates LSP's parameters using annotated referents for entire statements, without annotated referents for individual words or the parse structure of the statement. | ||
| 2021.acl-short.108 Specifically, we insert small bottleneck layers (i.e., adapter) within each layer of a pretrained model, then fix the pretrained layers and train the adapter layers on the downstream task data, with (1) task-specific unsupervised pretraining and then (2) task-specific ***** supervised training ***** (e.g., classification, sequence labeling). | ||
| N18-4010 The ***** supervised training ***** agent can further be improved via interacting with users and learning online from user demonstration and feedback with imitation and reinforcement learning. | ||
| 2021.dialdoc-1.10 We can leverage these signals to generate the weakly ***** supervised training ***** data for learning dialog policy and reward estimator, and make the policy take actions (generates responses) which can foresee the future direction for a successful (rewarding) conversation. | ||
| D19-6607 The complete product can be manipulated by various applications using Neo4j's native Cypher query language: We present a subgraph-matching approach to align extracted relations with external facts and show that fact verification, locating textual support for asserted facts, detecting inconsistent and missing facts, and extracting distantly-***** supervised training ***** data can all be performed within the same framework. | ||
| disease | 15 | |
| 2020.wnut-1.52 Increasing usage of social media presents new non-traditional avenues for monitoring ***** disease ***** outbreaks, virus transmissions and ***** disease ***** progressions through user posts describing test results or ***** disease ***** symptoms. | ||
| P18-1098 The International Classification of Diseases (ICD) provides a hierarchy of diagnostic codes for classifying ***** disease *****s. | ||
| 2020.dmr-1.7 This paper examines how Abstract Meaning Representation (AMR) can be utilized for finding answers to research questions in medical scientific documents, in particular, to advance the study of UV (ultraviolet) inactivation of the novel coronavirus that causes the ***** disease ***** COVID-19. | ||
| R19-1096 Our knowledge base supports queries regarding drugs (e.g., active ingredients, concentration, expiration date), drug-drug interaction, symptom-***** disease ***** relations, as well as drug-symptom relations. | ||
| 2020.emnlp-main.372 This disease knowledge is critical for many health-related and biomedical tasks, including consumer health question answering, medical language inference and *****disease***** name recognition. | ||
| bilingual word | 15 | |
| R19-1140 We applied our model for the Turkish-Finnish language pair on the ***** bilingual word ***** translation task. | ||
| C16-1300 We remove this constraint by introducing the Earth Mover's Distance into the training of ***** bilingual word ***** embeddings. | ||
| P19-1312 State-of-the-art methods for unsupervised ***** bilingual word ***** embeddings (BWE) train a mapping function that maps pre-trained monolingual word embeddings into a bilingual space. | ||
| N19-1188 Recent research has discovered that a shared ***** bilingual word ***** embedding space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. | ||
| L16-1536 In this paper we investigate the usefulness of neural word embeddings in the process of translating Named Entities (NEs) from a resource-rich language to a language low on resources relevant to the task at hand, introducing a novel, yet simple way of obtaining *****bilingual word***** vectors. | ||
| semantic segmentation | 15 | |
| 2020.emnlp-main.314 Document structure extraction has been a widely researched area for decades with recent works performing it as a *****semantic segmentation***** task over document images using fully-convolution networks. | ||
| 2020.emnlp-main.227 In this paper, we present a novel and extensive approach, which formulates it as a *****semantic segmentation***** task. | ||
| 2021.eacl-srw.1 The problem of estimating the probability distribution of labels has been widely studied as a label distribution learning (LDL) problem, whose applications include age estimation, emotion analysis, and *****semantic segmentation*****. | ||
| Q14-1016 We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical *****semantic segmentation*****. | ||
| 2020.lrec-1.743 We propose three challenges appropriate for this corpus that are related to processing units of signs in context: automatic alignment of text and video, *****semantic segmentation***** of sign language, and production of video-text embeddings for cross-modal retrieval. | ||
| cross-lingual transfer | 15 | |
| 2021.eacl-main.204 Much work in *****cross-lingual transfer***** learning explored how to select better transfer languages for multilingual tasks, primarily focusing on typological and genealogical similarities between languages. | ||
| W18-6125 Projecting linguistic annotations through word alignments is one of the most prevalent approaches to *****cross-lingual transfer***** learning. | ||
| 2021.acl-long.207 In structured prediction problems, *****cross-lingual transfer***** learning is an efficient way to train quality models for low-resource languages, and further improvement can be obtained by learning from multiple source languages. | ||
| 2020.acl-main.747 This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of *****cross-lingual transfer***** tasks. | ||
| D19-1607 Because it is not feasible to collect training data for every language, there is a growing interest in *****cross-lingual transfer***** learning. | ||
| low | 15 | |
| P19-1118 Most previous work relies on supervised systems, which are trained on parallel data, thus their applicability is problematic in *****low*****-resource scenarios. | ||
| L16-1101 Out-of-vocabulary (OOV) word is a crucial problem in statistical machine translation (SMT) with *****low***** resources. | ||
| 2021.eacl-main.250 Recently, there has been a strong interest in developing natural language applications that live on personal devices such as mobile phones, watches and IoT with the objective to preserve user privacy and have *****low***** memory. | ||
| W19-3001 Rather than replacing human empathy with an automated counselor, we propose simulating an individual in crisis so that human counselors in training can practice crisis counseling in a *****low*****-risk environment. | ||
| 2021.bucc-1.1 AI now and in future will have to grapple continuously with the problem of *****low***** resource. | ||
| Russian | 15 | |
| L06-1045 The RNC is now a 120-million-word collection of *****Russian***** text; thus, it is the most representative and authoritative corpus of the Russian language. | ||
| 2020.lrec-1.485 *****Russian***** morphology has been studied for decades, but there is still no large high-coverage resource that contains the derivational families (groups of words that share the same root) of Russian words. | ||
| R19-1073 We build the first full pipeline for semantic role labelling of *****Russian***** texts. | ||
| 2018.gwc-1.5 In the paper we present a new *****Russian***** wordnet, RuWordNet, which was semi-automatically obtained by transformation of the existing Russian thesaurus RuThes. | ||
| 2016.gwc-1.2 *****Russian***** language is currently poorly supported with WordNet-like resources. | ||
| few-shot | 15 | |
| 2020.coling-main.486 We introduce ManyEnt, a benchmark for entity typing models in *****few-shot***** scenarios. | ||
| 2020.findings-emnlp.303 In this paper, we focus on generating training examples for *****few-shot***** intents in the realistic imbalanced scenario. | ||
| 2020.ngt-1.5 We present META-MT, a meta-learning approach to adapt Neural Machine Translation (NMT) systems in a *****few-shot***** setting. | ||
| 2021.eacl-main.134 Metric-based learning is a well-known family of methods for *****few-shot***** learning, especially in computer vision. | ||
| 2021.emnlp-main.433 In recent years, *****few-shot***** models have been applied successfully to a variety of NLP tasks. | ||
| Google | 15 | |
| 2020.nlp4convai-1.8 Speech-based virtual assistants, such as Amazon Alexa, *****Google***** assistant, and Apple Siri, typically convert users' audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation. | ||
| W19-4413 We introduce unsupervised techniques based on phrase-based statistical machine translation for grammatical error correction (GEC) trained on a pseudo learner corpus created by *****Google***** Translation. | ||
| L14-1249 This paper introduces a distributional thesaurus and sense clusters computed on the complete Google Syntactic N-grams, which is extracted from *****Google***** Books, a very large corpus of digitized books published between 1520 and 2008. | ||
| 2021.emnlp-main.143 For voice assistants like Alexa, *****Google***** Assistant, and Siri, correctly interpreting users' intentions is of utmost importance. | ||
| 2020.lrec-1.465 The *****Google***** Patents is one of the most important sources of patent information. | ||
| Pretrained language | 15 | |
| 2021.latechclfl-1.3 *****Pretrained language***** models like BERT have advanced the state of the art for many NLP tasks. | ||
| 2021.emnlp-main.734 *****Pretrained language***** models demonstrate strong performance in most NLP tasks when fine-tuned on small task-specific datasets. | ||
| 2020.emnlp-main.154 *****Pretrained language***** models, especially masked language models (MLMs), have seen success across many NLP tasks. | ||
| D19-1572 *****Pretrained language***** models are promising particularly for low-resource languages as they only require unlabelled data. | ||
| 2020.winlp-1.19 *****Pretrained language***** models have obtained impressive results for a large set of natural language understanding tasks. | ||
| open-domain dialogue | 15 | |
| 2021.emnlp-main.367 Enabling *****open-domain dialogue***** systems to ask clarifying questions when appropriate is an important direction for improving the quality of the system response. | ||
| W19-2310 Despite advances in *****open-domain dialogue***** systems, automatic evaluation of such systems is still a challenging problem. | ||
| D17-1230 We apply adversarial training to *****open-domain dialogue***** generation, training a system to produce sequences that are indistinguishable from human-generated dialogue utterances. | ||
| 2021.acl-long.11 Nowadays, *****open-domain dialogue***** models can generate acceptable responses according to the historical context based on the large-scale pre-trained language models. | ||
| 2020.acl-main.634 Neural network-based sequence-to-sequence (seq2seq) models strongly suffer from the low-diversity problem when it comes to *****open-domain dialogue***** generation. | ||
| smart | 15 | |
| D19-3016 Used for simple commands recognition on devices from *****smart***** speakers to mobile phones, keyword spotting systems are everywhere. | ||
| 2021.wat-1.10 With the growing popularity of *****smart***** speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. | ||
| 2020.lrec-1.83 Interactive dialogue agents like *****smart***** speakers have become more and more popular in recent years. | ||
| D18-1385 With the increasing popularity of *****smart***** devices, rumors with multimedia content become more and more common on social networks. | ||
| L16-1228 The term *****smart***** home refers to a living environment that by its connected sensors and actuators is capable of providing intelligent and contextualised support to its user. | ||
| virtual | 15 | |
| 2020.signlang-1.34 The Hamburg Notation System (HamNoSys) was developed for movement annotation of any sign language (SL) and can be used to produce signing animations for a *****virtual***** avatar with the JASigning platform. | ||
| 2021.internlp-1.2 We investigate the question of how adaptive feedback from a *****virtual***** agent impacts the linguistic input of the user in a shared world game environment. | ||
| 2020.emnlp-main.413 Task-oriented semantic parsing is a critical component of *****virtual***** assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.). | ||
| W19-5916 We demo a chatbot that delivers content in the form of *****virtual***** dialogues automatically produced from plain texts extracted and selected from documents. | ||
| D19-1460 The need for high-quality, large-scale, goal-oriented dialogue datasets continues to grow as *****virtual***** assistants become increasingly wide-spread. | ||
| background | 15 | |
| 2020.findings-emnlp.369 Commonsense question answering (QA) requires *****background***** knowledge which is not explicitly stated in a given context. | ||
| N19-1361 Identifying the intent of a citation in scientific papers (e.g., *****background***** information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. | ||
| 2020.lrec-1.71 Fully data-driven chatbots for non-goal-oriented dialogues are known to suffer from inconsistent behaviour across their turns, stemming from a general difficulty in controlling parameters like their assumed *****background***** personality and knowledge of facts. | ||
| D17-1086 Humans interpret texts with respect to some *****background***** information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. | ||
| D18-1255 Existing dialog datasets contain a sequence of utterances and responses without any explicit *****background***** knowledge associated with them. | ||
| regular | 15 | |
| 1995.iwpt-1.24 Error-tolerant recognition enables the recognition of strings that deviate slightly from any string in the *****regular***** set recognized by the underlying finite state recognizer. | ||
| W03-3016 We show that a well-known algorithm to compute the intersection of a context-free language and a *****regular***** language can be extended to apply to a probabilistic context-free grammar and a probabilistic finite automaton, provided the two probabilistic models are combined through multiplication. | ||
| 2020.lrec-1.398 The paper presents a dataset of 11,000 Polish-English translational equivalents in the form of pairs of plWordNet and Princeton WordNet lexical units linked by three types of equivalence links: strong equivalence, *****regular***** equivalence, and weak equivalence. | ||
| D19-6303 This paper describes a method of inflecting and linearizing a lemmatized dependency tree by: (1) determining a *****regular***** expression and substitution to describe each productive wordform rule; (2) learning the dependency distance tolerance for each head-dependent pair, resulting in an edge-weighted directed acyclic graph (DAG); and (3) topologically sorting the DAG into a surface realization based on edge weight. | ||
| W19-3109 We show that *****regular***** transductions for which the input part is generated by some multiple context-free grammar can be simulated by synchronous multiple context-free grammars. | ||
| open | 15 | |
| 2005.mtsummit-osmtw.2 We present the current status of development of an *****open***** architecture for the translation from Spanish into Basque. | ||
| 2021.emnlp-main.161 Neural language models typically tokenise input text into sub-word units to achieve an *****open***** vocabulary. | ||
| 2021.emnlp-main.576 Machine translation models have discrete vocabularies and commonly use subword segmentation techniques to achieve an `*****open***** vocabulary'. | ||
| L14-1206 In the context of ongoing developments as regards the creation of a sustainable, interoperable language resource infrastructure and spreading ideas of the need for *****open***** access, not only of research publications but also of the underlying data, various issues present themselves which require that different stakeholders reconsider their positions. | ||
| W17-2337 Question answering, the identification of short accurate answers to users' questions, is a longstanding challenge widely studied over the last decades in the *****open***** domain. | ||
| refinements | 14 | |
| 2003.mtsummit-tttt.4 The course has evolved steadily over the past several years to incorporate ***** refinements ***** in the set of course topics, how they are taught, and how students “learn by doing”. | ||
| 2021.eacl-main.197 This paper reports qualitative and empirical insights into the most common and challenging types of ***** refinements ***** that a voice-based conversational search system must support. | ||
| W19-3322 We detail ***** refinements ***** made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. | ||
| 2021.emnlp-main.720 Prior works often modulate one modal feature to another straightforwardly and thus, underutilizing both unimodal and crossmodal representation ***** refinements *****, which incurs a bottleneck of performance improvement. | ||
| 2004.amta-papers.21 This paper describes the two types of grammars and gives a detailed error analysis of their output, indicating what kinds of ***** refinements ***** are required in each case. | ||
| prototyping | 14 | |
| D17-2005 Adding new predictors or decoding strategies is particularly easy, making it a very efficient tool for ***** prototyping ***** new research ideas. | ||
| 2021.eacl-demos.29 We consider how this applies to creative writers and present Story Centaur, a user interface for ***** prototyping ***** few shot models and a set of recombinable web components that deploy them. | ||
| 2020.bionlp-1.13 We also release an abstract only (as opposed to full-texts) version of the task for rapid model ***** prototyping *****. | ||
| P18-4021 This creates a demand for tools that speed up ***** prototyping ***** of feature-rich dialogue systems. | ||
| P19-3029 Flambé's main objective is to provide a unified interface for ***** prototyping ***** models, running experiments containing complex pipelines, monitoring those experiments in real-time, reporting results, and deploying a final model for inference. | ||
| vagueness | 14 | |
| 2020.findings-emnlp.371 The rise of unsupervised neural machine translation (UNMT) almost completely relieves the parallel corpus curse, though UNMT is still subject to unsatisfactory performance due to the ***** vagueness ***** of the clues available for its core back-translation training. | ||
| 2021.eacl-srw.5 Starting from a noisy dataset of revision histories, we specifically extract and analyze edits that involve cases of ***** vagueness ***** in instructions. | ||
| 2020.coling-main.294 Our experiments show that the temporal relationships that present ***** vagueness ***** are in fact much more common than those in which a single relationship can be established precisely. | ||
| 2020.coling-main.422 Semantic annotation tasks contain ambiguity and ***** vagueness ***** and require varying degrees of world knowledge. | ||
| D18-1387 Finally, we provide suggestions for resolving ***** vagueness ***** and improving the usability of privacy policies. | ||
| biased | 14 | |
| P19-1435 In this paper, we investigate the problem of selection bias on six NLSM datasets and find that four out of them are significantly ***** biased *****. | ||
| P19-2031 Gender bias exists in natural language datasets, which neural language models tend to learn, resulting in ***** biased ***** text generation. | ||
| D19-6115 Then, we train a de***** biased ***** model that fits to the residual of the ***** biased ***** model, focusing on examples that cannot be predicted well by ***** biased ***** features only. | ||
| 2021.emnlp-main.215 However, most existing keyphrase extraction approaches only focus on the part of them, which leads to ***** biased ***** results. | ||
| 2021.eacl-srw.9 Therefore, evaluating BERT-based rankers may lead to ***** biased ***** and unfair evaluation results, simply because a relevant document has not been exposed to the annotators while creating the collection. | ||
| Devlin | 14 | |
| D19-6011 We find out that: (a) for task 1, first fine-tuning on larger datasets like RACE (Lai et al., 2017) and SWAG (Zellers et al., 2018), and then fine-tuning on the target task improves the performance significantly; (b) for task 2, we find that incorporating a KG of commonsense knowledge, WordNet (Miller, 1995), into the BERT model (***** Devlin ***** et al., 2018) is helpful; however, it hurts the performance of XLNet (Yang et al., 2019), a more powerful pre-trained model. | ||
| 2021.metanlp-1.2 Multilingual pre-trained contextual embedding models (***** Devlin ***** et al., 2019) have achieved impressive performance on zero-shot cross-lingual transfer tasks. | ||
| 2021.eval4nlp-1.9 Author embedding methods are often built on top of either Doc2Vec (Mikolov et al. 2014) or the Transformer architecture (***** Devlin ***** et al. 2019). | ||
| D19-1542 Recently, BERT realized a breakthrough in sentence representation learning (***** Devlin ***** et al., 2019), which is broadly transferable to various NLP tasks. | ||
| 2020.findings-emnlp.71 We present a novel way of injecting factual knowledge about entities into the pretrained BERT model (***** Devlin ***** et al., 2019). | ||
| specialization | 14 | |
| D19-1226 Our ***** specialization ***** transfer comprises two crucial steps: 1) Inducing noisy constraints in the target language through automatic word translation; and 2) Filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. | ||
| 2020.acl-srw.36 To this end, we use traditional word embeddings and apply ***** specialization ***** methods to better capture semantic relations between words. | ||
| W19-4310 Leveraging a partially LE-specialized distributional space, our POSTLE (i.e., post-***** specialization ***** for LE) model learns an explicit global ***** specialization ***** function, allowing for ***** specialization ***** of vectors of unseen words, as well as word vectors from other languages via cross-lingual transfer. | ||
| 1993.eamt-1.15 The paper reviews some of the variety of facts to be accounted for, particularly in the ***** specialization ***** of sense associated with some collocations, and the pervasive phenomenon of Intensionality. | ||
| 2021.acl-long.410 Inspired by prior work on semantic ***** specialization ***** of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from the LMs, that is, to specialize them to serve as effective and universal “decontextualized” word encoders even when fed input words “in isolation” (i.e., without any context). | ||
| prompts | 14 | |
| L14-1259 The ever-growing number of published scientific papers ***** prompts ***** the need for automatic knowledge extraction to help scientists keep up with the state-of-the-art in their respective fields. | ||
| 2021.naacl-main.410 We explore the idea of learning ***** prompts ***** by gradient descent—either fine-tuning ***** prompts ***** taken from previous work, or starting from random initialization. | ||
| 2020.bea-1.4 The two approaches were found to be complementary to one another, yielding a combined F0.5 score of 0.814 for off-topic response detection where the ***** prompts ***** have not been seen in training. | ||
| 2020.emnlp-main.352 We learn the pair relationship between the ***** prompts ***** and responses as a regression task on a latent space instead. | ||
| 2021.naacl-main.208 We aim to quantify this benefit through rigorous testing of ***** prompts ***** in a fair setting: comparing prompted and head-based fine-tuning in equal conditions across many tasks and data sizes. | ||
| “the | 14 | |
| W17-3515 For instance, referring expressions like ***** “the ***** big mug” or “it” typically contain content words (“big”, “mug”), which are notoriously fuzzy or vague in their meaning, and also function words (***** “the *****”, “it”) that largely serve as discrete pointers. | ||
| N19-1402 To improve the training efficiency of hierarchical recurrent models without compromising their performance, we propose a strategy named as ***** “the ***** lower the simpler”, which is to simplify the baseline models by making the lower layers simpler than the upper layers. | ||
| 1994.bcs-1.18 We agree with S. Nirenburg that ***** “the ***** ability and the right to subdivide sentences or to combine them together in the Target language are powerful tools in the hands of human translators.” | ||
| 2020.emnlp-main.646 In a document, they may refer to a pair of named entities such as `London' and `Paris' with different expressions: ***** “the ***** major cities”, ***** “the ***** capital cities” and “two European cities”. | ||
| P19-1384 Embedding a clause inside another (***** “the ***** girl [who likes cars [that run fast]] has arrived”) is a fundamental resource that has been argued to be a key driver of linguistic expressiveness. | ||
| notably | 14 | |
| 2021.acl-long.444 Extensive experiments show that our approach ***** notably ***** boosts the performance over strong baselines by a large margin and significantly surpasses some state-of-the-art context-aware NMT models in terms of BLEU and TER. | ||
| W17-1604 Towards this end, we articulate a set of issues, propose a set of best practices, ***** notably ***** a process featuring an ethics review board, and sketch and how they could be meaningfully applied. | ||
| 2012.iwslt-papers.16 Although the learning on large scale task requests ***** notably ***** amounts of computational resources, the decoder makes use of the tagging information as soft constraints. | ||
| D18-1161 In both cases, the dynamic oracles manage to ***** notably ***** increase their accuracy, in comparison to that obtained by performing classic static training. | ||
| 2020.acl-main.7 The experimental results have shown that our proposed methodology can ***** notably ***** improve the performance of persona-aware response generation, and the metrics are reasonable to evaluate the results. | ||
| learnability | 14 | |
| 2020.lt4hala-1.12 Therefore, in this paper, we investigate the ***** learnability ***** of sound correspondences between a proto-language and daughter languages for two machine-translation-inspired models, one statistical, the other neural. | ||
| 2011.mtsummit-tutorials.1 On syntactic SMT, we will explore the trade-offs for SMT between ***** learnability ***** and representational expressiveness. | ||
| J18-2004 This article investigates the ***** learnability ***** of stress systems in a wide range of languages. | ||
| W19-0801 We examine the ***** learnability ***** of such relations as represented in ConceptNet, taking into account their specific properties, which can make relation classification difficult: a given concept pair can be linked by multiple relation types, and relations can have multi-word arguments of diverse semantic types. | ||
| 2020.lrec-1.691 The ***** learnability ***** of the annotated input is discussed in relation to existing resources for the target languages. | ||
| alternation | 14 | |
| W16-5303 Our method is able to detect polysemous words that have the same regular sense ***** alternation ***** as in a given example (a word with two automatically induced senses that represent one polysemy pattern, such as ANIMAL / FOOD). | ||
| N19-3017 This paper aims at introducing and formally expressing three methods of representing word order ***** alternation ***** in the pregroup representation of any language. | ||
| 2020.blackboxnlp-1.25 For the verbal ***** alternation ***** tests, we find that the model displays behavior that is consistent with a transitivity bias: verbs seen few times are expected to take direct objects, but verbs seen with direct objects are not expected to occur intransitively. | ||
| D19-5907 Code-switching refers to the ***** alternation ***** of two or more languages in a conversation or utterance and is common in multilingual communities across the world. | ||
| L06-1058 Preserving the valuable ***** alternation ***** information required special linguistic rules for keeping, altering and re-merging the automatically generated preliminary valency frames. | ||
| Dependencies treebanks | 14 | |
| N19-1203 We experiment on several typologically diverse languages from the Universal ***** Dependencies treebanks *****, showing the utility of incorporating linguistically-motivated latent variables into NLP models. | ||
| N19-1393 The availability of resources such as the Universal ***** Dependencies treebanks ***** and the World Atlas of Language Structures make it possible to study the plausibility of universal grammar from the perspective of dependency parsing. | ||
| 2021.eacl-main.270 Relying on the established fine-tuning paradigm, we first couple a pretrained transformer with a biaffine parsing head, aiming to infuse explicit syntactic knowledge from Universal ***** Dependencies treebanks ***** into the transformer. | ||
| 2020.udw-1.18 In this paper we present a method for identifying and analyzing adnominal possessive constructions in 66 Universal ***** Dependencies treebanks *****. | ||
| K18-2008 Furthermore, experimental results on parsing 61 “big” Universal ***** Dependencies treebanks ***** from raw texts show that our model outperforms the baseline UDPipe (Straka and Straková, 2017) with 0.8% higher average POS tagging score and 3.6% higher average LAS score. | ||
| sememe | 14 | |
| 2020.repl4nlp-1.21 The adversarial test further demonstrates that ***** sememe ***** knowledge can substantially improve model robustness. | ||
| P17-1187 The results indicate that WRL can benefit from ***** sememe *****s via the attention scheme, and also confirm our models being capable of correctly modeling ***** sememe ***** information. | ||
| P19-1571 Furthermore, we make the first attempt to incorporate ***** sememe ***** knowledge into SC models, and employ the ***** sememe *****-incorporated models in learning representations of multiword expressions, a typical task of SC. | ||
| P18-1227 We experiment on HowNet, a Chinese ***** sememe ***** knowledge base, and demonstrate that our framework outperforms state-of-the-art baselines by a large margin, and maintains a robust performance even for low-frequency words. | ||
| 2020.coling-main.263 The model exploits the state-of-the-art contextualized BERT representations as an encoder, and is further enhanced with the ***** sememe ***** knowledge from HowNet by graph attention networks. | ||
| fusing | 14 | |
| C18-1255 In this paper, we propose a multi-layer representation fusion (MLRF) approach to ***** fusing ***** stacked layers. | ||
| 2020.acl-srw.26 In this paper, we present an investigation into ***** fusing ***** sentences drawn from a document by introducing the notion of points of correspondence, which are cohesive devices that tie any two sentences together into a coherent text. | ||
| 2021.acl-long.425 By ***** fusing ***** event and pattern information, we select key sentences to represent an article and then predict if the article fact-checks the given claim using the claim, key sentences, and patterns. | ||
| 2021.emnlp-main.287 For better ***** fusing ***** these features with character representations, we devise masked language model alike pre-training tasks. | ||
| 2021.naacl-main.5 Through a series of careful examinations, we validate the importance of learning distinct contextual representations for entities and relations, ***** fusing ***** entity information early in the relation model, and incorporating global context. | ||
| politeness | 14 | |
| 2021.emnlp-main.535 In experiments with three attributes (length, ***** politeness ***** and monotonicity) and two language pairs (English to German and Japanese) our models achieve better control over a wider range of tasks compared to tagging, and translation quality does not degrade when no control is requested. | ||
| Q18-1027 Our reinforcement learning model (Polite-RL) encourages ***** politeness ***** generation by assigning rewards proportional to the ***** politeness ***** classifier score of the sampled response. | ||
| P18-1125 To this end, we develop a framework for capturing pragmatic devices—such as ***** politeness ***** strategies and rhetorical prompts—used to start a conversation, and analyze their relation to its future trajectory. | ||
| 2021.naacl-main.323 Finally, we present an applied case study investigating the effects of complaint ***** politeness ***** on bureaucratic response times. | ||
| 2020.acl-main.169 This paper introduces a new task of *****politeness***** transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. | ||
| Cantonese | 14 | |
| L16-1610 For ***** Cantonese ***** speech recognition, the basic unit of acoustic models can either be the conventional Initial-Final (IF) syllables, or the Onset-Nucleus-Coda (ONC) syllables where finals are further split into nucleus and coda to reflect the intra-syllable variations in ***** Cantonese *****. | ||
| W16-3714 We observe a 5-9% relative reduction in phone error rate for the predicted ***** Cantonese ***** phone transcriptions using our proposed techniques compared with the previous PT method. | ||
| 2020.sigmorphon-1.26 We apply this method to spoken multi-syllable words in Mandarin Chinese and ***** Cantonese ***** and evaluate how closely our clusters match the ground truth tone categories. | ||
| L06-1382 For the former, ***** Cantonese *****, a dialect of Chinese, is the home language of more than 90% of the population in Hong Kong and so used in the courts. | ||
| 2019.gwc-1.26 This paper reports on the development of the *****Cantonese***** Wordnet, a new wordnet project based on Hong Kong Cantonese. | ||
| CFG | 14 | |
| L06-1002 The results show that (1) the ***** CFG ***** correctly encoded the annotation rules and (2) the annotation done by the Masoretes is highly consistent. | ||
| R19-1160 We present an open-source, wide-coverage context-free grammar (***** CFG *****) for Icelandic, and an accompanying parsing system. | ||
| W17-3523 It comprises a basic annotation model for linguistic information such as noun class, an implementation of existing verbalisation rules and a ***** CFG ***** for verbs, and a basic interface for data entry. | ||
| 2000.iwpt-1.17 Moreover the lexicalization raises up the important problem of multiplication of structures, a problem which does not exist in ***** CFG *****. | ||
| 1991.iwpt-1.15 In the first part of this paper a slow parallel recognizer is described for general *****CFG*****'s. | ||
| ISO | 14 | |
| L10-1377 This is a work in progress within ***** ISO *****-TC37 in order to define a new ***** ISO ***** standard. | ||
| L08-1034 This paper describes the first steps that have been taken to provide users of the multimedia annotation tool ELAN, with the means to create references from tiers and annotations to data categories defined in the ***** ISO ***** Data Category Registry. | ||
| L14-1637 In this paper, we present a web service version of FreeLing that provides standard-compliant morpho-syntactic and syntactic annotations for Spanish, according to several ***** ISO ***** linguistic annotation standards and standard drafts. | ||
| L10-1601 The structure of the lexicon is based on the recently introduced ***** ISO ***** standard called the Lexical Markup Framework. | ||
| L08-1433 This poster presents an *****ISO***** framework for the standardization of syntactic annotation (SynAF). | ||
| multiplicative | 14 | |
| D17-1272 This can be achieved by additive and ***** multiplicative ***** control variates that avoid degenerate behavior in empirical risk minimization. | ||
| 2020.lrec-1.584 We perform further experiments comparing the effects of the relative size of the state space and the ***** multiplicative ***** interaction space on performance. | ||
| P17-1168 Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on ***** multiplicative ***** interactions between the query embedding and the intermediate states of a recurrent neural network document reader. | ||
| S18-2020 We suggest a cheap and easy way to boost the performance of these methods by integrating ***** multiplicative ***** features into commonly used representations. | ||
| 2021.rocling-1.30 The masking-based speech enhancement method pursues a *****multiplicative***** mask that applies to the spectrogram of the input noise-corrupted utterance, and a deep neural network (DNN) is often used to learn the mask. | ||
| authored | 14 | |
| 2020.emnlp-main.46 We demonstrate the predictive performance of our model on tweets ***** authored ***** by members of the U.S. House and Senate related to the president from November 2016 to February 2018. | ||
| W17-3106 We then illustrate the future possibility of this work with an example of an exposure scenario ***** authored ***** with our application. | ||
| L10-1423 Given that the entire process of developing such a ruleset is simple and fast, our approach can be used for rapid development of morphological analysers and yet it can obtain competitive results with analysers built relying on human ***** authored ***** rules. | ||
| 2021.naacl-main.382 Given the documentation ***** authored ***** throughout a patient's hospitalization, generate a paragraph that tells the story of the patient admission. | ||
| W19-2105 In this paper, we describe a new corpus of censored and uncensored social media tweets from a Chinese microblogging website, Sina Weibo, collected by tracking posts that mention `sensitive' topics or ***** authored ***** by `sensitive' users. | ||
| Distantly | 14 | |
| 2021.emnlp-main.761 ***** Distantly ***** supervised relation extraction (RE) automatically aligns unstructured text with relation instances in a knowledge base (KB). | ||
| P18-1161 ***** Distantly ***** supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text. | ||
| C18-1036 ***** Distantly ***** supervised relation extraction greatly reduces human efforts in extracting relational facts from unstructured texts. | ||
| W17-4407 ***** Distantly ***** supervised methods exist, although these generally rely on knowledge gathered using external sources. | ||
| 2021.emnlp-main.839 ***** Distantly ***** supervised named entity recognition (DS-NER) efficiently reduces labor costs but meanwhile intrinsically suffers from the label noise due to the strong assumption of distant supervision | ||
| abstraction | 14 | |
| 2021.naacl-main.217 Our solution first finds nodes for sub-tasks from multiple `how-to' articles on the web by injecting a neural text generator with three key desiderata – relevance, ***** abstraction *****, and consensus. | ||
| Q18-1012 Math word problems form a natural ***** abstraction ***** to a range of quantitative reasoning problems, such as understanding financial news, sports results, and casualties of war. | ||
| D19-1390 On three large-scale summarization dataset, we show the model is able to (1) capture more latent alignment relations than exact word matches, (2) improve word alignment accuracy, allowing for better model interpretation and controlling, (3) generate higher-quality summaries validated by both qualitative and quantitative evaluations and (4) bring more ***** abstraction ***** to the generated summaries. | ||
| L16-1097 Through experiments, we demonstrate that incorporating fine-grained named entities into statistical machine translation improves the accuracy of SMT with more adequate granularity compared with the standard SMT, which is a non-named entity ***** abstraction ***** method. | ||
| 2020.bea-1.20 The type the signal that best predicts difficulty and response time is also explored, both in terms of representation ***** abstraction ***** and item component used as input (e.g., whole item, answer options only, etc.) | ||
| incorporating | 14 | |
| S17-2086 For both subtasks, we adopted supervised machine learning methods, ***** incorporating ***** rich features. | ||
| 2021.acl-long.62 Specifically, we first construct a directed heterogeneous document graph for each news ***** incorporating ***** topics and entities. | ||
| D18-1378 Limbic combines three ideas, ***** incorporating ***** authors, discourse relations, and word embeddings. | ||
| 2014.iwslt-papers.14 Including this new “near-domain” data in training can potentially lead to better language model performance, while reducing training resources relative to ***** incorporating ***** all data. | ||
| L14-1406 This study presents the findings of a detailed benchmark analysis of Twitter sentiment analysis tools, ***** incorporating ***** 20 tools applied to 5 different test beds | ||
| intuitions | 14 | |
| 1999.mtsummit-1.81 It was implemented in C++, using object classes that reflect linguistic concepts and thus facilitate the transfer of linguistic ***** intuitions ***** into code. | ||
| N18-1051 Moreover, our approach gives some ***** intuitions ***** on how target-specific sentence representations can be achieved from its word constituents. | ||
| D18-1152 Our results demonstrate that transferring ***** intuitions ***** from classical models like WFSAs can be an effective approach to designing and understanding neural models. | ||
| D19-1167 Our analysis validates several empirical results and long-standing ***** intuitions *****, and unveils new observations regarding how representations evolve in a multilingual translation model. | ||
| 2020.lrec-1.667 We share ***** intuitions ***** and experimental results that how this dataset can be used to analyze and improve the interpretability of existing reading comprehension model behavior | ||
| synonymous | 14 | |
| 2020.ccl-1.90 Moreover, the cross-entropy loss function discourages model to generate ***** synonymous ***** predictions and overcorrect them to ground truth words. | ||
| 1998.amta-papers.38 We verify this by demonstrating that verbs with similar argument structure as encoded in Lexical Conceptual Structure (LCS) are rarely ***** synonymous ***** in WordNet. | ||
| 2020.emnlp-main.666 In this work, we hypothesize that these two tasks are tightly coupled because two ***** synonymous ***** entities tend to have a similar likelihood of belonging to various semantic classes. | ||
| 2021.bppf-1.2 This bias, in effect, leads to poor performance on data without this bias: a preference elicitation architecture based on BERT suffers a 5.3% absolute drop in performance, when like is replaced with a ***** synonymous ***** phrase, and a 13.2% drop in performance when evaluated on out-of-sample data. | ||
| 2019.gwc-1.23 In order to reflect the abundant meanings that compound verbs own, we will try to think of a link of ***** synonymous ***** expressions to Japanese wordnet | ||
| Kaldi | 14 | |
| 2017.iwslt-1.9 Our setup includes systems using both the Janus and ***** Kaldi ***** frameworks. | ||
| 2013.iwslt-evaluation.18 For the ASR task, using ***** Kaldi ***** toolkit, we developed the system based on weighted finite state transducer. | ||
| 2016.iwslt-1.23 Our setup includes systems using both the Janus and ***** Kaldi ***** frameworks. | ||
| L16-1616 Both have been developed using the ***** Kaldi ***** toolkit. | ||
| 2020.lrec-1.784 This paper also presents some initial results achieved in baseline experiments for Maltese ASR using Sphinx and ***** Kaldi ***** | ||
| visually | 14 | |
| W19-1808 Recent work on ***** visually ***** grounded language learning has focused on broader applications of grounded representations, such as visual question answering and multimodal machine translation. | ||
| 2020.emnlp-main.354 In this work, we study ***** visually ***** grounded grammar induction and learn a constituency parser from both unlabeled text and its visual groundings. | ||
| 2020.emnlp-main.708 In this work, we set forth to design a set of experiments to understand an important but often ignored problem in ***** visually ***** grounded language generation: given that humans have different utilities and visual attention, how will the sample variance in multi-reference datasets affect the models' performance? | ||
| 2020.findings-emnlp.67 Recent models achieve promising results in ***** visually ***** grounded dialogues. | ||
| 2020.gamnlp-1.2 The game facilitated the task of basic image labeling; however, the labels generated were non-specific and limited the ability to distinguish similar images from one another, limiting its ability in search tasks, annotating images for the ***** visually ***** impaired, and training computer vision machine algorithms. | ||
| categorial | 14 | |
| 2010.jeptalnrecital-demonstration.11 These developments have been possible thanks to a ***** categorial ***** grammar which has been extracted semi-automatically from the Paris 7 treebank and a semantic lexicon which maps word, part-of-speech tags and formulas combinations to Discourse Representation Structures. | ||
| 1997.iwpt-1.9 The standard resource sensitive invariants of ***** categorial ***** grammar are not suited to prune search space in the presence of coordination. | ||
| 2003.mtsummit-systems.9 We present a new large-scale database called “CatVar” (Habash and Dorr, 2003) which contains ***** categorial ***** variations of English lexemes. | ||
| 1995.iwpt-1.20 The theorem proving strategy is particularly well suited to the treatment of ***** categorial ***** grammar, because it allows us to distribute the computational cost between the algorithm which deals with the grammatical types and the algebraic checker which constrains the derivation | ||
| 2017.jeptalnrecital-recital.12 Finding Missing Categories in Incomplete Utterances This paper introduces an efficient algorithm (O(n^4)) for finding a missing category in an incomplete utterance by using unification technique as when learning ***** categorial ***** grammars, and dynamic programming as in the Cocke-Younger-Kasami algorithm. | ||
| CS | 14 | |
| D19-1019 Customers ask questions and customer service staffs answer their questions, which is the basic service model via multi-turn customer service (***** CS *****) dialogues on E-commerce platforms. | ||
| W19-1410 We explore leveraging multiple neural network architectures to measure the impact of different pre-trained embeddings methods on POS tagging ***** CS ***** data. | ||
| 2020.acl-main.716 The NLP community has mostly focused on monolingual and multi-lingual scenarios, but little attention has been given to ***** CS ***** in particular. | ||
| 2021.calcs-1.10 Morphological tagging of code-switching (CS) data becomes more challenging especially when language pairs composing the ***** CS ***** data have different morphological representations. | ||
| W18-3217 Though this task has been heavily studied in formal monolingual texts and also noisy texts like Twitter data, it is still an emerging task in code-switched (***** CS *****) content on social media. | ||
| Quranic | 14 | |
| L10-1192 The ***** Quranic ***** Arabic Dependency Treebank (QADT) is part of the ***** Quranic ***** Arabic Corpus (http://corpus.quran.com), an online linguistic resource organized by the University of Leeds, and developed through online collaborative annotation. | ||
| L10-1190 Processing ***** Quranic ***** Arabic is a unique challenge from a computational point of view, since the vocabulary and spelling differ from Modern Standard Arabic. | ||
| L12-1376 Available online at http://corpus.quran.com, the website is a popular study guide for ***** Quranic ***** Arabic, used by over 1.2 million visitors over the past year. | ||
| L12-1051 This paper presents a large corpus created from the original ***** Quranic ***** text, where semantically similar or related verses are linked together. | ||
| L12-1011 This paper presents QurAna: a large corpus created from the original ***** Quranic ***** text, where personal pronouns are tagged with their antecedence. | ||
| psychometric | 14 | |
| 2021.semeval-1.71 The proposed system is made up of a LightGBM model fed with features obtained from many word frequency lists, published lexical norms and ***** psychometric ***** data. | ||
| 2021.starsem-1.2 We find cases in which transformer-based LMs predict ***** psychometric ***** properties consistently well in certain categories but consistently poorly in others, thus providing new insights into fundamental similarities and differences between human and LM reasoning. | ||
| L14-1445 Besides presenting the main characteristics of the corpus (scenario, subjects, experimental protocol, sensing approach, ***** psychometric ***** measurements), the paper reviews the main results obtained so far using the data. | ||
| 2021.cmcl-1.7 The winning system used a range of linguistic and ***** psychometric ***** features in a gradient boosting framework. | ||
| 2020.coling-main.77 This paper brings together approaches from the fields of NLP and ***** psychometric ***** measurement to address the problem of predicting examinee proficiency from responses to short-answer questions (SAQs) | ||
| shortcut | 14 | |
| W17-5308 Our encoder is based on stacked bidirectional LSTM-RNNs with ***** shortcut ***** connections and fine-tuning of word embeddings. | ||
| 2021.naacl-main.71 Based on this ***** shortcut ***** measurement, we propose a ***** shortcut ***** mitigation framework LGTR, to suppress the model from making overconfident predictions for samples with large ***** shortcut ***** degree. | ||
| 2021.conll-1.8 However, the models also display ***** shortcut ***** learning, which is crucial to overcome in search of more cognitively plausible generalisation behaviour. | ||
| P18-4006 It overcomes the low efficiency of traditional text annotation tools by annotating entities through both command line and ***** shortcut ***** keys, which are configurable with custom labels. | ||
| W19-5211 To alleviate this bottleneck, we introduce gated ***** shortcut ***** connections between the embedding layer and each subsequent layer within the encoder and decoder | ||
| analyses | 14 | |
| L16-1405 Our analyzer framework was therefore focused on rapid implementation of the key structures of the language, together with accepting “wildcard” solutions as possible ***** analyses ***** for a word with an unknown stem, building upon our similar experiences with morphological annotation with Modern Standard Arabic and Egyptian Arabic. | ||
| L12-1471 In contrast to previous works on query language evaluation that compare a range of existing query languages against a small number of queries, our approach ***** analyses ***** only three query languages against criteria derived from a suite of 300 use cases that cover diverse aspects of linguistic research. | ||
| L14-1457 Documentation of its semantic interface is a prerequisite to use by non-experts of the grammar and the ***** analyses ***** it produces, but this effort also advances our own understanding of relevant interactions among phenomena, as well as of areas for future work in the grammar. | ||
| 2021.mrl-1.12 In addition to ***** analyses ***** of our results, we also discuss future challenges and present a research agenda in multi-lingual dense retrieval. | ||
| 2020.sustainlp-1.18 In this paper, we presented an ***** analyses ***** of the resource efficient predictive models, namely Bonsai, Binary Neighbor Compression(BNC), ProtoNN, Random Forest, Naive Bayes and Support vector machine(SVM), in the machine learning field for resource constraint devices | ||
| continual | 14 | |
| 2021.eacl-main.317 Next, we introduce ***** continual ***** learning strategies that allow for incremental consolidation of new knowledge while retaining and promoting efficient usage of prior knowledge. | ||
| 2021.emnlp-main.310 This paper investigates ***** continual ***** learning for semantic parsing. | ||
| 2020.coling-main.381 And the investigation on the parameters shows that some parameters are important for both the general-domain and in-domain translation and the great change of them during ***** continual ***** training brings about the performance decline in general-domain. | ||
| 2021.naacl-main.378 This paper studies ***** continual ***** learning (CL) of a sequence of aspect sentiment classification (ASC) tasks | ||
| 2021.naacl-main.93 We propose a straightforward vocabulary adaptation scheme to extend the language capacity of multilingual machine translation models, paving the way towards efficient ***** continual ***** learning for multilingual machine translation. | ||
| precision | 14 | |
| N19-2018 We demonstrate the empirical effectiveness of the proposed method in both ***** precision ***** and recall compared to a strong IBE baseline, DBpedia, with an absolute improvement of 41.3% in average F1. | ||
| L10-1307 Administrators can look at ***** precision ***** for a given data set over time, as well as by evaluation type, data set, or annotator. | ||
| 2021.emnlp-main.27 We find that recall is high for both “Pro'' and “Anti'' stance classifications but ***** precision ***** is variable in a number of cases. | ||
| 2020.aacl-main.7 The experimental results, based on four widely-used datasets, demonstrate that BCTH is competitive, compared with currently competitive baselines in the perspective of both ***** precision ***** and training speed. | ||
| L12-1045 The first filter is based on the number of rule occurrences, the second filter takes two non-independent characteristics into account: a rule's ***** precision ***** and the amount of instances it acquires | ||
| inconsistent | 14 | |
| 2018.gwc-1.41 We have found that definitions for concepts in this domain can be too restrictive, ***** inconsistent *****, and unclear. | ||
| 2020.coling-main.301 This results in word similarity results obtained from embedding models ***** inconsistent ***** with human judgment. | ||
| 2020.acl-main.454 Neural abstractive summarization models are prone to generate content ***** inconsistent ***** with the source document, i.e. unfaithful. | ||
| 2021.emnlp-main.697 As a result, it can be hard to identify what the model actually “believes” about the world, making it susceptible to ***** inconsistent ***** behavior and simple errors. | ||
| N18-1078 These social media posts often come in ***** inconsistent ***** or incomplete syntax and lexical notations with very limited surrounding textual contexts, bringing significant challenges for NER | ||
| syntactic parse | 14 | |
| L04-1151 Also in systems analyzing text, this information is needed in order to attach the adverbs to the right node in the ***** syntactic parse ***** trees. | ||
| C18-1233 Using a biaffine scorer, our model directly predicts all semantic role labels for all given word pairs in the sentence without relying on any ***** syntactic parse ***** information. | ||
| I17-1050 In this work, we explore the idea of incorporating ***** syntactic parse ***** tree into neural networks. | ||
| E17-1008 In this paper, we present a novel neural network model AntSynNET that exploits lexico-syntactic patterns from ***** syntactic parse ***** trees. | ||
| 2021.naloma-1.3 It is designed to model the ***** syntactic parse ***** tree information from the sentence pair of a reasoning task | ||
| vectorial | 14 | |
| L16-1195 The used corpora along with a set of tools, as well as large repositories of ***** vectorial ***** word representations are made publicly available for four languages (English, German, Italian, and Greek). | ||
| C16-3001 Compositional distributional models of meaning (CDMs) provide a function that produces a ***** vectorial ***** representation for a phrase or a sentence by composing the vectors of its words. | ||
| 2021.cl-3.20 Word embeddings are ***** vectorial ***** semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. | ||
| S17-2044 The atomic classifiers include lexical string based, based on ***** vectorial ***** representations and rulebased. | ||
| W19-3717 The resulted texts are then converted to vectors by averaging the ***** vectorial ***** representation of words derived from a pre-trained Word2Vec English model | ||
| construct | 14 | |
| Q18-1036 These models ***** construct ***** a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. | ||
| D19-1653 In online arguments, identifying how users ***** construct ***** their arguments to persuade others is important in order to understand a persuasive strategy directly. | ||
| 2020.coling-main.204 Existing approaches ***** construct ***** text description independently for each image and roughly concatenate them as a story, which leads to the problem of generating semantically incoherent content. | ||
| W19-4801 While sequence-to-sequence models have shown remarkable generalization power across several natural language tasks, their ***** construct ***** of solutions are argued to be less compositional than human-like generalization. | ||
| D18-1141 We were motivated by the absence of studies investigating sentiment analyzer performance on sentences with polarity items, a common ***** construct ***** in human language | ||
| misspelling | 14 | |
| 2021.wnut-1.13 Experiments on two public ***** misspelling ***** correction datasets demonstrate that HCTagger is an accurate and much faster approach than many existing models. | ||
| L06-1007 We show that even though it has been claimed that the state of the art for practical applications based on isolated word error correction does not offer always a sensible set of ranked candidates for the ***** misspelling *****, the introduction of a finer-grained categorization of errors and the use of their relative frequency has had a positive impact in the speller application developed for Spanish (the corresponding evaluation data is presented). | ||
| 2020.coling-industry.12 Existing spelling research has primarily focused on advancement in ***** misspelling ***** correction and the approach for ***** misspelling ***** detection has remained the use of a large dictionary. | ||
| 2020.emnlp-main.383 Additional text normalization experiments and case studies show that TNT is a new potential approach to ***** misspelling ***** correction. | ||
| 2020.findings-emnlp.374 #Turki$hTweets provides correct/incorrect word annotations with a detailed ***** misspelling ***** category formulation based on the real user data | ||
| Slovak | 14 | |
| W18-6006 We report on several experiments in enrichment of training data for this specific construction, evaluated on five languages: Czech, English, Finnish, Russian and ***** Slovak *****. | ||
| 2020.lrec-1.830 In this paper we aim to address both of these issues by introducing a summarization dataset of articles from a popular ***** Slovak ***** news site and proposing small adaptation to the ROUGE metric that make it better suited for ***** Slovak ***** texts. | ||
| L14-1535 The recordings were manually transcribed using Transcriber tool modified for ***** Slovak ***** annotators and automatic ***** Slovak ***** spell checking | ||
| L14-1517 The presented corpus aims to be the first attempt to create a representative sample of the contemporary ***** Slovak ***** language from various domains with easy searching and automated processing. | ||
| L16-1302 This work proposes an information retrieval evaluation set for the ***** Slovak ***** language. | ||
| Conditional Random | 14 | |
| L14-1707 DisMo is a hybrid system that uses a combination of lexical resources, rules, and statistical models based on ***** Conditional Random ***** Fields (CRF). | ||
| L16-1320 These two steps are respectively performed by two classifiers, the first being based on ***** Conditional Random ***** Fields and the second one on Support Vector Machines. | ||
| 2020.wnut-1.37 The paper describes how a classifier model built using ***** Conditional Random ***** Field detects named entities in wet lab protocols. | ||
| 2021.teachingnlp-1.13 In this article, we show and discuss our experience in applying the flipped classroom method for teaching ***** Conditional Random ***** Fields in a Natural Language Processing course. | ||
| W16-3715 The paper describes a new tagset for the morphological disambiguation of Sanskrit, and compares the accuracy of two machine learning methods (***** Conditional Random ***** Fields, deep recurrent neural networks) for this task, with a special focus on how to model the lexicographic information. | ||
| plagiarism | 14 | |
| L10-1016 Various methods for automatic ***** plagiarism ***** detection have been developed whose objective is to assist human experts in the analysis of documents for ***** plagiarism *****. | ||
| W17-1303 Semantic textual similarity is the basis of countless applications and plays an important role in diverse areas, such as information retrieval, ***** plagiarism ***** detection, information extraction and machine translation | ||
| D19-1346 Applications such as textual entailment, ***** plagiarism ***** detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. | ||
| 2020.emnlp-main.407 Text alignment finds application in tasks such as citation recommendation and ***** plagiarism ***** detection. | ||
| P17-1071 Text similarity measures are used in multiple tasks such as ***** plagiarism ***** detection, information ranking and recognition of paraphrases and textual entailment. | ||
| idiosyncratic | 14 | |
| J17-4005 Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word boundaries that are both ***** idiosyncratic ***** and pervasive across different languages. | ||
| D18-1252 In this paper, we address this collaborative nature to improve dialogic reference resolution in two ways: First, we trained a words-as-classifiers logistic regression model of word semantics and incrementally adapt the model to ***** idiosyncratic ***** language between dyad partners during evaluation of the dialog. | ||
| N19-1415 We investigate whether semantic classes of nouns and adjectives differ in how much they reduce uncertainty in classifier choice, and find that it is not fully ***** idiosyncratic *****; while there are no obvious trends for the majority of semantic classes, shape nouns reduce uncertainty in classifier choice the most. | ||
| W16-4111 It is an elastic measure that takes into account ***** idiosyncratic ***** pause duration of translators as well as further confounds such as bi-gram frequency, letter frequency and some motor tasks involved in writing. | ||
| W17-1404 Multiword expressions (MWEs) are linguistic objects containing two or more words and showing ***** idiosyncratic ***** behavior at different levels | ||
| centering | 14 | |
| L10-1037 It is ideally fitted for the transcription of spoken language by ***** centering ***** on the temporal relations of the speakers utterances and is implemented in reliable tools that support an iterative workflow. | ||
| D17-1019 Yet existing models of coherence focus on measuring individual aspects of coherence (lexical overlap, rhetorical structure, entity ***** centering *****) in narrow domains. | ||
| 2021.dash-1.4 The proposed approach consists of the ordering of sentences that mention smoking, ***** centering ***** them at smoking tokens, and annotating to enhance informative parts of the text. | ||
| 2020.coling-main.149 Prior works investigating the geometry of pre-trained word embeddings have shown that word embeddings are distributed in a narrow cone and that by ***** centering ***** and projecting using principal component vectors one can increase the accuracy of a given set of pre-trained word embeddings. | ||
| W17-8006 A key difference to other approaches is the ***** centering ***** around the user in a Human-in-the-Loop machine learning approach, where users define and extend categories and enable the system to improve via feedback and interaction | ||
| unrestricted | 14 | |
| 2021.eacl-tutorials.2 However, many other tasks, particularly in NLP, have unique characteristics not considered by standard models of annotation, e.g., label interdependencies in sequence labelling tasks, ***** unrestricted ***** labels for anaphoric annotation, or preference labels for ranking texts. | ||
| W19-4425 We also explore the generation of artificial syntactic error sentences using error+context phrases for the ***** unrestricted ***** track. | ||
| L06-1062 The main goal of the elaboration of the typology was to help in the implementation of a spell checker that detects context-independent misspellings in general ***** unrestricted ***** texts with the most common confusion pairs (i.e. error/correction pairs) to improve the set of ranked correction candidates for misspellings. | ||
| N19-1196 We present a novel method for mapping ***** unrestricted ***** text to knowledge graph entities by framing the task as a sequence-to-sequence problem. | ||
| L08-1545 On the challenging problem of generating interlingua from domain and structure ***** unrestricted ***** English sentences, we are able to demonstrate that the use of these lexical resources makes a difference in terms of accuracy figures | ||
| directions | 14 | |
| 2021.acl-long.101 With this modification, we gain up to 18.5 BLEU points on zero-shot translation while retaining quality on supervised ***** directions *****. | ||
| L10-1391 We also suggest further work ***** directions ***** towards characterising MWEs by analysing the data organised in our database through lexico-semantic information available in WordNet or MultiWordNet-like resources, also in the perspective of expanding their set through the extraction of other similar compact expressions. | ||
| P19-1289 Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on 4 ***** directions *****: | ||
| 2020.wat-1.5 BLEU results of this meta ensembled model rank the first both on 2 ***** directions ***** of ASPEC Japanese-Chinese translation. | ||
| W17-5111 An indicative evaluation outlines challenges and improvement ***** directions ***** | ||
| lexical normalization | 14 | |
| D19-5554 We propose a pipeline that consists of four modules, i.e tokenization, language identification, ***** lexical normalization *****, and translation. | ||
| 2020.lrec-1.769 However, for Italian, there is no benchmark available for ***** lexical normalization *****, despite the presence of many benchmarks for other tasks involving social media data. | ||
| W19-3202 Yet, ***** lexical normalization ***** of such data has not been addressed properly. | ||
| 2021.wnut-1.51 Hence, ***** lexical normalization ***** has been proven to improve the performance of numerous natural language processing tasks on social media. | ||
| 2021.wnut-1.50 Do ESL learners need ***** lexical normalization ***** to read noisy English texts? | ||
| document structure | 14 | |
| 2020.textgraphs-1.3 Most classical approaches use a sequence-based model (typically BiLSTM-CRF framework) without considering ***** document structure *****. | ||
| C18-1014 Inspired by how humans use ***** document structure *****, we propose a novel framework for reading comprehension. | ||
| 2020.fnp-1.28 Then, we apply supervised learning on a self-constructed regression task to predict the depth of each text block in the ***** document structure ***** hierarchy using transfer learning combined with document features and layout features. | ||
| I17-1102 To this end, we propose multilingual hierarchical attention networks for learning ***** document structure *****s, with shared encoders and/or shared attention mechanisms across languages, using multi-task learning and an aligned semantic space as input. | ||
| 2020.sdp-1.22 We note that performance on the high-recall document-level task is much lower than in the standard evaluation approach, due to the necessity of incorporation of ***** document structure ***** as features. | ||
| challenges | 14 | |
| 2006.bcs-1.1 Processing of Colloquial Arabic is a relatively new area of research, and a number of interesting ***** challenges ***** pertaining to spoken Arabic dialects arise. | ||
| 2020.nlpcss-1.9 While this task has been closely associated with emotion prediction, we argue and show that identifying worry needs to be addressed as a separate task given the unique ***** challenges ***** associated with it. | ||
| W17-5201 This talk will describe the approach, datasets and ***** challenges ***** in sarcasm detection using different forms of incongruity. | ||
| U19-1001 Our future work will build off the baseline and ***** challenges ***** presented here. | ||
| W17-5546 This paper shows that these ***** challenges ***** can be mitigated by adding a weighting model into the architecture. | ||
| computational approaches | 14 | |
| 2020.figlang-1.1 As the community working on ***** computational approaches ***** for sarcasm detection is growing, it is imperative to conduct benchmarking studies to analyze the current state-of-the-art, facilitating progress in this area. | ||
| 2021.eacl-main.165 Populist rhetoric has risen across the political sphere in recent years; however, due to its complex nature, ***** computational approaches ***** to it have been scarce. | ||
| S18-1121 We describe the dataset with 1970 instances that we built for the task, and we outline the 21 ***** computational approaches ***** that participated, most of which used neural networks. | ||
| 2020.lrec-1.318 Although diverse efforts to revitalize it have been made, there have been few ***** computational approaches *****. | ||
| Q13-1006 Progress has been made on this task using text-based models, but few ***** computational approaches ***** have considered how infants might benefit from acoustic cues. | ||
| semantic structure | 14 | |
| 2021.emnlp-main.641 The case studies verify that the generated mind-maps better reveal the underlying ***** semantic structure *****s of the document. | ||
| N18-2075 Text segmentation, the task of dividing a document into contiguous segments based on its ***** semantic structure *****, is a longstanding challenge in language understanding. | ||
| P19-3023 We demonstrate HEIDL, a prototype HITL-ML system that exposes the machine-learned model through high-level, explainable linguistic expressions formed of predicates representing ***** semantic structure ***** of text. | ||
| S17-1009 The Paraphrase Database (PPDB) covers 650 times more words, but lacks the ***** semantic structure ***** of WordNet that would make it more directly useful for downstream tasks. | ||
| W18-4903 I argue that a lexicon-free lexical semantics—defined in terms of units and supersense tags—is an appetizing direction for NLP, as it is robust, cost-effective, easily understood, not too language-specific, and can serve as a foundation for richer ***** semantic structure *****. | ||
| key challenge | 14 | |
| L14-1662 Extracting Linked Data following the Semantic Web principle from unstructured sources has become a ***** key challenge ***** for scientific research. | ||
| W19-5904 Learning with minimal data is one of the ***** key challenge *****s in the development of practical, production-ready goal-oriented dialogue systems. | ||
| 2020.findings-emnlp.62 A ***** key challenge ***** is to recognize and stop at the correct location, especially for complicated outdoor environments. | ||
| N19-1405 Learning multi-hop reasoning has been a ***** key challenge ***** for reading comprehension models, leading to the design of datasets that explicitly focus on it. | ||
| 2020.emnlp-main.169 A ***** key challenge ***** in this task is to efficiently learn effective graph representations. | ||
| semantic content | 14 | |
| D18-1005 Our system features the use of domain-specific resources automatically derived from a large unlabeled corpus, and contextual representations of the emotional and ***** semantic content ***** of the user's recent tweets as well as their interactions with other users. | ||
| 2021.semeval-1.44 There is currently a gap between the natural language expression of scholarly publications and their structured ***** semantic content ***** modeling to enable intelligent content search. | ||
| P19-1242 However, the ***** semantic content ***** of hashtags is not straightforward to infer as these represent ad-hoc conventions which frequently include multiple words joined together and can include abbreviations and unorthodox spellings. | ||
| 1991.mtsummit-papers.7 Concrete examples taken from this project exemplify a modern approach to machine translation: a rich representation of the ***** semantic content ***** of a sentence, the use of a single grammar for parsing and generating as well as generation and transfer based exclusively on the semantic representation of a sentence. | ||
| 1998.amta-papers.38 We then use the results of this work to guide our implementation of an algorithm for cross-language selection of lexical items, exploiting the strengths of each resource: LCS for semantic structure and WordNet for ***** semantic content *****. | ||
| rich language | 14 | |
| L08-1067 One way to overcome this difficulty is to adapt the resources of a linguistically close resource ***** rich language *****. | ||
| C16-1132 Evaluation of machine translation (MT) into morphologically ***** rich language *****s (MRL) has not been well studied despite posing many challenges. | ||
| 2021.eacl-main.158 The present article discusses how to improve translation quality when using limited training data to translate towards morphologically ***** rich language *****s. | ||
| W17-4101 We present a case study on Czech, a morphologically-***** rich language *****, experimenting with different input and output representations. | ||
| W18-5806 Morphologically ***** rich language *****s are challenging for natural language processing tasks due to data sparsity. | ||
| learning to rank | 14 | |
| D17-1139 With the training corpus, we design a symmetric CNN neural network to model text pairs and rank the semantic coherence within the ***** learning to rank ***** framework. | ||
| C18-2010 We propose a claim-oriented ranking module which can be divided into the offline topic-independent ***** learning to rank ***** model and the online topic-dependent lexicon model. | ||
| 2021.eacl-main.185 Building on these limitations, we propose a novel hierarchical, ***** learning to rank ***** approach that uses textual data to make time-aware predictions for ranking stocks based on expected profit. | ||
| W19-8644 We use ***** learning to rank ***** and synthetic data to improve the quality of ratings assigned by our system: We synthesise training pairs of distorted system outputs and train the system to rank the less distorted one higher. | ||
| 2020.emnlp-main.121 We introduce a training strategy for multilingual BERT models by ***** learning to rank ***** synthetic divergent examples of varying granularity. | ||
| human translation | 14 | |
| N18-5015 Machine Translation systems are usually evaluated and compared using automated evaluation metrics such as BLEU and METEOR to score the generated translations against ***** human translation *****s. | ||
| 2020.wmt-1.99 In this paper, we study YiSi-1's correlation with ***** human translation ***** quality judgment by varying three major attributes (which architecture; which intermediate layer; whether it is monolingual or multilingual) of the pretrained language models. | ||
| 2012.amta-government.5 Historically, DARPA took the lead in the grand challenge aiming at surpassing ***** human translation ***** quality. | ||
| 2020.findings-emnlp.375 Recent machine translation shared tasks have shown top-performing systems to tie or in some cases even outperform ***** human translation *****. | ||
| L12-1052 We performed a user study where subjects read short texts translated by three MT systems and one ***** human translation *****, while we gathered eye tracking data. | ||
| government | 14 | |
| 2012.amta-government.13 The RevP program saves time by removing the need for post-editing of Chinese names, and improves consistency in the translation of these names. | ||
| L16-1617 This creates a demand for tools and technologies which will enable ***** government *****s to quickly and thoroughly digest the points being made and to respond accordingly. | ||
| 2012.amta-government.5 Machine Translation is often misunderstood or misplaced in the operational settings as expectations are unrealistic and optimization not achieved. | ||
| 2012.amta-government.9 We have developed an annotation tool that enables us to create representations that compactly encode an exponential number of correct translations for a sentence. | ||
| 2010.amta-government.6 This presentation, however, examines the importance of pre-editing source material to improve MT. | ||
| metaphor identification | 14 | |
| P18-1113 Current word embedding based ***** metaphor identification ***** models cannot identify the exact metaphorical words within a sentence. | ||
| E17-2084 In this paper we present the first ***** metaphor identification ***** method that uses representations constructed from property norms. | ||
| W17-4906 In this paper we look at ***** metaphor identification ***** in Adjective-Noun pairs. | ||
| K19-1034 We show that using syntactic features and lexical resources can automatically provide additional high-quality training data for metaphoric language, and this data can cover gaps and inconsistencies in metaphor annotation, improving state-of-the-art word-level ***** metaphor identification *****. | ||
| 2020.figlang-1.3 In this paper, we report on the shared task on ***** metaphor identification ***** on VU Amsterdam Metaphor Corpus and on a subset of the TOEFL Native Language Identification Corpus. | ||
| historical text normalization | 14 | |
| D19-6112 This paper evaluates 63 multi-task learning configurations for sequence-to-sequence-based ***** historical text normalization ***** across ten datasets from eight languages, using autoencoding, grapheme-to-phoneme mapping, and lemmatization as auxiliary tasks. | ||
| P19-1157 Policy gradient training enables direct optimization for exact matches, and while the small datasets in ***** historical text normalization ***** are prohibitive of from-scratch reinforcement learning, we show that policy gradient fine-tuning leads to significant improvements across the board. | ||
| N19-1389 There is no consensus on the state-of-the-art approach to ***** historical text normalization *****. | ||
| 2021.eacl-main.163 morphological inflection generation and ***** historical text normalization *****, there are few works that outperform recurrent models using the transformer. | ||
| N18-2113 We highlight several issues in the evaluation of ***** historical text normalization ***** systems that make it hard to tell how well these systems would actually work in practice, i.e., for new datasets or languages; in comparison to more naïve systems; or as a preprocessing step for downstream NLP tools. | ||
| change | 14 | |
| 2004.amta-papers.23 This paper describes an evaluation experiment about a Japanese-Uighur machine translation system which consists of verbal suffix processing, case suffix processing, phonetic ***** change ***** processing, and a Japanese-Uighur dictionary including about 20,000 words. | ||
| D19-1272 We propose to use masking (replacement) rate threshold as an adjustable parameter to control the amount of semantic ***** change ***** in the text. | ||
| 2021.naacl-main.342 In the pursuit of natural language understanding, there has been a long standing interest in tracking state ***** change *****s throughout narratives. | ||
| K18-1044 However, rarely do editorials ***** change ***** anyone's stance on an issue completely, nor do they tend to argue explicitly (but rather follow a subtle rhetorical strategy). | ||
| L10-1616 During this evolution their spelling rules (and sometimes the syntactic and semantic ones) ***** change *****, putting old documents out of use. | ||
| argumentative structure | 14 | |
| D17-1253 Then, we adapt the idea of positional tree kernels in order to capture sequential and hierarchical ***** argumentative structure ***** together for the first time. | ||
| L08-1553 In addition, several examples are offered of how this kind of language resource can be used in linguistic, computational and philosophical research, and in particular, how the corpus has been used to initiate a programme investigating the automatic detection of ***** argumentative structure *****. | ||
| E17-1028 Discourse parsing is an integral part of understanding information flow and ***** argumentative structure ***** in documents. | ||
| W17-5105 This paper presents a method of extracting ***** argumentative structure ***** from natural language text. | ||
| C16-1158 Argument mining aims to determine the ***** argumentative structure ***** of texts. | ||
| remains a challenge | 14 | |
| C16-1311 Target-dependent sentiment classification ***** remains a challenge *****: modeling the semantic relatedness of a target with its context words in a sentence. | ||
| 2021.bionlp-1.27 Balancing the contribution of query terms appearing in the abstract and in sections of different importance in full text articles ***** remains a challenge ***** both with traditional bag-of-words IR approaches and for neural retrieval methods. | ||
| D18-1347 Code-switching, the use of more than one language within a single utterance, is ubiquitous in much of the world, but ***** remains a challenge ***** for NLP largely due to the lack of representative data for training models. | ||
| 2021.emnlp-main.570 However, evaluation of these systems ***** remains a challenge *****, especially in multilingual settings. | ||
| 2021.wanlp-1.17 While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) ***** remains a challenge *****. | ||
| automatic essay scoring | 14 | |
| W18-3713 In this paper we present a qualitatively enhanced deep convolution recurrent neural network for computing the quality of a text in an ***** automatic essay scoring ***** task. | ||
| P17-1011 We further demonstrate that discourse modes can be used as features that improve ***** automatic essay scoring ***** (AES). | ||
| P18-2080 In this work, we present an approach based on combining string kernels and word embeddings for ***** automatic essay scoring *****. | ||
| 2021.ccl-1.107 With the increasing popularity of learning Chinese as a second language (L2), the development of an ***** automatic essay scoring ***** (AES) method specially for Chinese L2 essays has become an important task. | ||
| K17-1017 Neural network models have recently been applied to the task of ***** automatic essay scoring *****, giving promising results. | ||
| language grounding | 14 | |
| W17-2803 In this work, we gather behavior annotations from humans and demonstrate that these improve ***** language grounding ***** performance by allowing a system to focus on relevant behaviors for words like “white” or “half-full” that can be understood by looking or lifting, respectively. | ||
| 2021.mmsr-1.9 Concluding discussion considers how learned representations may be used for gesture recognition by the robot, and how the framework may mature into a system to address ***** language grounding ***** and semantic representation. | ||
| W19-1604 In this paper, we tackle the problem by separately learning the world representation of the robot and the ***** language grounding *****. | ||
| R17-1105 Recent attempts at behaviour understanding through ***** language grounding ***** have shown that it is possible to automatically generate models for planning problems from textual instructions. | ||
| W18-6551 We present a data resource which can be useful for research purposes on ***** language grounding ***** tasks in the context of geographical referring expression generation. | ||
| natural language question | 14 | |
| N19-1360 We describe a new semantic parsing setting that allows users to query the system using both ***** natural language question *****s and actions within a graphical user interface. | ||
| L16-1734 The ultimate goal of this research is to build a QA system that can answer ***** natural language question *****s from players by using inference on these game-specific logic rules. | ||
| D19-5309 While automated question answering systems are increasingly able to retrieve answers to ***** natural language question *****s, their ability to generate detailed human-readable explanations for their answers is still quite limited. | ||
| 2020.alvr-1.3 It then uses the variational auto-encoders model to encode images into a latent space and decode ***** natural language question *****s. | ||
| 2021.emnlp-main.707 Second, we propose a hierarchical SQL-to-question generation model to obtain high-quality ***** natural language question *****s, which is the major contribution of this work. | ||
| multimodal fusion | 14 | |
| P19-1046 We propose a general strategy named `divide, conquer and combine' for ***** multimodal fusion *****. | ||
| 2020.emnlp-main.161 HERO encodes multimodal inputs in a hierarchical structure, where local context of a video frame is captured by a Cross-modal Transformer via ***** multimodal fusion *****, and global video context is captured by a Temporal Transformer. | ||
| 2020.findings-emnlp.35 Tensor-based fusion methods have been proven effective in ***** multimodal fusion ***** tasks. | ||
| W18-3309 Previous work on multimodal sentiment analysis have been focused on input-level feature fusion or decision-level fusion for ***** multimodal fusion *****. | ||
| 2021.acl-long.412 The main challenge is the occurrence of some missing modalities during the ***** multimodal fusion ***** procedure. | ||
| deep recurrent neural | 14 | |
| W16-3715 The paper describes a new tagset for the morphological disambiguation of Sanskrit, and compares the accuracy of two machine learning methods (Conditional Random Fields, ***** deep recurrent neural ***** networks) for this task, with a special focus on how to model the lexicographic information. | ||
| P17-2054 To overcome this, we explore several auxiliary tasks, including semantic super-sense tagging and identification of multi-word expressions, and cast the task as a multi-task learning problem with ***** deep recurrent neural ***** networks. | ||
| C16-1028 The paper applies a ***** deep recurrent neural ***** network to the task of sentence boundary detection in Sanskrit, an important, yet underresourced ancient Indian language. | ||
| C16-1138 In this paper, we propose ***** deep recurrent neural ***** networks (DRNNs) for relation classification to tackle this challenge. | ||
| W17-4808 Our entry builds on our last year's success, our system based on ***** deep recurrent neural ***** networks outperformed all the other systems with a clear margin. | ||
| single sentence | 14 | |
| 2019.iwslt-1.8 The other two submitted systems are adapted to TED talks: SENTFINE is fine-tuned on ***** single sentence *****s, DOCFINE is fine-tuned on multi-sentence sequences. | ||
| D19-1606 Machine comprehension of texts longer than a ***** single sentence ***** often requires coreference resolution. | ||
| C16-2036 As syntactically complex sentences often pose a challenge for current Open RE approaches, we have developed a simplification framework that performs a pre-processing step by taking a ***** single sentence ***** as input and using a set of syntactic-based transformation rules to create a textual input that is easier to process for subsequently applied Open RE systems. | ||
| 2020.figlang-1.13 We extended latest pre-trained transformers like BERT, RoBERTa, spanBERT on different task objectives like ***** single sentence ***** classification, sentence pair classification, etc. | ||
| 2020.aacl-main.85 IndoNLU includes twelve tasks, ranging from ***** single sentence ***** classification to pair-sentences sequence labeling with different levels of complexity. | ||
| presents | 14 | |
| 2020.wnut-1.39 This paper ***** presents ***** our teamwork on the WNUT 2020 shared task-1: wet lab entity extraction, in which we studied several models, including a BiLSTM-CRF model and a cased BERT model that can be used to complete wet lab entity extraction. | ||
| P18-2091 This paper ***** presents ***** the first study aimed at capturing stylistic similarity between words in an unsupervised manner. | ||
| Q18-1025 This paper ***** presents ***** the first model for time normalization trained on the SCATE corpus. | ||
| L14-1596 This article ***** presents ***** the methods, results, and precision of the syntactic annotation process of the Rhapsodie Treebank of spoken French. | ||
| W19-0506 At the same time, the studies reveal empirical evidence why contextual abstractness represents a valuable indicator for automatic non-literal language identification. | ||
| dialog act | 14 | |
| 2020.emnlp-main.655 This dataset is annotated with pre-existing user knowledge, message-level ***** dialog act *****s, grounding to Wikipedia, and user reactions to messages. | ||
| 2021.eacl-main.94 In this paper, we present a ***** dialog act ***** annotation scheme, MIDAS (Machine Interaction Dialog Act Scheme), targeted at open-domain human-machine conversations. | ||
| Q18-1033 Relying on the theory of institutional talk, we develop a labeling scheme for police speech during traffic stops, and a tagger to detect institutional ***** dialog act *****s (Reasons, Searches, Offering Help) from transcribed text at the turn (78% F-score) and stop (89% F-score) level. | ||
| 2021.rocling-1.19 The experimental results show that the accuracy of the configuration with ***** dialog act ***** embedding is 16% higher than that with only original statement embedding. | ||
| W18-5049 Given a user utterance, the intent of a slot-value pair is captured using ***** dialog act *****s (DA) expressed in that utterance. | ||
| web application | 14 | |
| 2020.cmlc-1.8 Close attention is paid to the data collection and, in particular, to the description of ***** web application ***** development. | ||
| L10-1073 We propose a solution which anchors on using controlled languages as interfaces to semantic ***** web application *****s. | ||
| D19-3016 However, despite their obvious advantages in natural language interaction, voice-enabled ***** web application *****s are still few and far between. | ||
| 2020.lrec-1.425 This contribution describes an ongoing project of speech data collection, using the ***** web application ***** Samrömur which is built upon Common Voice, Mozilla Foundation's web platform for open-source voice collection. | ||
| 2020.parlaclarin-1.3 Furthermore, information on word frequencies is accessible in a custom-made ***** web application ***** and an n-gram viewer. | ||
| single document summarization | 14 | |
| W18-6545 Till now, neural abstractive summarization methods have achieved great success for ***** single document summarization ***** (SDS). | ||
| 2020.aacl-srw.7 In this work, the task of extractive ***** single document summarization ***** applied to an education setting to generate summaries of chapters from grade 10 Hindi history textbooks is undertaken. | ||
| D19-5607 We propose a system that improves performance on ***** single document summarization ***** task using the CNN/DailyMail and Newsroom datasets. | ||
| D17-1223 Our work is aimed at ***** single document summarization ***** using small amounts of reference summaries. | ||
| P19-1099 We compared GOLC with two optimization methods, a maximum log-likelihood and a minimum risk training, on CNN/Daily Mail and a Japanese ***** single document summarization ***** data set of The Mainichi Shimbun Newspapers. | ||
| aspect level sentiment | 14 | |
| C18-1092 In ***** aspect level sentiment ***** classification, there are two common tasks: to identify the sentiment of an aspect (category) or a term. | ||
| D19-1549 In this paper, we propose a novel target-dependent graph attention network (TD-GAT) for ***** aspect level sentiment ***** classification, which explicitly utilizes the dependency relationship among words. | ||
| N18-1053 We train and evaluate Long Short Term Memory (LSTM) based architecture for ***** aspect level sentiment ***** classification. | ||
| D18-1380 We propose a novel multi-grained attention network (MGAN) model for ***** aspect level sentiment ***** classification. | ||
| D18-1136 We introduce a novel parameterized convolutional neural network for ***** aspect level sentiment ***** classification. | ||
| rating | 14 | |
| P19-1624 Experimental results on the WMT14 English-German and English-French benchmarks show that our model consistently improves performance over the strong Transformer model, demonstrating the necessity and effectiveness of exploiting sentential context for NMT. | ||
| 2020.coling-main.556 We find that integrating event context improves classification performance over a very strong baseline. | ||
| N19-1061 Several recent works tackle this problem, and propose methods for significantly reducing this gender bias in word embeddings, demonstrating convincing results. | ||
| 2021.wnut-1.53 Our results show that while word-level, intrinsic, performance evaluation is behind other methods, our model improves performance on extrinsic, downstream tasks through normalization compared to models operating on raw, unprocessed, social media text. | ||
| D17-1170 We present opinion recommendation, a novel task of jointly generating a review with a ***** rating ***** score that a certain user would give to a certain product which is unreviewed by the user, given existing reviews to the product by other users, and the reviews that the user has given to other products. | ||
| levels | 14 | |
| 2008.amta-papers.19 We also build a cascaded translation model that dynamically shifts translation units from phrase level to word and morpheme phrase ***** levels *****. | ||
| W18-1601 For detection of stylistic variation, we use relative entropy, measuring the difference between probability distributions at different linguistic ***** levels ***** (here: lexis and grammar). | ||
| 2020.coling-main.554 Our best model, based on LSTMs, outperforms state-of-the-art results and achieves mean absolute errors of 1.86 and 2.28, at sentence and text ***** levels *****, respectively. | ||
| L16-1513 the clinical subcorpus, consisting of written texts produced by speakers with various types of language disorders, and the healthy speakers subcorpus, as well as by the ***** levels ***** of its annotation, it offers an opportunity for different lines of research. | ||
| L08-1159 Yet, building such models requires appropriate definition of various ***** levels ***** for representing the emotions themselves but also some contextual information such as the events that elicit these emotions. | ||
| measure | 14 | |
| 2021.acl-long.96 We also carry out multiple experiments to ***** measure ***** how much each augmentation strategy improves the performance of automatic scoring systems. | ||
| E17-2054 We show that a model capitalizing on a `fuzzy' ***** measure ***** of similarity is effective for learning quantifiers, whereas the learning of exact cardinals is better accomplished when information about number is provided. | ||
| L10-1071 Working within the EU funded COMPANIONS program, we investigate the use of appropriateness as a ***** measure ***** of conversation quality, the hypothesis being that good companions need to be good conversational partners. | ||
| 2005.mtsummit-papers.29 Example-based machine translation (EBMT) systems, so far, rely on heuristic ***** measure *****s in retrieving translation examples. | ||
| D19-1349 We derive a novel ***** measure ***** of LDA topic quality using the variability of the posterior distributions. | ||
| documentation | 14 | |
| 2020.lrec-1.833 Source code and ***** documentation ***** are available at https://github.com/machine-intelligence-laboratory/TopicNet | ||
| 2011.mtsummit-tutorials.4 The expectation had however always been that MT could one day be deployed on the bulk of user interface and product ***** documentation *****, due to the expected process efficiencies and cost savings. | ||
| 2021.smm4h-1.29 The steps for pre-processing tweets, feature extraction, and the development of the machine learning models, are described extensively in the ***** documentation *****. | ||
| L12-1476 In this paper, we describe the online repository that we have created as a one-stop resource for obtaining NLG task materials, both from Generation Challenges tasks and from other sources, where the set of materials provided for each task consists of (i) task definition, (ii) input and output data, (iii) evaluation software, (iv) ***** documentation *****, and (v) publications reporting previous results. | ||
| L06-1507 However for the much more restricted domain of language ***** documentation ***** such a category system might still prove reasonable if not indispensable. | ||
| automatic question | 14 | |
| W18-6536 In this work we present a new Attentional Encoder–Decoder Recurrent Neural Network model for ***** automatic question ***** generation. | ||
| D19-5809 These results established that our research direction may be promising, but at the same time revealed that the identification of question patterns is a challenging issue, and it has to be largely refined to achieve a better quality in the end-to-end ***** automatic question ***** generation. | ||
| P17-1123 We study ***** automatic question ***** generation for sentences from text passages in reading comprehension. | ||
| L10-1162 Creating more fine-grained annotated data than previously relevant document sets is important for evaluating individual components in ***** automatic question ***** answering systems. | ||
| R19-1049 We present a novel approach to ***** automatic question ***** answering that does not depend on the performance of an information retrieval (IR) system and does not require that the training data come from the same source as the questions. | ||
| uncertainty | 14 | |
| Q14-1005 We propose a new method that projects model expectations rather than labels, which facilities transfer of model ***** uncertainty ***** across language boundaries. | ||
| 2021.eacl-main.145 We conduct an extensive empirical study of various Bayesian ***** uncertainty ***** estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework and find the best combinations for different types of models. | ||
| I17-1043 The difficulties come from the scarcity of training data, general subjectivity in emotion perception resulting in low annotator agreement, and the ***** uncertainty ***** about which features are the most relevant and robust ones for classification. | ||
| W18-5451 They have become the standard approach for automatic translation of text, at the cost of increased model complexity and ***** uncertainty *****. | ||
| 2021.insights-1.20 Our extensive empirical evaluation shows that ***** uncertainty *****-based acquisition functions can not surpass the accuracy reached with the random acquisition on these data sets. | ||
| multilingual document | 14 | |
| 1999.mtsummit-1.46 In this paper we describe a language recognition algorithm for ***** multilingual document *****s that is based on mixed-order n-grams, Markov chains, maximum likelihood, and dynamic programming. | ||
| 2021.eacl-main.146 The collection – called MultiHumES– provides ***** multilingual document *****s coupled with informative snippets that have been annotated by humanitarian analysts over the past four years. | ||
| 2000.amta-systems.2 In this paper we describe the KANTOO machine translation environment, a set of software services and tools for ***** multilingual document ***** production. | ||
| 2001.mtsummit-papers.7 The corpus feeds an experimental ***** multilingual document ***** generation system for the web. | ||
| Q14-1003 In this work, we address the problem of detecting documents that contain text from more than one language (***** multilingual document *****s). | ||
| artificial neural | 14 | |
| E17-2110 Existing models based on ***** artificial neural ***** networks (ANNs) for sentence classification often do not incorporate the context in which sentences appear, and classify sentences individually. | ||
| 2020.emnlp-main.18 In this paper, we propose KERMIT (Kernel-inspired Encoder with Recursive Mechanism for Interpretable Trees) to embed symbolic syntactic parse trees into ***** artificial neural ***** networks and to visualize how syntax is used in inference. | ||
| 2021.rocling-1.42 In this study, we would like to construct a multi-speaker TTS system by incorporating two sub modules into ***** artificial neural ***** network-based speech synthesis system to alleviate this problem. | ||
| Q18-1045 We examine the role of ***** artificial neural ***** networks, the current state of the art in many common NLP tasks, by returning to a classic case study. | ||
| W19-8705 Recent advances in ***** artificial neural ***** networks now have a great impact on translation technology. | ||
| simultaneous machine | 14 | |
| L06-1064 These interpreting patterns can be expected to be used as interpreting rules of ***** simultaneous machine ***** interpretation. | ||
| 2021.wmt-1.119 Recent work in ***** simultaneous machine ***** translation is often trained with conventional full sentence translation corpora, leading to either excessive latency or necessity to anticipate as-yet-unarrived words, when dealing with a language pair whose word orders significantly differ. | ||
| 2021.emnlp-main.536 We propose a generative framework for ***** simultaneous machine ***** translation. | ||
| 2021.eacl-main.281 This paper addresses the problem of ***** simultaneous machine ***** translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced. | ||
| W19-3648 We describe work in progress for evaluating performance of sequence-to-sequence neural networks on the task of syntax-based reordering for rules applicable to ***** simultaneous machine ***** translation. | ||
| speaking | 14 | |
| 2020.lrec-1.804 This paper presents a dataset of transcribed high-quality audio of English sentences recorded by volunteers ***** speaking ***** with different accents of the British Isles. | ||
| 2020.codi-1.1 With their huge ***** speaking ***** populations in the world, Spanish and Chinese occupy important positions in linguistic studies. | ||
| L10-1267 The aim of this study was to assess the retrieval effectiveness of nursing students in the Dutch-***** speaking ***** part of Belgium. | ||
| L16-1078 We hope this corpus will help other research teams in developing tools for supporting public ***** speaking ***** training. | ||
| I17-1061 Experiments show that our approach leads to significant improvements over baseline model quality, generating responses that capture more precisely speakers' traits and ***** speaking ***** styles. | ||
| shallow semantic | 14 | |
| Q17-1009 This paper explores extending ***** shallow semantic ***** parsing beyond lexical-unit triggers, using causal relations as a test case. | ||
| 2020.emnlp-main.403 We show that NSP is detrimental to training due to its context splitting and ***** shallow semantic ***** signal. | ||
| D17-3004 This tutorial describes semantic role labelling (SRL), the task of mapping text to ***** shallow semantic ***** representations of eventualities and their participants. | ||
| R19-1112 We present a live cross-lingual system capable of producing ***** shallow semantic ***** annotations of natural language sentences for 51 languages at this time. | ||
| L06-1317 The paper discusses *****shallow semantic***** annotation of the Bulgarian treebank. | ||
| e2e dataset | 14 | |
| 2021.inlg-1.18 This study introduces an enriched version of the *****E2E dataset*****, one of the most popular language resources for data-to-text NLG. | ||
| P19-1080 In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the *****E2E dataset*****. | ||
| 2021.inlg-1.44 With extensive experiments on WikiBio and *****E2E dataset*****, we show that our model outperforms the state-of-the-art models and several standard baseline systems. | ||
| 2021.ranlp-1.92 We evaluate several systems in the *****E2E dataset***** with 6 automatic metrics. | ||
| W17-5525 The *****E2E dataset***** poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. | ||
| multimodal machine learning | 14 | |
| 2021.mmsr-1.1 The last years have shown rapid developments in the field of *****multimodal machine learning*****, combining e.g., vision, text or speech. | ||
| N18-1199 *****Multimodal machine learning***** algorithms aim to learn visual-textual correspondences. | ||
| P17-5002 *****Multimodal machine learning***** is a vibrant multi-disciplinary research field which addresses some of the original goals of artificial intelligence by integrating and modeling multiple communicative modalities, including linguistic, acoustic and visual messages. | ||
| 2020.emnlp-main.62 We hence recommend that researchers in *****multimodal machine learning***** report the performance not only of unimodal baselines, but also the EMAP of their best-performing model. | ||
| W18-3308 *****Multimodal machine learning***** is a core research area spanning the language, visual and acoustic modalities. | ||
| tensor factorization | 14 | |
| Q18-1015 First, our simple yet effective knowledge guided *****tensor factorization***** approach achieves state-of-the-art results on two generics KBs (80% precise) for science, doubling their size at 74%–86% precision. | ||
| E17-2028 We offer a new interpretation of skip-gram based on exponential family PCA-a form of matrix factorization to generalize the skip-gram model to *****tensor factorization*****. | ||
| P18-1146 In this paper, we propose *****Tensor Factorization***** with Back-off and Aggregation (TFBA), a novel framework for the HRSI problem. | ||
| 2021.naacl-main.202 In this paper, we present a novel time-aware knowledge graph embedding approach, TeLM, which performs 4th-order *****tensor factorization***** of a Temporal knowledge graph using a Linear temporal regularizer and Multivector embeddings. | ||
| W19-4331 Applying a modality-based *****tensor factorization***** method, which adopts different factors for different modalities, results in removing information present in a modality that can be compensated by other modalities, with respect to model outputs. | ||
| tensor decomposition | 14 | |
| N18-1082 We then derive preposition embeddings via *****tensor decomposition***** on a large unlabeled corpus. | ||
| D19-1207 Furthermore, we novelly introduce Low-Rank HOCA which adopts *****tensor decomposition***** to reduce the extremely large space requirement of HOCA, leading to a practical and efficient implementation in real-world applications. | ||
| 2021.emnlp-main.625 The semantic filter module can be added to most geometric and *****tensor decomposition***** models with minimal additional memory. | ||
| 2020.coling-main.346 In particular, we show how a powerful composition function based on the canonical *****tensor decomposition***** can exploit such a rich structure. | ||
| D19-1408 Specifically, we propose to learn low-rank sentence embeddings by *****tensor decomposition***** to capture their contextual semantic similarity, and use K-nearest neighbors (KNNs) of each sentence in the embedding space to generate sample clusters. | ||
| Natural Language Generation (NLG | 14 | |
| N19-2027 Neural approaches to *****Natural Language Generation (NLG*****) have been promising for goal-oriented dialogue. | ||
| W19-8669 In *****Natural Language Generation (NLG*****), End-to-End (E2E) systems trained through deep learning have recently gained a strong interest. | ||
| P18-3017 *****Natural Language Generation (NLG*****) is a research task which addresses the automatic generation of natural language text representative of an input non-linguistic collection of knowledge. | ||
| W19-8657 Current approaches to *****Natural Language Generation (NLG*****) for dialog mainly focus on domain-specific, task-oriented applications (e.g. | ||
| D19-6313 This paper presents an exploratory study that aims to evaluate the usefulness of back-translation in *****Natural Language Generation (NLG*****) from semantic representations for non-English languages. | ||
| internal | 14 | |
| K18-1036 Neural morphological tagging has been regarded as an extension to the POS tagging task, treating each morphological tag as a monolithic label and ignoring its *****internal***** structure. | ||
| 2021.acl-long.557 Bubble representations were proposed in the formal linguistics literature decades ago; they enhance dependency trees by encoding coordination boundaries and *****internal***** relationships within coordination structures explicitly. | ||
| W17-8005 We assume that unknown words with *****internal***** structure (affixed words or compounds) can provide speakers with linguistic cues as for their meaning, and thus help their decoding and understanding. | ||
| D19-1405 While neural models show remarkable accuracy on individual predictions, their *****internal***** beliefs can be inconsistent across examples. | ||
| 2021.naacl-main.74 The dominant approach in probing neural networks for linguistic properties is to train a new shallow multi-layer perceptron (MLP) on top of the model's *****internal***** representations. | ||
| knowledge graphs (KGs | 14 | |
| P19-1466 The recent proliferation of *****knowledge graphs (KGs*****) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). | ||
| 2020.findings-emnlp.207 Complex node interactions are common in *****knowledge graphs (KGs*****), and these interactions can be considered as contextualized knowledge that exists in the topological structure of KGs. | ||
| D19-1075 Entity alignment aims to find entities in different *****knowledge graphs (KGs*****) that refer to the same real-world object. | ||
| D19-1023 Entity alignment is a viable means for integrating heterogeneous knowledge among different *****knowledge graphs (KGs*****). | ||
| D18-1358 The rapid development of *****knowledge graphs (KGs*****), such as Freebase and WordNet, has changed the paradigm for AI-related applications. | ||
| multi-document | 14 | |
| P19-1098 The most important obstacles facing *****multi-document***** summarization include excessive redundancy in source descriptions and the looming shortage of training data. | ||
| 2021.emnlp-main.778 Multi-text applications, such as *****multi-document***** summarization, are typically required to model redundancies across related texts. | ||
| 2021.naacl-main.54 Allowing users to interact with *****multi-document***** summarizers is a promising direction towards improving and customizing summary results. | ||
| 2020.findings-emnlp.367 Most work on *****multi-document***** summarization has focused on generic summarization of information present in each individual document set. | ||
| Q13-1008 Supervised learning methods and LDA based topic model have been successfully applied in the field of *****multi-document***** summarization. | ||
| Grammatical Error Correction (GEC | 14 | |
| 2020.latechclfl-1.10 *****Grammatical Error Correction (GEC*****) is the task of correcting different types of errors in written texts. | ||
| P18-1127 Metric validation in *****Grammatical Error Correction (GEC*****) is currently done by observing the correlation between human and metric-induced rankings. | ||
| 2021.bea-1.12 *****Grammatical Error Correction (GEC*****) is a task that has been extensively investigated for the English language. | ||
| 2020.bea-1.21 *****Grammatical Error Correction (GEC*****) is concerned with correcting grammatical errors in written text. | ||
| W19-4414 The field of *****Grammatical Error Correction (GEC*****) has produced various systems to deal with focused phenomena or general text editing. | ||
| pre-trained word | 14 | |
| P19-2041 Using *****pre-trained word***** embeddings in conjunction with Deep Learning models has become the de facto approach in Natural Language Processing (NLP). | ||
| 2020.figlang-1.18 Recent work on automatic sequential metaphor detection has involved recurrent neural networks initialized with different *****pre-trained word***** embeddings and which are sometimes combined with hand engineered features. | ||
| 2020.coling-main.149 Prior works investigating the geometry of *****pre-trained word***** embeddings have shown word embeddings to be distributed in a narrow cone, and that by centering and projecting using principal component vectors one can increase the accuracy of a given set of pre-trained word embeddings. | ||
| D18-1176 While one of the first steps in many NLP systems is selecting what *****pre-trained word***** embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. | ||
| 2021.acl-srw.15 In this study, we propose a model that extends the continuous space topic model (CSTM), which flexibly controls word probability in a document, using *****pre-trained word***** embeddings. | ||
| electronic | 14 | |
| W18-3804 This paper shows how a Lexicon-Grammar dictionary of English phrasal verbs (PV) can be transformed into an *****electronic***** dictionary, and with the help of multiple grammars, dictionaries, and filters within the linguistic development environment, NooJ, how to accurately identify PV in large corpora. | ||
| P19-3004 With the increasing democratization of *****electronic***** media, vast information resources are available in less-frequently-taught languages such as Swahili or Somali. | ||
| 2015.lilt-12.6 Literary works are becoming increasingly available in *****electronic***** formats, thus quickly transforming editorial processes and reading habits. | ||
| 2020.lrec-1.395 Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and *****electronic***** lexicography. | ||
| L16-1081 In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of *****electronic***** dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. | ||
| zero-shot | 14 | |
| 2021.acl-long.447 Few-shot crosslingual transfer has been shown to outperform its *****zero-shot***** counterpart with pretrained encoders like multilingual BERT. | ||
| D19-1048 Generative classifiers offer potential advantages over their discriminative counterparts, namely in the areas of data efficiency, robustness to data shift and adversarial examples, and *****zero-shot***** learning (Ng and Jordan, 2002; Yogatama et al., 2017; Lewis and Fan, 2019). | ||
| 2021.emnlp-main.394 We explore the link between the extent to which syntactic relations are preserved in translation and the ease of correctly constructing a parse tree in a *****zero-shot***** setting. | ||
| 2020.emnlp-main.194 We study the *****zero-shot***** transfer capabilities of text matching models on a massive scale, by self-supervised training on 140 source domains from community question answering forums in English. | ||
| D18-1352 Large multi-label datasets contain labels that occur thousands of times (frequent group), those that occur only a few times (few-shot group), and labels that never appear in the training dataset (*****zero-shot***** group). | ||
| Chinese word segmentation (CWS | 14 | |
| 2020.acl-main.735 *****Chinese word segmentation (CWS*****) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, where joint learning of them is an effective one-step solution for both tasks. | ||
| 2020.emnlp-main.317 Taking greedy decoding algorithm as it should be, this work focuses on further strengthening the model itself for *****Chinese word segmentation (CWS*****), which results in an even more fast and more accurate CWS model. | ||
| 2021.acl-demo.12 We present fastHan, an open-source toolkit for four basic tasks in Chinese natural language processing: *****Chinese word segmentation (CWS*****), Part-of-Speech (POS) tagging, named entity recognition (NER), and dependency parsing. | ||
| P17-1110 Different linguistic perspectives cause many diverse segmentation criteria for *****Chinese word segmentation (CWS*****). | ||
| 2020.coling-main.187 *****Chinese word segmentation (CWS*****) and part-of-speech (POS) tagging are two fundamental tasks for Chinese language processing. | ||
| term | 14 | |
| 2001.mtsummit-papers.31 The result shows that bilingual term entries extracted from 2,000 pairs of parallel texts which share a specific domain with the input texts introduce more improvements than a technical *****term***** dictionary with 38,000 entries which covers a broader domain. | ||
| 2020.acl-main.72 We propose a methodology to construct a *****term***** dictionary for text analytics through an interactive process between a human and a machine, which helps the creation of flexible dictionaries with precise granularity required in typical text analysis. | ||
| L06-1142 We present a framework that combines a web-based text acquisition tool, a *****term***** extractor and a two-level workflow management system tailored for facilitating dictionary updates. | ||
| L06-1275 We assume that the term frequency weighted by the types of documents can be an indicator of the *****term***** intelligibility for a certain readership. | ||
| 2020.computerm-1.11 Our contribution is part of a wider research project on *****term***** variation in German and concentrates on the computational aspects of a frame-based model for term meaning representation in the technical field. | ||
| multi-modal | 14 | |
| D17-1114 The rapid increase of the multimedia data over the Internet necessitates *****multi-modal***** summarization from collections of text, image, audio and video. | ||
| D19-1566 In recent times, *****multi-modal***** analysis has been an emerging and highly sought-after field at the intersection of natural language processing, computer vision, and speech processing. | ||
| 2021.maiworkshop-1.5 Large-scale multi-modal classification aims to distinguish between different *****multi-modal***** data, and it has drawn dramatic attention over the last decade. | ||
| 2021.emnlp-main.512 Visual question answering (VQA) is challenging not only because the model has to handle *****multi-modal***** information, but also because it is just so hard to collect sufficient training examples: there are too many questions one can ask about an image. | ||
| D18-1438 Rapid growth of *****multi-modal***** documents on the Internet makes multi-modal summarization research necessary. | ||
| natural language inference (NLI | 14 | |
| D19-6609 Recent Deep Learning (DL) models have succeeded in achieving human-level accuracy on various natural language tasks such as question-answering, *****natural language inference (NLI*****), and textual entailment. | ||
| 2020.lrec-1.846 Many recent studies have shown that for models trained on datasets for *****natural language inference (NLI*****), it is possible to make correct predictions by merely looking at the hypothesis while completely ignoring the premise. | ||
| W19-5049 This paper presents a multi-task learning approach to *****natural language inference (NLI*****) and question entailment (RQE) in the biomedical domain. | ||
| S19-1027 Large crowdsourced datasets are widely used for training and evaluating neural models on *****natural language inference (NLI*****). | ||
| 2021.acl-short.99 The general format of *****natural language inference (NLI*****) makes it tempting to be used for zero-shot text classification by casting any target label into a sentence of hypothesis and verifying whether or not it could be entailed by the input, aiming at generic classification applicable on any specified label space. | ||
| Dutch | 14 | |
| L12-1396 In this paper we present the first corpus where one million *****Dutch***** words from a variety of text genres have been annotated with semantic roles. | ||
| 2020.peoples-1.9 The paper focuses on a large collection of *****Dutch***** tweets from the Netherlands to get an insight into the perception and reactions of users during the early months of the COVID-19 pandemic. | ||
| W18-6508 This paper presents SimpleNLG-NL, an adaptation of the SimpleNLG surface realisation engine for the *****Dutch***** language. | ||
| L10-1179 A corpus called DutchParl is created which aims to contain all digitally available parliamentary documents written in the *****Dutch***** language. | ||
| L06-1161 This paper discusses the parameterized Equivalence Class Method for Dutch, an approach developed to incorporate standard lexical representations for *****Dutch***** idioms into representations required by any specific NLP system with as minimal manual work as possible. | ||
| Universal Dependencies | 14 | |
| 2020.udw-1.1 We present the first *****Universal Dependencies***** treebank for Hittite. | ||
| 2020.udw-1.21 This paper presents the first treebank for the Laz language, which is also the first *****Universal Dependencies***** Treebank for a South Caucasian language. | ||
| 2021.americasnlp-1.6 We study the performance of several popular neural part-of-speech taggers from the *****Universal Dependencies***** ecosystem on Mayan languages using a small corpus of 1435 annotated K'iche' sentences consisting of approximately 10,000 tokens, with encouraging results: F_1 scores 93%+ on lemmatisation, part-of-speech and morphological feature assignment. | ||
| W17-1406 This paper introduces the *****Universal Dependencies***** Treebank for Slovenian. | ||
| 2020.udw-1.18 In this paper we present a method for identifying and analyzing adnominal possessive constructions in 66 *****Universal Dependencies***** treebanks. | ||
| unknown | 14 | |
| L06-1333 Usually a high portion of the different word forms in a corpus receive no reading by the lexical and/or morphological analysis. These *****unknown***** words constitute a huge problem for NLP analysis tasks like POS-tagging or syntactic parsing. | ||
| W17-4117 We propose a new type of subword embedding designed to provide more information about *****unknown***** compounds, a major source for OOV words in German. | ||
| W17-8005 We assume that *****unknown***** words with internal structure (affixed words or compounds) can provide speakers with linguistic cues as for their meaning, and thus help their decoding and understanding. | ||
| 2020.lrec-1.211 To assess the robustness of NER systems, we propose an evaluation method that focuses on subsets of tokens that represent specific sources of errors: *****unknown***** words and label shift or ambiguity. | ||
| L08-1487 A new approach to handle *****unknown***** words in machine translation is presented. | ||
| pipeline | 14 | |
| 2021.bea-1.22 Argument mining is often addressed by a *****pipeline***** method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task. | ||
| 2020.emnlp-main.388 Morphologically rich languages seem to benefit from joint processing of morphology and syntax, as compared to *****pipeline***** architectures. | ||
| 2020.lrec-1.556 Named entity recognition (NER) from speech is usually made through a *****pipeline***** process that consists in (i) processing audio using an automatic speech recognition system (ASR) and (ii) applying a NER to the ASR outputs. | ||
| 2021.iwpt-1.19 This year the official evaluation metric was ELAS, therefore dependency parsing might have been avoided as well as other *****pipeline***** stages like POS tagging and lemmatization. | ||
| P19-3011 We present ConvLab, an open-source multi-domain end-to-end dialog system platform, that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional *****pipeline***** systems to end-to-end neural models, in common environments. | ||
| Question Answering (QA | 14 | |
| 2020.coling-main.231 The structural information of Knowledge Bases (KBs) has proven effective to *****Question Answering (QA*****). | ||
| 2021.emnlp-main.301 Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as *****Question Answering (QA*****) and Fact Verification. | ||
| 2020.sltu-1.49 Dense word vectors or 'word embeddings', which encode semantic properties of words, have now become integral to NLP tasks like Machine Translation (MT), *****Question Answering (QA*****), Word Sense Disambiguation (WSD), and Information Retrieval (IR). | ||
| 2010.jeptalnrecital-court.36 Question Generation (QG) and *****Question Answering (QA*****) are some of the many challenges for natural language understanding and interfaces. | ||
| L08-1252 Discovering relations among Named Entities (NEs) from large corpora is both a challenging, as well as useful task in the domain of Natural Language Processing, with applications in Information Retrieval (IR), Summarization (SUM), *****Question Answering (QA*****) and Textual Entailment (TE). | ||
| grouping | 13 | |
| E17-2070 To measure quality of Latent Dirichlet Allocation (LDA) based topics learned from text, we propose a novel approach based on ***** grouping ***** of topic words into buckets (TBuckets). | ||
| 2021.acl-long.364 We investigate: how to best encode mentions, which clustering algorithms are most effective for ***** grouping ***** mentions, how models transfer to different domains, and how bounding the number of mentions tracked during inference impacts performance. | ||
| L08-1070 Our experiment shows that the tool is capable of ***** grouping ***** many related terms using their definitions. | ||
| 2020.findings-emnlp.176 In this work, we focus on reporting abnormal findings on radiology images; instead of training on complete radiology reports, we propose a method to identify abnormal findings from the reports in addition to ***** grouping ***** them with unsupervised clustering and minimal rules. | ||
| E17-3011 Distributing papers into sessions in scientific conferences is a task consisting in ***** grouping ***** papers with common topics and considering the size restrictions imposed by the conference schedule. | ||
| compilation | 13 | |
| L04-1171 We focus on the treebanking task as a trigger for basic language resources ***** compilation *****. | ||
| 2012.freeopmt-1.2 Inline documentation and a library browser facilitate the use of existing resource libraries, and ***** compilation ***** and testing of grammars is greatly improved through single-click launch configurations and an in-built test case manager for running treebank regression tests. | ||
| W18-0535 This paper describes the collection and ***** compilation ***** of the OneStopEnglish corpus of texts written at three reading levels, and demonstrates its usefulness through two applications - automatic readability assessment and automatic text simplification. | ||
| 1995.iwpt-1.9 In this paper, we describe the fundamental data structures and ***** compilation ***** techniques that we have employed to develop a unification and constraint-resolution engine capable of performance rivaling that of directly compiled Prolog terms while greatly exceeding Prolog in flexibility, expressiveness and modularity. | ||
| L12-1334 We focus on building representative parallel corpora which include a diversity of domains and genres, reflect the relations between Bulgarian and other languages and are consistent in terms of ***** compilation ***** methodology, text representation, metadata description and annotation conventions. | ||
| triage | 13 | |
| W18-0606 The large increase in the number of forum users makes the task of the moderators unmanageable without the help of automatic ***** triage ***** systems. | ||
| 2020.coling-main.61 However, for case management and referral to psychiatrists, health-care workers require practical and scalable depressive disorder screening and ***** triage ***** system. | ||
| 2021.bionlp-1.19 document ***** triage *****) and biomedical expression OCR. | ||
| E17-2114 We formalize this as an instance of linear feature-based IR, demonstrating a 34%-43% improvement in recall for candidate ***** triage ***** for QA. | ||
| D19-3015 We describe HARE, a system for highlighting relevant information in document collections to support ranking and ***** triage *****, which provides tools for post-processing and qualitative analysis for model development and tuning. | ||
| specifying | 13 | |
| 2005.mtsummit-osmtw.4 The machine translation toolbox, which will most likely be released under a GPL-like license includes (a) the open-source engine itself, a modular shallow-transfer machine translation engine suitable for related languages and largely based upon that of systems we have already developed, such as interNOSTRUM for Spanish—Catalan and Traductor Universia for Spanish—Portuguese, (b) extensive documentation (including document type declarations) ***** specifying ***** the XML format of all linguistic (dictionaries, rules) and document format management files, (c) compilers converting these data into the high-speed (tens of thousands of words a second) format used by the engine, and (d) pilot linguistic data for Spanish—Catalan and Spanish—Galician and format management specifications for the HTML, RTF and plain text formats. | ||
| D19-1124 Therefore, controllable responses can be generated through ***** specifying ***** the value of each dimension of the latent variable. | ||
| 2011.iwslt-evaluation.1 This paper provides an overview of the IWSLT 2011 Evaluation Campaign, which includes: descriptions of the supplied data and evaluation specifications of each track, the list of participants ***** specifying ***** their submitted runs, a detailed description of the subjective evaluation carried out, the main findings of each exercise drawn from the results and the system descriptions prepared by the participants, and, finally, several detailed tables reporting all the evaluation results. | ||
| W19-3106 Reducts are formed over tests and non-tests alike, ***** specifying ***** what is observable. | ||
| C16-2033 The API takes as input an HTTP GET request ***** specifying ***** a valence pattern and outputs a list of exemplifying annotated sentences in JSON format. | ||
| alternatively | 13 | |
| 2021.sigdial-1.48 For multi-turn question-answering live chat, typical Question Answering systems are single-turn and focus on factoid questions; ***** alternatively *****, modeling as goal-oriented dialogue limits us to narrower domains. | ||
| 2020.findings-emnlp.316 Finally, we employ the Gumbel-Softmax estimator to ***** alternatively ***** train the dialogue agent and the dialogue reward model without using reinforcement learning. | ||
| D17-1217 We adopt a hierarchical architecture to represent both word level and sentence level information, and use the attention operations for aspect questions and documents ***** alternatively ***** with the multiple hop mechanism. | ||
| 2020.acl-main.52 To alleviate these problems, we propose a novel framework named Curriculum Dual Learning (CDL) which extends the emotion-controllable response generation to a dual task to generate emotional responses and emotional queries ***** alternatively *****. | ||
| 2020.findings-emnlp.209 However, to ***** alternatively ***** update the dialogue policy and the reward model on the fly, we are limited to policy-gradient-based algorithms, such as REINFORCE and PPO | ||
| rephrase | 13 | |
| 2020.emnlp-main.414 For example, for queries like `ask my wife if she can pick up the kids' or `remind me to take my pills', we need to ***** rephrase ***** the content to `can you pick up the kids' and `take your pills'. | ||
| 2020.lrec-1.222 Since these models ***** rephrase ***** text and thus use similar but different words as found in the summarized text, existing metrics such as ROUGE that use n-gram overlap may not be optimal. | ||
| W17-3003 This paper proposes a system that can detect and ***** rephrase ***** profanity in Chinese text. | ||
| D18-1080 Split and ***** rephrase ***** is the task of breaking down a sentence into shorter ones that together convey the same meaning. | ||
| D17-1220 Like text simplification, a goal of ALA is to ***** rephrase ***** the original text in a more easily understandable manner | ||
| decomposing | 13 | |
| 2020.emnlp-main.580 We investigate passage-based briefs, containing a relevant passage from Wikipedia, entity-centric ones consisting of Wikipedia pages of mentioned entities, and Question-Answering Briefs, with questions ***** decomposing ***** the claim, and their answers. | ||
| L06-1494 In this paper we present three methods for ***** decomposing ***** complex questions and we evaluate their impact on the responsiveness of the answers they enable. | ||
| 2014.lilt-9.4 Pursuing this direction, we are convinced that crucial progress will derive from a focus on ***** decomposing ***** the complexity of the TE task into basic phenomena and on their combination. | ||
| 2021.acl-demo.4 Meanwhile, our library maintains sufficient modularity and extensibility by properly ***** decomposing ***** the model architecture, inference, and learning process into highly reusable modules, which allows users to easily incorporate new models into our framework. | ||
| 2021.emnlp-main.678 In this paper we propose to address policy compliance detection via ***** decomposing ***** it into question answering, where questions check whether the conditions stated in the policy apply to the scenario, and an expression tree combines the answers to obtain the label | ||
| singular | 13 | |
| W16-3714 Mismatched crowdsourcing for multilingual channels has certain properties of projection mapping, e.g., it can be interpreted as a clustering based on ***** singular ***** value decomposition of the segment alignments. | ||
| D19-5549 Distinguishing between ***** singular ***** and plural “you” in English is a challenging task which has potential for downstream applications, such as machine translation or coreference resolution. | ||
| W18-5443 We compute the gradients of the states with respect to the input embeddings and decompose the gradient matrix with Singular Value Decomposition to analyze which directions in the embedding space are best transferred to the hidden state space, characterized by the largest ***** singular ***** values. | ||
| W19-6121 Experiments are performed using conditionality and bi-directionality for these models, and using either ***** singular ***** word embeddings or averaged word embeddings for an entire quote, to determine the optimal model design. | ||
| 2020.starsem-1.11 The identified, ***** singular ***** points correspond to polysemous words, i.e. words with multiple meanings | ||
| ngram | 13 | |
| 2020.semeval-1.266 Our experiments indicate that char ***** ngram ***** features are more helpful than word ***** ngram ***** features. | ||
| W16-4802 We submitted results of a single system based on support vector machines (SVM) with linear kernel and using character ***** ngram ***** features, which obtained the first rank at the closed training track for test set A. Besides the linear SVM, we also report additional experiments with a number of deep learning architectures. | ||
| W16-4823 Like previous years, we relied on character ***** ngram ***** features, and a mixture of discriminative and generative statistical classifiers. | ||
| D17-1023 We also demonstrate that the trained ***** ngram ***** representations are useful in many aspects such as finding antonyms and collocations. | ||
| E17-1055 We investigate state-of-the-art learning methods on each level and find large differences, e.g., for deep learning models, traditional ***** ngram ***** features and the subword model of fasttext (Bojanowski et al., 2016) on the character level; for word2vec (Mikolov et al., 2013) on the word level; and for the order-aware model wang2vec (Ling et al., 2015a) on the entity level | ||
| redundancy | 13 | |
| L08-1023 Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (amongst others) ***** redundancy ***** and consistency issues. | ||
| R19-1097 Our approach is to train text-to-text rewriting models to correct information ***** redundancy ***** errors that may arise during summarization. | ||
| D19-5622 However, there is no explicit mechanism to ensure that different attention heads indeed capture different features, and in practice, ***** redundancy ***** has occurred in multiple heads. | ||
| 2021.naacl-main.72 Using BERT-base model as an example, this paper provides a comprehensive study on attention ***** redundancy ***** which is helpful for model interpretation and model compression. | ||
| 2020.coling-main.462 We first analyze the extent of information ***** redundancy ***** present in the outputs generated by a baseline model trained using maximum likelihood estimation (MLE) | ||
| symmetric | 13 | |
| 2002.amta-papers.9 A major requirement of these approaches is the accessibility of large amounts of explicit ***** symmetric ***** knowledge for both source and target languages. | ||
| D18-1523 Beyond being more accurate, the use of the recurrent LM allows us to effectively query it in a creative way, using what we call dynamic ***** symmetric ***** patterns. | ||
| D17-1303 We introduce a new pairwise ranking loss function which can handle both ***** symmetric ***** and a*****symmetric***** similarity between the two modalities. | ||
| 2020.coling-main.44 Furthermore, to deal with ***** symmetric ***** and anti*****symmetric***** relations, two schemas of score function are designed via a position-adaptive mechanism. | ||
| E17-2056 Re-ranking (APE_Rerank) of the n-best translations from the phrase-based APE and APE_Sym systems provides further substantial improvements over the ***** symmetric ***** neural APE model | ||
| SoTA | 13 | |
| 2020.acl-main.577 We show that the model works well for both nested and flat NER through evaluation on 8 corpora and achieving ***** SoTA ***** performance on all of them, with accuracy gains of up to 2.2 percentage points. | ||
| 2021.acl-long.154 UXLA achieves ***** SoTA ***** results in all the tasks, outperforming the baselines by a good margin. | ||
| 2020.coling-main.565 Our model also achieves a comparable performance to the ***** SoTA ***** on the TACRED dataset. | ||
| 2021.acl-long.451 Our proposed framework is easy-to-implement and achieves state-of-the-art (***** SoTA *****) or near ***** SoTA ***** performance on eight English NER datasets, including two flat NER datasets, three nested NER datasets, and three discontinuous NER datasets. | ||
| 2021.naacl-demos.3 Equipped with techniques including data augmentation and multitasking, we show that the proposed framework outperforms the previous ***** SoTA ***** on CCKS CKBQA dataset | ||
| WAT 2021 | 13 | |
| 2021.wat-1.19 In this paper, we propose our system under the team name Volta for the Multimodal Translation Task of ***** WAT 2021 ***** from English to Hindi. | ||
| 2021.wat-1.20 The experimental results evaluated on the BLEU metric provided by the ***** WAT 2021 ***** evaluation site show that the TMEKU system has achieved the best performance among all the participated systems. | ||
| 2021.wat-1.16 This paper provides the description of shared tasks to the ***** WAT 2021 ***** by our team “NLPHut”. | ||
| 2021.wat-1.3 This paper describes our systems that were submitted to the restricted translation task at ***** WAT 2021 *****. | ||
| 2021.wat-1.28 In this paper, we present the details of the systems that we have submitted for the ***** WAT 2021 ***** MultiIndicMT: An Indic Language Multilingual Task. | ||
| Word2vec | 13 | |
| 2020.coling-main.20 To date, most of the approaches to addressing the problem have relied on hand-crafted affect features, or pre-trained models of non-contextual word embeddings, such as ***** Word2vec *****. | ||
| 2020.winlp-1.13 We demonstrate that ***** Word2vec ***** and nouns-only dimensionality reductions are the most successful and stable vector space reduction variants for our task. | ||
| N18-1043 While some methods represent words as vectors computed from text using predictive model (***** Word2vec *****) or dense count based model (GloVe), others attempt to represent these in a distributional thesaurus network structure where the neighborhood of a word is a set of words having adequate context overlap. | ||
| 2020.lrec-1.310 We present a new word analogy test set considering the original English ***** Word2vec ***** analogy test set and some specific linguistic aspects of the Greek language as well. | ||
| S18-1140 For subtask 1 We experimented with two category of word embeddings namely native embeddings and task specific embedding using ***** Word2vec ***** and Glove algorithms | ||
| compatibility | 13 | |
| L12-1601 Although the spoken versions of Tajik and Farsi are mutually intelligible to educated speakers of both languages, the difference between the writing systems constitutes a barrier to text ***** compatibility ***** between the two languages. | ||
| P18-1142 Our results show the effectiveness of this method for both machine translation and cross-lingual sentence similarity, demonstrating the importance of syntactic structure ***** compatibility ***** for boosting cross-lingual transfer in NLP. | ||
| D19-1273 Our system incrementally evaluates each event's ***** compatibility ***** with already selected events, taking order into account. | ||
| N19-1085 In this work, we propose a transfer learning framework for event coreference resolution that utilizes a large amount of unlabeled data to learn argument ***** compatibility ***** of event mentions. | ||
| 2020.dmr-1.8 We propose an approach and a software framework for semantic parsing of natural language sentences to discourse representation structures with use of fuzzy meaning representations such as fuzzy sets and ***** compatibility ***** intervals. | ||
| deductive | 13 | |
| D18-1361 The 20 Questions (Q20) game is a well known game which encourages ***** deductive ***** reasoning and creativity. | ||
| L08-1059 This paper describes ***** deductive ***** feature detection, one component of a data selection system for machine translation. | ||
| 2020.findings-emnlp.145 In contrast, humans are typically able to generalize with only a few examples, relying on deeper underlying world knowledge, linguistic sophistication, and/or simply superior ***** deductive ***** powers. | ||
| 2021.emnlp-main.378 Specifically, LRN utilizes an auto-regressive network to conduct ***** deductive ***** reasoning and a bipartite attribute graph to conduct inductive reasoning between labels, which can effectively model, learn and reason complex label dependencies in a sequence-to-set, end-to-end manner. | ||
| 2020.findings-emnlp.100 Once these clues are transformed into formal logic, a ***** deductive ***** reasoning process provides the solution | ||
| nominalizations | 13 | |
| 2020.coling-main.274 In addition, we train a baseline QANom parser for identifying ***** nominalizations ***** and labeling their arguments with question-answer pairs. | ||
| W17-3014 We first collect and analyze a corpus of hand-curated, expert-annotated pejorative ***** nominalizations ***** for four target adjectives: female, gay, illegal, and poor. | ||
| L14-1025 NomLex-PT connects verbs to their ***** nominalizations *****, thereby enabling NLP systems to observe the potential semantic relationships between the two words when analysing a text. | ||
| L10-1327 Starting from the constitution of monolingual comparable corpora, we extract two kinds of paraphrases: paraphrases between ***** nominalizations ***** and verbal constructions and paraphrases between neo-classical compounds and modern-language phrases | ||
| L14-1061 Semantic graph corpora have spurred recent interest in graph transduction formalisms, but it is not yet clear whether such formalisms are a good fit for natural language data, in particular, for describing how semantic reentrancies correspond to English pronouns, zero pronouns, reflexives, passives, ***** nominalizations *****, etc. | ||
| polyglot | 13 | |
| K19-1029 Furthermore, we examine the non-contextual part of the learned language models (which we call a “decontextual probe”) to demonstrate that ***** polyglot ***** language models better encode crosslingual lexical correspondence compared to aligned monolingual language models. | ||
| 2020.acl-main.720 To explain this phenomena, we explore the sources of multilingual transfer in ***** polyglot ***** NER models and examine the weight structure of ***** polyglot ***** models compared to their monolingual counterparts. | ||
| D19-6118 We find that ***** polyglot ***** training on the source languages produces an overall trend of better results on the target language but the single best result for the target language is obtained by projecting from monolingual source parsing models and then training multi-treebank POS tagging and parsing models on the target side. | ||
| 2020.findings-emnlp.279 In fact, even a simple combination of data has been shown to be effective with ***** polyglot ***** training by representing the distant vocabularies in a shared representation space | ||
| 2021.sigtyp-1.5 UDify is the state-of-the-art language-agnostic dependency parser which is trained on a ***** polyglot ***** corpus of 75 languages. | ||
| correctly | 13 | |
| S19-2132 The misclassified examples are classified ***** correctly ***** in the second level. | ||
| D17-1189 To this end, we introduce an entity-pair level denoise method which exploits semantic information from ***** correctly ***** labeled entity pairs to correct wrong labels dynamically during training. | ||
| P17-1187 The results indicate that WRL can benefit from sememes via the attention scheme, and also confirm our models being capable of ***** correctly ***** modeling sememe information. | ||
| C18-1082 Furthermore, results from the first study showed that tailoring was accurately recognized in most cases, and that participants struggled with ***** correctly ***** identifying whether a text was written by a human or computer. | ||
| L14-1283 We then run a user experiment with a ***** correctly ***** designed tool that demonstrates that very reliable results can be obtained with this method | ||
| hypothesized | 13 | |
| W18-2602 Through an iterative process, challenging aspects were ***** hypothesized ***** through qualitative analysis of the common error cases. | ||
| C16-1026 In order to generate connected parses for such unfinished sentences, upcoming word types can be ***** hypothesized ***** and structurally integrated with already realized words. | ||
| 2014.amta-wptp.3 (2014) ***** hypothesized ***** that post-editing success may be more pronounced when the monolingual post-editors are experts in the domain of the translated documents. | ||
| 2021.blackboxnlp-1.4 Unlike scoring-based methods for targeted syntactic evaluation, this technique makes it possible to explore completions that are not ***** hypothesized ***** in advance by the researcher. | ||
| 2013.iwslt-papers.15 We show that the second stage works better when the rule hypotheses have categories than when they do not, and that the proposed conditional description length approach combines the rules ***** hypothesized ***** by the two stages better than a mixture model does | ||
| SP | 13 | |
| C16-1266 Experimental results demonstrate that the proposed C*****SP***** model successfully learns C*****SP***** and outperforms the conventional ***** SP ***** model in coreference cluster ranking. | ||
| N18-1066 To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and apply this method to two other benchmark ***** SP ***** tasks. | ||
| P19-1071 To provide a better evaluation method for ***** SP ***** models, we introduce ***** SP *****-10K, a large-scale evaluation set that provides human ratings for the plausibility of 10,000 ***** SP ***** pairs over five ***** SP ***** relations, covering 2,500 most frequent verbs, nouns, and adjectives in American English. | ||
| 2021.repl4nlp-1.22 Learning ***** SP ***** has generally been seen as a supervised task, because it requires a parsed corpus as a source of syntactically related word pairs. | ||
| 2021.wat-1.21 We find that ***** SP ***** is the overall best choice for segmentation, and that larger dictionary sizes lead to higher translation quality | ||
| prevalent | 13 | |
| 2021.naacl-main.188 While effective and ***** prevalent *****, these models are usually prohibitively large for resource-limited deployment scenarios. | ||
| D19-1335 Our protocols account for several properties ***** prevalent ***** in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty. | ||
| 2020.findings-emnlp.70 To manage the multi-mapping relations ***** prevalent ***** in human conversation, we augment contrastive dialogue learning with group-wise dual sampling. | ||
| P18-1043 In addition, we demonstrate how commonsense inference on people's intents and reactions can help unveil the implicit gender inequality ***** prevalent ***** in modern movie scripts. | ||
| N19-3018 The #MeToo movement is an ongoing ***** prevalent ***** phenomenon on social media aiming to demonstrate the frequency and widespread of sexual harassment by providing a platform to speak narrate personal experiences of such harassment | ||
| Semeval | 13 | |
| 2020.semeval-1.223 In this work we describe and analyze a supervised learning system for word emphasis selection in phrases drawn from visual media as a part of the ***** Semeval ***** 2020 Shared Task 10. | ||
| S19-2082 We present a number of models used for hate speech detection for ***** Semeval ***** 2019 Task-5: Hateval. | ||
| S19-2156 This paper describes DM-NLP's system for toponym resolution task at ***** Semeval ***** 2019. | ||
| 2020.semeval-1.207 This paper presents our contribution to the Offensive Language Classification Task (English SubTask A) of ***** Semeval ***** 2020. | ||
| S17-2011 This paper describes our system, entitled Idiom Savant, for the 7th Task of the ***** Semeval ***** 2017 workshop, “Detection and interpretation of English Puns” | ||
| additive | 13 | |
| 2019.iwslt-1.6 Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and ***** additive ***** deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the ***** additive ***** deliberation, as shown by the incongruence analysis. | ||
| D19-1158 We present a model of color modifiers that, compared with previous ***** additive ***** models in RGB space, learns more complex transformations. | ||
| 2021.starsem-1.22 Due to the approaches' ***** additive ***** effects, their combination decreases the cross-lingual transfer gap by 8.9 points (m-BERT) and 18.2 points (XLM-R) on average across all tasks and languages, however. | ||
| 2020.clinicalnlp-1.20 The proposed combination of TAScore and Fusedmax projection achieves a 10 point increase in Longest Common Substring F1 compared to the baseline of ***** additive ***** scoring plus softmax projection | ||
| 2020.findings-emnlp.129 We introduce exBERT, a training method to extend BERT pre-trained models from a general domain to a new pre-trained model for a specific domain with a new ***** additive ***** vocabulary under constrained training resources (i.e., constrained computation and data). | ||
| cuneiform | 13 | |
| W19-1402 The deep neural network achieved 77% accuracy on the test data, which turned out to be the best performance at the CLI evaluation, establishing a new state-of-the-art for ***** cuneiform ***** language identification. | ||
| 2020.lrec-1.433 Because ***** cuneiform ***** text does not mark the inflection for logograms, the inflected form needs to be inferred from the sentence context. | ||
| W19-1409 This article introduces a corpus of ***** cuneiform ***** texts from which the dataset for the use of the Cuneiform Language Identification (CLI) 2019 shared task was derived as well as some preliminary language identification experiments conducted using that corpus. | ||
| L16-1642 To our best knowledge, this is the first study of this kind applied to either the Akkadian language or the ***** cuneiform ***** writing system | ||
| W19-1420 Identification of the languages written using ***** cuneiform ***** symbols is a difficult task due to the lack of resources and the problem of tokenization. | ||
| Emoji | 13 | |
| W18-6230 In this paper, we successfully show that features extracted using multiple pre-trained embeddings can be used to improve the overall performance of the system with ***** Emoji ***** being one of the significant features. | ||
| S18-1022 Our team KDE-AFFECT employs several methods including one-dimensional Convolutional Neural Network for n-grams, together with word embedding and other preprocessing such as vocabulary unification and ***** Emoji ***** conversion into four emotional words. | ||
| 2021.wanlp-1.7 ***** Emoji ***** (the popular digital pictograms) are sometimes seen as a new kind of artificial and universally usable and consistent writing code | ||
| S18-1066 This paper presents our single model to Subtask 1 of SemEval 2018 Task 2: ***** Emoji ***** Prediction in English. | ||
| S18-1080 We present the system built for SemEval-2018 Task 2 on ***** Emoji ***** Prediction. | ||
| incorrect | 13 | |
| D19-1644 In this work, we propose a neural two-stage approach to recognizing discontiguous and overlapping entities by decomposing this problem into two subtasks: 1) it first detects all the overlapping spans that either form entities on their own or present as segments of discontiguous entities, based on the representation of segmental hypergraph, 2) next it learns to combine these segments into discontiguous entities with a classifier, which filters out other ***** incorrect ***** combinations of segments. | ||
| 2020.wmt-1.77 Finally, we investigate whether we can use automatic metrics to flag ***** incorrect ***** human ratings. | ||
| W17-0911 The classifier is trained to distinguish correct story endings given in the training data from ***** incorrect ***** ones that we artificially generate. | ||
| 1998.amta-papers.20 The preliminary results show that correct stored sentences can be retrieved based on the words contained in the ***** incorrect ***** input sentence. | ||
| 2020.coling-main.179 The lack of modeling for table structure and (3) improving text fidelity with less ***** incorrect ***** expressions that are contradicting to the table | ||
| capitalization | 13 | |
| W18-1102 Yet we find that both genders use ***** capitalization ***** in a similar way when expressing sentiment. | ||
| K19-1012 Our technical contributions include ways of handling large vocabularies, algorithms to correct ***** capitalization ***** errors in user data, and efficient finite state transducer algorithms to convert word language models to word-piece language models and vice versa. | ||
| D19-1650 For those languages which use it, ***** capitalization ***** is an important signal for the fundamental NLP tasks of Named Entity Recognition (NER) and Part of Speech (POS) tagging. | ||
| L14-1192 NER on microblogs presents many complications such as informality of language, shortened named entities, brevity of expressions, and inconsistent ***** capitalization ***** (for cased languages). | ||
| L12-1045 Step 1 extracts contexts of training examples as rules describing this category from text, considering part of speech, ***** capitalization ***** and category membership as features | ||
| SuperGLUE | 13 | |
| 2020.sustainlp-1.24 We describe the SustaiNLP 2020 shared task: efficient inference on the ***** SuperGLUE ***** benchmark (Wang et al., 2019). | ||
| 2020.emnlp-main.381 For the first time, a benchmark of nine tasks, collected and organized analogically to the ***** SuperGLUE ***** methodology, was developed from scratch for the Russian language. | ||
| 2020.sustainlp-1.20 Applying the proposed recipes to the ***** SuperGLUE ***** benchmark, we achieve from 9.8x up to 233.9x speed-up compared to out-of-the-box models on CPU. | ||
| 2021.emnlp-main.407 Recently, pre-trained language models (LMs) have achieved strong performance when fine-tuned on difficult benchmarks like ***** SuperGLUE *****. | ||
| 2021.acl-long.381 Our method, initialized with task-specific human-readable prompts, also works in a few-shot setting, outperforming GPT-3 on two ***** SuperGLUE ***** tasks with just 32 training samples | ||
| algebraic | 13 | |
| 2000.iwpt-1.24 Furthermore, it allows to derive parsing schemata for linear indexed grammars (LIG) from parsing schemata for context-free grammars by means of a correctness preserving ***** algebraic ***** transformation. | ||
| W17-2628 We investigate the pertinence of methods from ***** algebraic ***** topology for text data analysis. | ||
| Q16-1028 It also helps explain why low-dimensional semantic embeddings contain linear ***** algebraic ***** structure that allows solution of word analogies, as shown by Mikolov et al. | ||
| 2020.coling-main.321 A new set of word vectors is generated by a spectral decomposition of the similarity matrix, which has a linear ***** algebraic ***** analytic form. | ||
| P19-3028 Parallax allows the user to use both state-of-the-art embedding analysis methods (PCA and t-SNE) and a simple yet effective task-oriented approach where users can explicitly define the axes of the projection through ***** algebraic ***** formulae | ||
| Author | 13 | |
| 2021.dravidianlangtech-1.22 Title: JudithJeyafreedaAndrew@DravidianLangTech-EACL2021: Offensive language detection for Dravidian Code-mixed YouTube comments ***** Author *****: Judith Jeyafreeda Andrew Messaging online has become one of the major ways of communication. | ||
| W17-1205 ***** Author ***** profiling is the study of how language is shared by people, a problem of growing importance in applications dealing with security, in order to understand who could be behind an anonymous threat message, and marketing, where companies may be interested in knowing the demographics of people that in online reviews liked or disliked their products. | ||
| 2020.starsem-1.19 ***** Author ***** obfuscation is the task of masking the author of a piece of text, with applications in privacy. | ||
| W19-4023 ***** Author ***** profiling is the identification of an author's gender, age, and language from his/her texts. | ||
| 2020.restup-1.1 ***** Author ***** profiling studies how language is shared by people. | ||
| given | 13 | |
| P19-1497 As contents are highly limited by ***** given ***** answers, these questions are often not worth discussing. | ||
| 2021.acl-long.34 The relevance score is computed between the pseudo reference built from the source document and the ***** given ***** summary, where the pseudo reference content is weighted by the sentence centrality to provide importance guidance. | ||
| W18-1705 Motivated by the document-term co-clustering framework by Dhillon (2001), we propose a landmark-based scalable spectral clustering approach in which we first use the selected landmark set and the ***** given ***** data to form a bipartite graph and then run a diffusion process on it to obtain a family of diffusion coordinates for clustering. | ||
| W19-0419 In contrast, here we seek to establish whether this knowledge can be acquired automatically by a neural network system through a two phase training procedure: A (slow) offline learning stage where the network learns about the general structure of the task and a (fast) online adaptation phase where the network learns the language of a new ***** given ***** speaker. | ||
| W19-4204 First, a lemma is produced for a ***** given ***** word, and then both the lemma and the ***** given ***** word are used for morphological analysis | ||
| mention | 13 | |
| D19-1650 Finally, we show that our proposed solution gives an 8% F1 improvement in ***** mention ***** detection on noisy out-of-domain Twitter data. | ||
| U19-1020 We examine the benefit of MTL for three specific pairs of health informatics tasks that deal with: (a) overlapping symptoms for the same classification problem (personal health ***** mention ***** classification for influenza and for a set of symptoms); (b) overlapping medical concepts for related classification problems (vaccine usage and drug usage detection); and, (c) related classification problems (vaccination intent and vaccination relevance detection). | ||
| W19-5015 We compare the two kinds of representations (word versus context) for three classification problems: influenza infection classification, drug usage classification and personal health ***** mention ***** classification. | ||
| P19-1108 The introduction of figurative usage detection results in an average improvement of 2.21% F-score of personal health ***** mention ***** detection, in the case of the feature augmentation-based approach. | ||
| 2020.lrec-1.1 The first approach is based on the ***** mention ***** detection part of a state of the art coreference resolution system; the second uses ELMO embeddings together with a bidirectional LSTM and a biaffine classifier; the third approach uses the recently introduced BERT model | ||
| diverse | 13 | |
| L10-1372 However, text cluster content is often ***** diverse *****. | ||
| D17-1269 Further, when Wikipedia is available in the target language, our method can enhance Wikipedia based methods to yield state-of-the-art NER results; we evaluate on 7 ***** diverse ***** languages, improving the state-of-the-art by an average of 5.5% F1 points. | ||
| D19-1100 Our approach improves upon current state-of-the-art methods for cross-lingual named entity recognition on 5 ***** diverse ***** languages by an average of 4.1 points. | ||
| 2020.emnlp-main.528 MOCHA contains 40K human judgement scores on model outputs from 6 ***** diverse ***** question answering datasets and an additional set of minimal pairs for evaluation. | ||
| N19-1245 Due to annotation challenges, current datasets in this domain have been either relatively small in scale or did not offer precise operational annotations over ***** diverse ***** problem types | ||
| robust | 13 | |
| 2020.emnlp-main.650 Diverse data is crucial for training ***** robust ***** models, but crowdsourced text often lacks diversity as workers tend to write simple variations from prompts. | ||
| D19-1334 Most previous methods assume that every relation in KGs has enough triples for training, regardless of those few-shot relations which cannot provide sufficient triples for training ***** robust ***** reasoning models. | ||
| 2020.lrec-1.780 For many real-world applications, there is a lack of sufficient data that can be directly used for training ***** robust ***** speech recognition systems. | ||
| 2021.emnlp-main.126 In this work, we propose a learning strategy for training ***** robust ***** models by drawing connections between adversarial examples and the failure cases of zero-shot cross-lingual transfer. | ||
| N18-3016 Although our dataset is much larger than those currently available, it is small on the scale of datasets commonly used for training ***** robust ***** neural network models | ||
| Neural abstractive summarization | 13 | |
| N18-2097 ***** Neural abstractive summarization ***** models have led to promising results in summarizing relatively short documents. | ||
| 2021.naacl-main.384 ***** Neural abstractive summarization ***** models are flexible and can produce coherent summaries, but they are sometimes unfaithful and can be difficult to control. | ||
| W19-8664 ***** Neural abstractive summarization ***** models have been successful in generating fluent and consistent summaries with advancements like the copy (Pointer-generator) and coverage mechanisms. | ||
| 2020.acl-main.458 ***** Neural abstractive summarization ***** models are able to generate summaries which have high overlap with human references. | ||
| 2020.emnlp-main.506 ***** Neural abstractive summarization ***** systems have achieved promising progress, thanks to the availability of large-scale datasets and models pre-trained with self-supervised methods | ||
| translate | 13 | |
| C18-1050 Current neural machine translation (NMT) systems ***** translate ***** a text in a conventional sentence-by-sentence fashion, ignoring such cross-sentence links and dependencies. | ||
| 2018.iwslt-1.7 Multi-source translation systems ***** translate ***** from multiple languages to a single target language. | ||
| 2021.eacl-srw.19 State-of-the-art (SOTA) neural machine translation (NMT) systems ***** translate ***** texts at sentence level, ignoring context: intra-textual information, like the previous sentence, and extra-textual information, like the gender of the speaker. | ||
| 2010.iwslt-papers.10 Current Statistical Machine Translation (SMT) systems ***** translate ***** texts sentence by sentence without considering any cross-sentential context. | ||
| 2018.iwslt-1.11 Our proposal uses back-***** translate *****d data to: (a) create new sentences, so the system can be trained with more data; and (b) ***** translate ***** sentences that are close to the test set, so the model can be fine-tuned to the document to be ***** translate *****d | ||
| personalized | 13 | |
| E17-1091 The cataloging of product listings through taxonomy categorization is a fundamental problem for any e-commerce marketplace, with applications ranging from ***** personalized ***** search recommendations to query understanding. | ||
| 2021.nlp4convai-1.17 Many of the user defects stem from ***** personalized ***** factors, such as user's speech pattern, dialect, or preferences. | ||
| 2021.acl-long.383 Transformer, which is demonstrated with strong language modeling capability, however, is not ***** personalized ***** and fails to make use of the user and item IDs since the ID tokens are not even in the same semantic space as the words. | ||
| 2021.naacl-main.160 We propose an interview assistant system to automatically, and in an objective manner, select an optimal set of technical questions (from question banks) ***** personalized ***** for a candidate. | ||
| 2020.coling-main.166 This task involves the representation and fusion of multimodal information and natural language generation, which presents the challenges from three aspects: (1) how to encode and integrate multimodal inputs; (2) how to generate feedback specific to each modality; and (3) how to fulfill ***** personalized ***** feedback generation | ||
| Image | 13 | |
| 2021.eacl-main.48 ***** Image ***** captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images. | ||
| 2020.acl-main.664 ***** Image ***** captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. | ||
| 2020.coling-main.278 ***** Image ***** text carries essential information to understand the scene and perform reasoning. | ||
| P19-1652 ***** Image ***** Captioning aims at generating a short description for an image. | ||
| W19-1803 ***** Image ***** captioning applied to biomedical images can assist and accelerate the diagnosis process followed by clinicians. | ||
| sentential paraphrases | 13 | |
| L12-1452 We present a framework for the acquisition of ***** sentential paraphrases ***** based on crowdsourcing. | ||
| L14-1094 To generate ***** sentential paraphrases ***** we use a standard phrase-based machine translation (PBMT) framework modified with a re-ranking component (henceforth PBMT-R). | ||
| D17-1126 In addition, we show that more than 30,000 new ***** sentential paraphrases ***** can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. | ||
| D17-1026 We use neural machine translation to generate ***** sentential paraphrases ***** via back-translation of bilingual sentence pairs. | ||
| 2020.clinicalnlp-1.30 We propose going beyond data augmentation via paraphrase-optimized multi-task learning and observe that it is useful in correctly handling unseen ***** sentential paraphrases ***** as inputs | ||
| computer assisted | 13 | |
| L08-1494 The ***** computer assisted ***** process engages human annotators to check and correct the automatic annotation rather than starting the annotation from un-annotated data. | ||
| 2010.amta-government.3 Since at least 2004, United Nations has been exploring, piloting, and implementing ***** computer assisted ***** translation (CAT) with Trados as an officially selected vehicle. | ||
| L14-1008 This paper describes a method of generating a reduced phoneme set for dialogue-based ***** computer assisted ***** language learning (CALL) systems. | ||
| 2012.amta-papers.22 This paper addresses the problem of reliably measuring productivity gains by professional translators working with a machine translation enhanced ***** computer assisted ***** translation tool. | ||
| W19-1705 Then we point out some of the challenges associated with developing ***** computer assisted ***** tools for Blissymbolics. | ||
| dictionary definitions | 13 | |
| D19-1357 While the meanings of defining words are important in ***** dictionary definitions *****, it is crucial to capture the lexical semantic relations between defined words and defining words. | ||
| 2021.eacl-main.42 We introduce WDLMPro (Word Definitions Language Model Probing) to evaluate word understanding directly using ***** dictionary definitions ***** of words. | ||
| D18-1181 This paper presents a simple model that learns to compute word embeddings by processing ***** dictionary definitions ***** and trying to reconstruct them. | ||
| 2020.wmt-1.65 In this paper, we describe a new method for “attaching” ***** dictionary definitions ***** to rare words so that the network can learn the best way to use them. | ||
| P18-2043 We explore recently introduced definition modeling technique that provided the tool for evaluation of different distributed vector representations of words through modeling ***** dictionary definitions ***** of words. | ||
| clinical texts | 13 | |
| 2021.louhi-1.2 Negation scope resolution is key to high-quality information extraction from ***** clinical texts *****, but so far, efforts to make encoders used for information extraction negation-aware have been limited to English. | ||
| 2020.lrec-1.561 A popular application for that purpose is named entity recognition (NER), but the annotation policies of existing clinical corpora have not been standardized across ***** clinical texts ***** of different types. | ||
| W16-4214 Our ***** clinical texts ***** therefore differ from those in the i2b2 shared tasks which are in prose form with complete sentences. | ||
| 2020.clinicalnlp-1.8 Medical code assignment, which predicts medical codes from ***** clinical texts *****, is a fundamental task of intelligent medical information systems. | ||
| 2020.starsem-1.1 However, because of the linguistic idiosyncrasies of ***** clinical texts ***** (e.g., shorthand jargon), solely relying on domain knowledge from an external knowledge base (e.g., UMLS) can lead to wrong inference predictions as it disregards contextual information and, hence, does not return the most relevant mapping. | ||
| named entity recognizer | 13 | |
| 2010.jeptalnrecital-court.36 A ***** named entity recognizer ***** and a part of speech tagger are applied on each of these sentences to encode necessary information. We classify the sentences based on their subject, verb, object and preposition for determining the possible type of questions to be generated. | ||
| D19-5531 Robustness to capitalization errors is a highly desirable characteristic of ***** named entity recognizer *****s, yet we find standard models for the task are surprisingly brittle to such noise. Existing methods to improve robustness to the noise completely discard given orthographic information, which significantly degrades their performance on well-formed text. | ||
| L12-1377 The toolset comprise paragraph-, sentence- and token-level segmenter, morphological analyser, disambiguating tagger, shallow and deep parser, ***** named entity recognizer ***** and coreference resolver. | ||
| L16-1097 The methods are based on a bilingual ***** named entity recognizer ***** that uses a monolingual ***** named entity recognizer ***** with transliteration. | ||
| D19-1399 In addition, the performance of a ***** named entity recognizer ***** could benefit from the long-distance dependencies between the words in dependency trees. | ||
| nowadays | 13 | |
| 2012.amta-tutorials.6 In fact, an increasing number of language service providers and in-house translation services of large companies is ***** nowadays ***** integrating SMT in their workflow. | ||
| 2021.wat-1.18 Neural Machine Translation (NMT) is a predominant machine translation technology ***** nowadays ***** because of its end-to-end trainable flexibility. | ||
| L06-1315 The aim of this article is to provide a statistical representation of significant terms used in the field of Natural Language Processing from the 1960s till ***** nowadays *****, in order to draft a survey on the most significant research trends in that period. | ||
| L08-1444 Since large text corpora ***** nowadays ***** are easily available and inflectional systems are in general well understood, it seems feasible to acquire lexical data from raw texts, guided by our knowledge of inflection. | ||
| 2020.acl-main.660 We then aim to offer a climax, suggesting that incorporating symbolic ideas proposed in SPMRL terms into ***** nowadays ***** neural architectures has the potential to push NLP for MRLs to a new level. | ||
| lexical ambiguity | 13 | |
| 2021.wmt-1.63 This paper presents a document-level corpus annotated in English with context-aware issues that arise when translating from English into Brazilian Portuguese, namely ellipsis, gender, ***** lexical ambiguity *****, number, reference, and terminology, with six different domains. | ||
| 2006.amta-papers.22 We demonstrate that SMT accuracy can be improved in a cross-domain application by using a controlled language (CL) interface to help reduce ***** lexical ambiguity ***** in the input text. | ||
| S18-1088 The paper describes our search for a universal algorithm of detecting intentional ***** lexical ambiguity ***** in different forms of creative language. | ||
| 2020.emnlp-main.328 To investigate whether this is the case, we operationalise the ***** lexical ambiguity ***** of a word as the entropy of meanings it can take, and provide two ways to estimate this—one which requires human annotation (using WordNet), and one which does not (using BERT), making it readily applicable to a large number of languages. | ||
| 2021.starsem-1.13 Second, we argue that models should not exploit the synthetic topic structure of the standard ECB+ dataset, forcing models to confront the ***** lexical ambiguity ***** challenge, as intended by the dataset creators. | ||
| canonical correlation analysis | 13 | |
| D19-1577 In this work, we present the first large-scale investigation of the arbitrariness of gender assignment that uses ***** canonical correlation analysis ***** as a method for correlating the gender of inanimate nouns with their lexical semantic meaning. | ||
| 2020.emnlp-main.187 We propose to fuse both views using singular vector ***** canonical correlation analysis ***** and study what kind of information is induced from each source. | ||
| D19-1448 In this work, we use ***** canonical correlation analysis ***** and mutual information estimators to study how information flows across Transformer layers and observe that the choice of the objective determines this process. | ||
| W18-3305 This work focuses on improving the representations of these views by performing a deep ***** canonical correlation analysis ***** with the representations of the better performing manual transcription view. | ||
| 2020.emnlp-main.115 We also explore the relationships between learned features from structured and unstructured variables using projection-weighted ***** canonical correlation analysis *****. | ||
| aspect terms | 13 | |
| 2020.findings-emnlp.72 To address the issue, we present a novel view of ABSA as an opinion triplet extraction task, and propose a multi-task learning framework to jointly extract ***** aspect terms ***** and opinion terms, and simultaneously parses sentiment dependencies between them with a biaffine scorer. | ||
| 2020.findings-emnlp.6 The polarities sequence is designed to depend on the generated ***** aspect terms ***** labels. | ||
| P19-1048 Aspect-based sentiment analysis produces a list of ***** aspect terms ***** and their corresponding sentiments for a natural language sentence. | ||
| 2020.emnlp-main.164 Aspect term extraction (ATE) aims to extract ***** aspect terms ***** from a review sentence that users have expressed opinions on. | ||
| 2021.acl-long.27 Our goal is to transfer ***** aspect terms ***** by actively supplementing transferable knowledge. | ||
| linguistic annotations | 13 | |
| L08-1412 The high level of heterogeneity between ***** linguistic annotations ***** usually complicates the interoperability of processing modules within an NLP pipeline. | ||
| L12-1437 We describe FoLiA, a new XML format geared at rich ***** linguistic annotations *****. | ||
| 2020.law-1.2 We demonstrate how ***** linguistic annotations ***** from separate corpora can be reliably linked from the start, and thereby be accessed and queried as if they were a single dataset. | ||
| D19-5901 Crowdsourcing is frequently employed to quickly and inexpensively obtain valuable ***** linguistic annotations ***** but is rarely used for parsing, likely due to the perceived difficulty of the task and the limited training of the available workers. | ||
| 2020.lrec-1.13 These dialogs are given rich ***** linguistic annotations ***** by expert linguists for several types of reference mentions and named entity mentions, either of which can span multiple words, as well as for coreference links between mentions. | ||
| nominal compounds | 13 | |
| 2014.lilt-10.1 The initial evaluation of our approach assumes that ***** nominal compounds ***** are fixed expressions, requiring individual semantic specification at the lexical level. | ||
| W17-2211 In this paper, we present our preliminary study on an ontology-based method to extract and classify compositional ***** nominal compounds ***** in specific domains of knowledge. | ||
| W19-5107 To evaluate the impact of the association metrics we manually annotated corpora with three different syntactic patterns of collocations (adjective-noun, verb-object and ***** nominal compounds *****). | ||
| L14-1429 Focusing on ***** nominal compounds *****, the expressions obtained from each corpus are of comparable quality and indicate that corpus origin has no impact on this task. | ||
| 2006.bcs-1.9 The linguistic analyzer processes both documents to be indexed and queries to produce a set of normalized lemmas, a set of named entities and a set of ***** nominal compounds ***** with their morpho-syntactic tags. | ||
| text classifiers | 13 | |
| P19-1284 We focus on ***** text classifiers ***** and make them more interpretable by having them provide a justification–a rationale–for their predictions. | ||
| 2020.lrec-1.669 In the second experiment, ***** text classifiers ***** were trained with the original questions, and tested when assigning each variation to one of three possible sources, or assigning them as out-of-domain. | ||
| 2020.emnlp-main.24 In this paper, we propose FIND – a framework which enables humans to debug deep learning ***** text classifiers ***** by disabling irrelevant hidden features. | ||
| 2020.emnlp-main.670 Recent advances in weakly supervised learning enable training high-quality ***** text classifiers ***** by only providing a few user-provided seed words. | ||
| 2020.acl-main.492 In this work, we investigate the use of influence functions for NLP, providing an alternative approach to interpreting neural ***** text classifiers *****. | ||
| language research | 13 | |
| L10-1546 Linguistic Data Consortium's Human Subjects Data Collection lab conducts multi-modal speech collections to develop corpora for use in speech, speaker and ***** language research ***** and evaluations. | ||
| L12-1240 Automatic annotation of gesture strokes is important for many gesture and sign ***** language research *****ers. | ||
| W19-4013 The annotation campaign, specific in terms of setting, subjectivity and the multifunctionality of items under investigation, resulted in a preliminary lexicon of formulaic sequences in spoken Slovenian with immediate potential for future explorations in formulaic ***** language research *****. | ||
| P19-1182 Describing images with text is a fundamental problem in vision-***** language research *****. | ||
| 2020.signlang-1.34 Processing strings of characters instead of images can significantly contribute to sign ***** language research *****. | ||
| edit distance | 13 | |
| L10-1410 Our search engine uses a generalized variant of the ***** edit distance ***** algorithm that allows defining text-specific string to string transformations in addition to the default edit operations defined in ***** edit distance *****. | ||
| L14-1601 None of the computed results (inter-annotator agreement, ***** edit distance *****, majority annotation) allow any strong correlation between the considered criteria and the level of seriousness to be shown, which underlines the difficulty for a human to determine whether an ASR error is serious or not. | ||
| 2020.eamt-1.19 In this paper, we introduce sentence encoders to improve matching and retrieving process in Translation Memories systems - an effective and efficient solution to replace ***** edit distance *****-based algorithms. | ||
| L10-1622 In particular, we investigate the applicability of a DBN framework initially proposed by Filali and Bilmes (2005) to learn ***** edit distance ***** estimation parameters for use in pronunciation classification. | ||
| 2012.amta-wptp.2 Post-editing machine translations has been attracting increasing attention both as a common practice within the translation industry and as a way to evaluate Machine Translation (MT) quality via ***** edit distance ***** metrics between the MT and its post-edited version. | ||
| automatically extracting | 13 | |
| C16-1320 As an additional objective, we discuss two novel use cases including ***** automatically extracting ***** links to public datasets from the proceedings, which would further accelerate the advancement in digital libraries. | ||
| L14-1477 In this paper, we present T2Kˆ2, a suite of tools for ***** automatically extracting ***** domain―specific knowledge from collections of Italian and English texts. | ||
| P19-1408 In this paper, we propose the MINA algorithm for ***** automatically extracting ***** minimum spans to benefit from minimum span evaluation in all corpora. | ||
| E17-1083 A well-established technique for ***** automatically extracting ***** paraphrases leverages bilingual corpora to find meaning-equivalent phrases in a single language by “pivoting” over a shared translation in another language. | ||
| P19-1513 In this paper we build two datasets and develop a framework (TDMS-IE) aimed at ***** automatically extracting ***** task, dataset, metric and score from NLP papers, towards the automatic construction of leaderboards. | ||
| electronic dictionary | 13 | |
| L06-1373 The EDR ***** electronic dictionary ***** is a machine-tractable dictionary developed for advanced computer-based processing of natural lan-guage. | ||
| 2005.mtsummit-wpt.6 Domain tuned resources are based on contrastive studies of multilingual patent documents and are handled by an ***** electronic dictionary ***** with a powerful user-friendly environment for acquisition, editing, browsing, defaulting and coherence proofing. | ||
| L14-1538 Besides, FSTs/FSA are also used to match our ***** electronic dictionary ***** entries (ALUs, or Atomic Linguistic Units) to RDF subject, object and predicate (SKOS Core Vocabulary). | ||
| W18-3804 The only drawback is that PV not listed in the dictionary (e.g., archaic forms, recent neologisms) are not identified; however, new PV can easily be added to the ***** electronic dictionary *****, which is freely available to all. | ||
| L10-1303 We describe a new Arabic spelling correction system which is intended for use with ***** electronic dictionary ***** search by learners of Arabic. | ||
| deep linguistic processing | 13 | |
| W19-4820 Our results reveal surprisingly strong differences between language models, and give insights into where the ***** deep linguistic processing *****, that integrates information over multiple sentences, is happening in these models. | ||
| L04-1157 The aim of the work reported in this paper is to provide robust ***** deep linguistic processing ***** in order to make the grammar more adequate for industrial NLP applications. | ||
| L10-1343 The task of parse disambiguation has gained in importance over the last decade as the complexity of grammars used in ***** deep linguistic processing ***** has been increasing. | ||
| 2016.iwslt-1.24 Three MT systems are tested: (1) our Chimera, a tight combination of phrase-based MT and ***** deep linguistic processing *****, (2) Neural Monkey, our implementation of a NMT system in TensorFlow and (3) Nematus, an established NMT system. | ||
| L08-1012 We evaluate the extent to which the distinction between semantically core and non-core dependents as used in the FrameNet corpus corresponds to the traditional distinction between syntactic complements and modifiers of a verb, for the purposes of harvesting a wide-coverage verb lexicon from FrameNet for use in ***** deep linguistic processing ***** applications. | ||
| environment | 13 | |
| L12-1611 We focus specifically on speech and gesture interaction which can enhance the quality of lifestyle of people living in assistive ***** environment *****s, be they seniors or people with physical or cognitive disabilities. | ||
| 2003.mtsummit-systems.4 Its Web-based interface and multi-user architecture enable a centralized and efficient work ***** environment ***** for local and geographically disbursed individual users and teams. | ||
| D19-1218 We build a game ***** environment ***** to study this scenario, and learn to map user instructions to system actions. | ||
| I17-3014 A grammar pattern consists of a head word (verb, noun, or adjective) and its syntactic ***** environment *****. | ||
| I17-3015 The proposed framework provides a pioneering example of on-demand knowledge validation in dialog ***** environment ***** to address such needs in AI agents/chatbots. | ||
| document processing | 13 | |
| 2021.ranlp-1.65 This is particularly evident in the historical ***** document processing ***** field. | ||
| 2020.findings-emnlp.237 Combining this with Ensemble Knowledge Distillation, we maintain state-of-the-art performance 66.9% of CoNLL F1 on ETRI test set while achieving 2x speedup (30 doc/sec) in ***** document processing ***** time. | ||
| 2020.sdp-1.1 However, the various strands of research on scholarly ***** document processing ***** remain fragmented. | ||
| 2020.sdp-1.9 We introduce a novel scientific ***** document processing ***** task for making previously inaccessible information in printed paper documents available to automatic processing. | ||
| 2020.wildre-1.4 Natural language understanding by automatic tools is the vital requirement for ***** document processing ***** tools. | ||
| differences | 13 | |
| E17-5002 The technical ***** differences ***** between NMT and the previously dominant phrase-based statistical approach require that practitioners learn new best practices for building MT systems, ranging from different hardware requirements, new techniques for handling rare words and monolingual data, to new opportunities in continued learning and domain adaptation. This tutorial is aimed at researchers and users of machine translation interested in working with NMT. | ||
| Q15-1023 We attack this confusion by analyzing ***** differences ***** between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. | ||
| P18-1089 Owing to these ***** differences *****, cross-domain sentiment classification is still a challenging task. | ||
| 2021.woah-1.10 In hate speech detection, however, equalizing model predictions may ignore important ***** differences ***** among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. | ||
| 2020.acl-main.38 We then compare the subtle ***** differences ***** in computation order in considerable detail, and present a parameter initialization method that leverages the Lipschitz constraint on the initialization of Transformer parameters that effectively ensures training convergence. | ||
| named entity annotation | 13 | |
| 2020.lrec-1.862 Using semantic and contextual information, non-speakers of a language familiar with the Latin script can produce high quality ***** named entity annotation *****s to support construction of a name tagger. | ||
| 2020.lrec-1.37 In this paper, we investigate which variables influence the time spent on a ***** named entity annotation ***** task by a human. | ||
| C16-1111 The gold-standard ***** named entity annotation *****s are made by a combination of NLP experts and crowd workers, which enables us to harness crowd recall while maintaining high quality. | ||
| 2020.lrec-1.565 We evaluate the quality of our annotations intrinsically by double annotating the entire treebank and extrinsically by comparing our annotations to a recently released ***** named entity annotation ***** of the validation and test sections of the Danish Universal Dependencies treebank. | ||
| 2020.lrec-1.245 In response, we have designed a novel ***** named entity annotation ***** scheme and associated guidelines for this domain, which covers hazards, consequences, mitigation strategies and project attributes. | ||
| high performance | 13 | |
| 2014.amta-wptp.15 Such post-editing (e.g., PET [Aziz et al., 2012]) can be used practically for translation between European languages, which has a ***** high performance ***** in statistical machine translation. | ||
| 2020.findings-emnlp.74 Existing NLP datasets contain various biases that models can easily exploit to achieve ***** high performance *****s on the corresponding evaluation sets. | ||
| 2021.naacl-main.416 However, existing report generation systems, despite achieving ***** high performance *****s on natural language generation metrics such as CIDEr or BLEU, still suffer from incomplete and inconsistent generations. | ||
| L08-1543 Many taggers are designed with different approaches to reach ***** high performance ***** and accuracy. | ||
| 2021.ranlp-1.158 A NER system is trained by applying the state-of-the-art deep learning method BERT to the collected data and its ***** high performance ***** on search engine queries is reported. | ||
| entity mention | 13 | |
| L10-1596 This paper describes the 2009 resource creation efforts, with particular focus on the selection and development of named ***** entity mention *****s for the Entity Linking task evaluation. | ||
| N18-2057 In particular, we focus on the task of named entity classification, defined as identifying the correct label (e.g., person or organization name) of an ***** entity mention ***** in a given context. | ||
| N18-1002 The task of Fine-grained Entity Type Classification (FETC) consists of assigning types from a hierarchy to ***** entity mention *****s in text. | ||
| 2020.findings-emnlp.409 The goal of Document-level Relation Extraction (DRE) is to recognize the relations between ***** entity mention *****s that can span beyond sentence boundary. | ||
| W19-2804 Clustering unlinkable ***** entity mention *****s across documents in multiple languages (cross-lingual NIL Clustering) is an important task as part of Entity Discovery and Linking (EDL). | ||
| biomedical natural language | 13 | |
| W19-5010 Automatic identification and expansion of ambiguous abbreviations are essential for ***** biomedical natural language ***** processing applications, such as information retrieval and question answering systems. | ||
| W18-2323 High quality word embeddings are of great significance to advance applications of ***** biomedical natural language ***** processing. | ||
| W18-5622 Many applications in ***** biomedical natural language ***** processing rely on sequence tagging as an initial step to perform more complex analysis. | ||
| W16-5111 Our objective in this paper is to present to the ***** biomedical natural language ***** processing, data science, and public health communities data sets (annotated and unannotated), tools and resources that we have collected and created from social media. | ||
| 2021.naacl-main.139 Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, have achieved state-of-the-art results in ***** biomedical natural language ***** processing tasks by focusing their pre-training process on domain-specific corpora. | ||
| segments | 13 | |
| 2020.lrec-1.641 The texts come from different sources: daily newspaper articles, Czech translation of the Wall Street Journal, transcribed dialogs and a small amount of user-generated, short, often non-standard language ***** segments ***** typed into a web translator. | ||
| Q14-1014 We introduce a method for automatically segmenting a corpus into chunks such that many uncertain labels are grouped into the same chunk, while human supervision can be omitted altogether for other ***** segments *****. | ||
| W18-3303 Most of the current multimodal research in this area deals with various techniques to fuse the modalities, and mostly treat the ***** segments ***** of a video independently. | ||
| N18-2075 Text segmentation, the task of dividing a document into contiguous ***** segments ***** based on its semantic structure, is a longstanding challenge in language understanding. | ||
| W16-3714 To this end, we explore the use of distinctive feature weights, lexical tone confusions, and a two-step clustering algorithm to learn projections of phoneme ***** segments ***** from mismatched multilingual transcriber languages to the target language. | ||
| experiment | 13 | |
| 2021.acl-long.96 We also carry out multiple ***** experiment *****s to measure how much each augmentation strategy improves the performance of automatic scoring systems. | ||
| L12-1393 Our ***** experiment *****al evaluation showed that this approach is promising for applying SMT, even when a source-side parallel corpus is lacking. | ||
| C18-1281 Our ***** experiment *****s show how annotators diverge in language annotation tasks due to a range of ineliminable factors. | ||
| 2020.emnlp-main.459 The ***** experiment *****al results on five datasets sampled from Freebase, NELL and Wikidata show that our method outperforms state-of-the-art baselines. | ||
| L14-1587 We ***** experiment *****ed with the proposed methodology over a sample of triples extracted from 10 DBpedia ontology properties. | ||
| model selection | 13 | |
| 2021.bionlp-1.23 Experimental results demonstrate the necessity of building a dedicated medical dataset and show that models that leverage extra resources achieve the best performance for both tasks, which provides certain guidance for future studies on ***** model selection ***** in the medical domain. | ||
| P19-1281 We present FIESTA, a ***** model selection ***** approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models. | ||
| 2021.emnlp-main.368 However, the introduction of neural networks in NLP has led to a different use of these standard splits; the development set is now often used for ***** model selection ***** during the training procedure. | ||
| P19-1235 Results show that variance of average surprisal (VAS) better correlates with parsing accuracy than data likelihood and that using VAS instead of data likelihood for ***** model selection ***** provides a significant accuracy boost. | ||
| Q15-1033 In contrast, Bayesian methods allow efficient ***** model selection ***** by maximizing the evidence on the training data through gradient-based methods. | ||
| future | 13 | |
| 2020.sdp-1.2 I will discuss the status and ***** future ***** of arXiv, and possibilities and plans to make more effective use of the research database to enhance ongoing research efforts. | ||
| W19-6129 Finally we discuss the dataset in light of the results and point to ***** future ***** research and plans for further improving both the dataset and methods of predicting prosodic prominence from text. | ||
| 2020.emnlp-main.748 We hope that these architectures and experiments may serve as strong points of comparison for ***** future ***** work. | ||
| L06-1294 While results are often encouraging, the paper also highlights evident problems and drawbacks with the method, and outlines suggestions for ***** future ***** work. | ||
| W17-3106 We then illustrate the ***** future ***** possibility of this work with an example of an exposure scenario authored with our application. | ||
| field | 13 | |
| 2020.coling-main.581 Deep pre-trained language models tend to become ubiquitous in the ***** field ***** of Natural Language Processing (NLP). | ||
| D19-1236 The review and selection process for scientific paper publication is essential for the quality of scholarly publications in a scientific ***** field *****. | ||
| 2020.coling-main.334 In recent years, transformer models and more specifically the BERT model developed at Google revolutionised the ***** field ***** of NLP. | ||
| 2021.econlp-1.9 In decision making in the economic ***** field *****, an especially important requirement is to rapidly understand news to absorb ever-changing economic situations. | ||
| L06-1406 The EQueR Evaluation Campaign included two tasks of automatic answer retrieval: the first one was a QA task over a heterogeneous collection of texts - mainly newspaper articles, and the second one a specialised one in the Medical ***** field ***** over a corpus of medical texts. | ||
| linguistic complexity | 13 | |
| 2020.emnlp-main.318 Word-level information is important in natural language processing (NLP), especially for the Chinese language due to its high ***** linguistic complexity *****. | ||
| W18-4601 It presents a grounded language learning system that can be used to study ***** linguistic complexity ***** from a developmental point of view and introduces a tool for generating a gold standard in order to evaluate the performance of the learning system. | ||
| 2021.bea-1.5 A broad range of quantifiable ***** linguistic complexity ***** features (lexical, morphological and syntactic) are extracted and calculated. | ||
| W18-6002 We evaluate corpus-based measures of ***** linguistic complexity ***** obtained using Universal Dependencies (UD) treebanks. | ||
| W16-4114 We bring together knowledge from two different types of language learning data, texts learners read and texts they write, to improve ***** linguistic complexity ***** classification in the latter. | ||
| word formation | 13 | |
| W16-4706 Germanic languages with their rich ***** word formation ***** morphology may be particularly good candidates for the approach advocated here. | ||
| W19-4610 A challenge in applying natural language processing techniques to these languages is the data sparsity problem that arises from their rich internal morphology, where the substructure is inherently non-concatenative and morphemes are interdigitated in ***** word formation *****. | ||
| K19-1082 Previous work has explored slang in terms of dictionary construction, sentiment analysis, ***** word formation *****, and interpretation, but scarce research has attempted the basic problem of slang detection and identification. | ||
| L14-1112 Chinese word structures are often represented by binary trees, the nodes of which are labeled with syntactic categories, due to the syntactic nature of Chinese ***** word formation *****. | ||
| 2020.coling-main.413 Using the extracted morphological data, we develop multilingual neural models for predicting three types of ***** word formation *****—clipping, contraction, and eye dialect—and improve upon a standard attention baseline by using copy attention. | ||
| behavior | 13 | |
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing ***** behavior *****s easily missed by human experts. | ||
| W19-3018 For the ***** behavior *****al model approach, we model each user's behaviour and thoughts with four groups of features: posting behaviour, sentiment, motivation, and content of the user's posting. | ||
| 2020.intexsempar-1.4 We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition: users interactively teach the system by breaking down high-level utterances describing novel ***** behavior ***** into low-level steps that it can understand. | ||
| D19-1384 We demonstrate that complex linguistic ***** behavior ***** observed in natural language can be reproduced in this simple setting: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. | ||
| 2021.dash-1.15 We present the Everyday Living Artificial Intelligence (AI) Hub, a novel proof-of-concept framework for enhancing human health and wellbeing via a combination of tailored wear-able and Conversational Agent (CA) solutions for non-invasive monitoring of physiological signals, assessment of ***** behavior *****s through unobtrusive wearable devices, and the provision of personalized interventions to reduce stress and anxiety. | ||
| large pretrained language | 13 | |
| 2021.ranlp-srw.3 With the recent success of ***** large pretrained language ***** models, we explore the possibility of using multilingual pretrained transformers like mBART and mT5 for exploring one such task of code-mixed Hinglish to English machine translation. | ||
| W19-8665 Recent advances in transfer-learning from ***** large pretrained language ***** models give rise to alternative approaches that do not rely on copy-attention and instead learn to generate concise and abstractive summaries. | ||
| 2021.nlp4convai-1.21 We experiment in detail with various controlled generation methods for ***** large pretrained language ***** models: specifically, conditional training, guided fine-tuning, and guided decoding. | ||
| 2020.emnlp-main.586 The success of ***** large pretrained language ***** models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. | ||
| 2021.emnlp-main.608 We explore the use of ***** large pretrained language ***** models as few-shot semantic parsers. | ||
| sentence embedding | 13 | |
| S17-2031 The first stage deals with constructing neural word embeddings, the components of ***** sentence embedding *****s. | ||
| 2021.emnlp-main.11 Recent studies have leveraged graph neural networks to capture the inter-sentential relationship (e.g., the discourse graph) within the documents to learn contextual ***** sentence embedding *****. | ||
| D19-5404 To overcome these limitations, we present a novel method, which makes use of two types of ***** sentence embedding *****s: universal embeddings, which are trained on a large unrelated corpus, and domain-specific embeddings, which are learned during training. | ||
| N19-1274 Our error analysis indicates that alignments over character, word, and ***** sentence embedding *****s capture substantially different semantic information. | ||
| 2021.emnlp-main.185 Conventional approaches employ the siamese-network for this task, which obtains the ***** sentence embedding *****s through modeling the context-response semantic relevance by applying a feed-forward network on top of the sentence encoders. | ||
| unsupervised morphological | 13 | |
| W18-5808 However, while LIMS worked best on average and outperforms other state-of-the-art ***** unsupervised morphological ***** segmentation approaches, it did not provide the optimal AG configuration for five out of the six languages. | ||
| Q15-1012 In contrast, we propose a model for ***** unsupervised morphological ***** analysis that integrates orthographic and semantic views of words. | ||
| 2020.acl-main.598 We propose the task of ***** unsupervised morphological ***** paradigm completion. | ||
| 2020.sigmorphon-1.9 In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS–CUBoulder) for SIGMORPHON 2020 Task 2 on ***** unsupervised morphological ***** paradigm completion (Kann et al., 2020). | ||
| 2020.sigmorphon-1.8 We describe the NYU-CUBoulder systems for the SIGMORPHON 2020 Task 0 on typologically diverse morphological inflection and Task 2 on ***** unsupervised morphological ***** paradigm completion. | ||
| identify | 13 | |
| 2020.nlpcss-1.9 While this task has been closely associated with emotion prediction, we argue and show that ***** identify *****ing worry needs to be addressed as a separate task given the unique challenges associated with it. | ||
| W17-1606 Speakers' dialect and gender was controlled for by using videos uploaded as part of the “accent tag challenge”, where speakers explicitly ***** identify ***** their language background. | ||
| N18-4018 While some work has been done on code-mixed social media text and in emotion prediction separately, our work is the first attempt which aims at ***** identify *****ing the emotion associated with Hindi-English code-mixed social media text. | ||
| N18-2017 We show that, in a significant portion of such data, this protocol leaves clues that make it possible to ***** identify ***** the label by looking only at the hypothesis, without observing the premise. | ||
| 2020.lrec-1.143 Our corpus can be used as a resource for analyzing persuasiveness and training an argument mining system to ***** identify ***** and extract argument structures. | ||
| historical text | 13 | |
| L14-1565 Recently, the focus of many projects has shifted from the analysis of newspaper text to that of non-standard varieties such as user-generated content, ***** historical text *****s, and learner language. | ||
| D19-6112 This paper evaluates 63 multi-task learning configurations for sequence-to-sequence-based ***** historical text ***** normalization across ten datasets from eight languages, using autoencoding, grapheme-to-phoneme mapping, and lemmatization as auxiliary tasks. | ||
| P19-1157 Policy gradient training enables direct optimization for exact matches, and while the small datasets in ***** historical text ***** normalization are prohibitive of from-scratch reinforcement learning, we show that policy gradient fine-tuning leads to significant improvements across the board. | ||
| P17-1031 Automated processing of ***** historical text *****s often relies on pre-normalization to modern word forms. | ||
| 2021.eacl-main.273 We introduce the task of ***** historical text ***** summarisation, where documents in historical forms of a language are summarised in the corresponding modern language. | ||
| deep contextualized word | 13 | |
| S19-2028 The model is based on two Recurrent Neural Networks, the first one is fed with a state-of-the-art ELMo ***** deep contextualized word ***** representation and the second one is fed with a static Word2Vec embedding augmented with 10-dimensional affective word feature vector. | ||
| 2020.louhi-1.16 We build strong classification models based on ***** deep contextualized word ***** representations and show that they outperform previously applied statistical models with simple linguistic features by large margins. | ||
| K19-2007 We extended the basic transition-based parser with two improvements: a) Efficient Training by realizing Stack LSTM parallel training; b) Effective Encoding via adopting ***** deep contextualized word ***** embeddings BERT. | ||
| K18-2005 We base our submission on Stanford's winning system for the CoNLL 2017 shared task and make two effective extensions: 1) incorporating ***** deep contextualized word ***** embeddings into both the part of speech tagger and parser; 2) ensembling parsers trained with different initialization. | ||
| 2020.iwpt-1.4 However, with the recent adoption of ***** deep contextualized word ***** representations, the chief weakness of graph-based models, i.e., their limited scope of features, has been mitigated. | ||
| prior | 13 | |
| 2020.acl-main.560 Through this paper, we attempt to convince the ACL community to ***** prior *****itise the resolution of the predicaments highlighted here, so that no language is left behind. | ||
| D19-1124 Existing works simply assume the Gaussian ***** prior *****s of the latent variable, which are incapable of representing complex latent variables effectively. | ||
| 1998.amta-papers.25 All parse trees are converted to this format ***** prior ***** to semantic interpretation. | ||
| 2021.deelio-1.1 While some of these patterns confirm the conventional ***** prior ***** linguistic knowledge, the rest are relatively unexpected, which may provide new insights. | ||
| D19-1568 Among these characteristics of persuasive arguments, ***** prior ***** work in NLP does not explicitly investigate the effect of the pragmatic and discourse context when determining argument quality. | ||
| web service | 13 | |
| L10-1039 We propose a language resource management system, called WordNet Management System (WNMS), as a distributed management system that allows the server to perform the cross language WordNet retrieval, including the fundamental ***** web service ***** applications for editing, visualizing and language processing. | ||
| L10-1110 We present the problem of categorizing ***** web service *****s according to a shallow ontology for presentation on a specialist portal, using their WSDL and associated textual documents found by a crawler. | ||
| 2011.freeopmt-1.10 Recently, the closed-source translation back-end has been replaced by a free/open-source solution completely managed by Softcatala`: the Apertium machine translation platform and the ScaleMT ***** web service ***** framework. | ||
| L12-1416 The AT&T VoiceBuilder provides a new tool to researchers and practitioners who want to have their voices synthesized by a high-quality commercial-grade text-to-speech system without the need to install, configure, or manage speech processing software and equipment. It is implemented as a web service on the AT&T Speech Mashup Portal. The system records and validates users' utterances, processes them to build a synthetic voice and provides a ***** web service ***** API to make the voice available to real-time applications through a scalable cloud-based processing platform. | ||
| 2020.stoc-1.6 In this paper, we present a ***** web service ***** platform for disinformation detection in hotel reviews written in English. | ||
| multiple choice | 13 | |
| W19-3819 An ensemble of QA and BERT-based ***** multiple choice ***** and sequence classification models further improves the F1 (23.3% absolute improvement upon the baseline). | ||
| 2020.conll-1.11 To this end, we collect a new eye-tracking dataset with a large number of participants engaging in a ***** multiple choice ***** reading comprehension task. | ||
| L14-1370 Two experiments were performed, a part-of-speech tagging task in where the annotators were asked to choose a correct word-category from a ***** multiple choice ***** list and case ending identification task. | ||
| 2021.emnlp-main.564 For example, they can perform ***** multiple choice ***** tasks simply by conditioning on a question and selecting the answer with the highest probability. | ||
| W18-0533 We investigate how machine learning models, specifically ranking models, can be used to select useful distractors for ***** multiple choice ***** questions. | ||
| production | 13 | |
| 2020.lrec-1.740 A very compact modeling of a signer is built and a Convolutional-Recurrent Neural Network is trained and tested on Dicta-Sign-LSF-v2, with state-of-the-art results, including the ability to detect iconicity in SL ***** production *****. | ||
| 2020.lrec-1.622 The exercise reported here shows that, in general, the re***** production ***** of these systems is successful with scores in line with those reported in SemEval2018. | ||
| 2021.inlg-1.30 We describe the study design, present the results from the original and the re***** production ***** study, and then compare and analyse the differences between the two sets of results. | ||
| W19-2903 To our knowledge, this is the first computational cognitive model that aims to simulate code-switched sentence ***** production *****. | ||
| L06-1265 The goal of this paper is (1) to illustrate a specific procedure for merging different monolingual lexicons, focussing on techniques for detecting and mapping equivalent lexical entries, and (2) to sketch a ***** production ***** model that enables one to obtain lexical resources via unification of existing data. | ||
| false | 13 | |
| W18-4106 Sentences with presuppositions are often treated as uninterpretable or unvalued (neither true nor ***** false *****) if their presuppositions are not satisfied. | ||
| R19-1104 A true label is used if the claim is true, ***** false ***** if it is ***** false *****, if the claim has no relation with the source then it is classified as out-of-context and if the claim cannot be verified at all then it is classified as inappropriate. | ||
| 2020.wnut-1.59 Among such an overabundance of data, it is crucial to distinguish which information is actually informative or merely sensational, redundant or ***** false *****. | ||
| N19-1028 Finally, human annotation is used to remove any ***** false ***** positive in these matched triples. | ||
| 2020.wanlp-1.19 It can generate blocks of text based on brief writing prompts that look like they were written by humans, facilitating the spread of ***** false ***** or auto-generated text. | ||
| feature extraction | 13 | |
| P18-1182 In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit ***** feature extraction *****. | ||
| 2021.smm4h-1.29 The steps for pre-processing tweets, ***** feature extraction *****, and the development of the machine learning models, are described extensively in the documentation. | ||
| 2021.semeval-1.15 XLMR performs better than mBERT in the cross-lingual setting both with fine-tuning and ***** feature extraction *****, whereas these two models give a similar performance in the multilingual setting. | ||
| 2020.sltu-1.19 Next, two conventional ***** feature extraction ***** models Visual Geometry Group (VGG) OxfordNet 16-layer and 19-layer are compared. | ||
| W18-3925 Especially, we focused on using different ***** feature extraction ***** methods and how to combine them, since they influenced very differently the performance of the system. | ||
| sequence prediction | 13 | |
| 2020.wanlp-1.16 Our system is developed for the Fairseq framework, which allows for a fast and easy use for any other ***** sequence prediction ***** problem. | ||
| N18-1154 In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for ***** sequence prediction ***** tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the ***** sequence prediction ***** model (the generator network). | ||
| P19-1055 In this paper we focus on a popular class of learning problems, ***** sequence prediction ***** applied to several sentiment analysis tasks, and suggest a modular learning approach in which different sub-tasks are learned using separate functional modules, combined to perform the final task while sharing information. | ||
| P18-1155 Inspired by the connection, we propose two ***** sequence prediction ***** algorithms, one extending RAML with fine-grained credit assignment and the other improving Actor-Critic with a systematic entropy regularization. | ||
| K19-1002 Supertagging is a ***** sequence prediction ***** task where each word is assigned a piece of complex syntactic structure called a supertag. | ||
| robot | 13 | |
| W17-2804 This requires the understanding of the user utterance with an accuracy able to trigger the ***** robot ***** reaction. | ||
| W17-2811 However, very little research relates a combination of multimodal social signals and language features detected during spoken face-to-face human-***** robot ***** interaction to the resulting user perception of a ***** robot *****. | ||
| W19-1603 The framework can be used as a ***** robot *****ics testbed: the results of our simulations can be compared with the output of algorithms in real ***** robot *****s, to validate such algorithms. | ||
| 2020.lrec-1.84 The experimental set up consisted of conversations between the participant in a functional magnetic resonance imaging (fMRI) scanner and a human confederate or conversational ***** robot ***** outside the scanner room, connected via bidirectional audio and unidirectional videoconferencing (from the outside to inside the scanner). | ||
| W18-1408 This position paper argues that, while prior work in spatial language understanding for tasks such as ***** robot ***** navigation focuses on mapping natural language into deep conceptual or non-linguistic representations, it is possible to systematically derive regular patterns of spatial language usage from existing lexical-semantic resources. | ||
| emotional speech | 13 | |
| W18-5044 Although the use of ***** emotional speech ***** responses have been shown to be effective in a limited domain, e.g., scenario-based and counseling dialogue, the effect is still not clear in the non-task-oriented dialogue such as voice chatting. | ||
| L06-1009 This paper describes an ***** emotional speech ***** database recorded for standard Basque. | ||
| L16-1634 There exists a major incompatibility in emotion labeling framework among ***** emotional speech ***** corpora, that is, category-based and dimension-based. | ||
| L06-1183 The ***** emotional speech ***** material used for this study comes from the previously collected SAFE Database (Situation Analysis in a Fictional and Emotional Database) which consists of audio-visual sequences extracted from movie fictions. | ||
| L08-1212 This paper describes the evaluation process of an ***** emotional speech ***** database recorded for standard Basque, in order to determine its adequacy for the analysis of emotional models and its use in speech synthesis. | ||
| assessment | 13 | |
| D19-1583 Preliminary results on Wikipedia indicate that this prediction is feasible, and yields informative ***** assessment *****s. | ||
| 2021.dash-1.15 We present the Everyday Living Artificial Intelligence (AI) Hub, a novel proof-of-concept framework for enhancing human health and wellbeing via a combination of tailored wear-able and Conversational Agent (CA) solutions for non-invasive monitoring of physiological signals, ***** assessment ***** of behaviors through unobtrusive wearable devices, and the provision of personalized interventions to reduce stress and anxiety. | ||
| P19-1035 This is a crucial step towards generating learner-adaptive exercises for self-directed language learning and preparing language ***** assessment ***** tests. | ||
| 2020.isa-1.9 Overcoming the deficiencies and gaps that were found, we propose a number of extensions to the ISO annotation scheme, making it a powerful analytical and modelling instrument for the analysis, modelling and ***** assessment ***** of medical communication. | ||
| 2020.wnut-1.9 In addition, we propose a novel application of language models to perform automatic linguistic quality ***** assessment *****. | ||
| deep syntactic | 13 | |
| L16-1566 Our higher-order parsing model, gaining thus up to 4 points, establishes the state of the art for parsing French ***** deep syntactic ***** structures. | ||
| 2009.iwslt-evaluation.15 To integrate ***** deep syntactic ***** information, we propose the use of parse trees and semantic dependencies on English sentences described respectively by Head-driven Phrase Structure Grammar and Predicate-Argument Structures. | ||
| L14-1410 We define a ***** deep syntactic ***** representation scheme for French, which abstracts away from surface syntactic variation and diathesis alternations, and describe the annotation of ***** deep syntactic ***** representations on top of the surface dependency trees of the Sequoia corpus. | ||
| C16-1040 Experiments conducted on a French corpus annotated with semantic frames showed that a semantic parser reaches better performances with such a ***** deep syntactic ***** input. | ||
| 2020.lrec-1.641 Altogether, the treebank contains around 180,000 sentences with their morphological, surface and ***** deep syntactic ***** annotation. | ||
| unsupervised domain | 13 | |
| 2020.coling-main.603 Motivated by the latest advances, in this survey we review neural ***** unsupervised domain ***** adaptation techniques which do not require labeled target domain data. | ||
| 2021.adaptnlp-1.2 In ***** unsupervised domain ***** adaptation, we aim to train a model that works well on a target domain when provided with labeled source samples and unlabeled target samples. | ||
| P19-1591 Pivot Based Language Modeling (PBLM) (Ziser and Reichart, 2018a), combining LSTMs with pivot-based methods, has yielded significant progress in ***** unsupervised domain ***** adaptation. | ||
| 2020.emnlp-main.497 On six ***** unsupervised domain ***** adaptation tasks involving named entity recognition, our method strongly outperforms the random masking strategy and achieves up to +1.64 F1 score improvements. | ||
| 2020.acl-main.370 In this paper, we investigate how to efficiently apply the pre-training language model BERT on the ***** unsupervised domain ***** adaptation. | ||
| corpus of spoken | 13 | |
| L06-1295 In this paper we present an application of AGTK to a ***** corpus of spoken ***** Italian annotated at many different linguistic levels. | ||
| W19-4013 This paper presents the identification of formulaic sequences in the reference ***** corpus of spoken ***** Slovenian and their annotation in terms of syntactic structure, pragmatic function and lexicographic relevance. | ||
| L14-1118 The most common approach to acquiring affect labels is to ask a panel of listeners to rate a ***** corpus of spoken ***** utterances along one or more dimensions of interest. | ||
| L14-1548 This paper presents a multimodal ***** corpus of spoken ***** human-human dialogues collected as participants played a series of Rapid Dialogue Games (RDGs). | ||
| L08-1080 We present a ***** corpus of spoken ***** dialogues between students and an adaptive Wizard-of-Oz tutoring system, in which student uncertainty was manually annotated in real time. | ||
| Romanian | 13 | |
| 2020.lrec-1.546 We present RONEC - the Named Entity Corpus for the ***** Romanian ***** language. | ||
| 2021.ranlp-1.37 In this paper we investigate the etymology of ***** Romanian ***** words. | ||
| L08-1343 In this work we propose a new strategy for the authorship identification problem and we test it on an example from ***** Romanian ***** literature: did Radu Albala find the continuation of Mateiu Caragiale's novel Sub pecetea tainei, or did he write the respective continuation himself? | ||
| 2021.acl-short.136 In this work, we introduce a corpus for satire detection in ***** Romanian ***** news. | ||
| L08-1020 The success rate of 96.53% for the automatic import of the temporal annotation from English to ***** Romanian ***** shows that the automatic transfer is an enterprise worth doing if temporality is to be studied in another language than the one for which TimeML, the annotation standard used, was developed. | ||
| this | 13 | |
| 2021.eacl-main.38 Typological features from databases such as the World Atlas of Language Structures (WALS) are a prime candidate for ***** this *****, as such data exists even for very low-resource languages. | ||
| L10-1221 Given the fact that an emotion is triggered by cause events and that cause events are an integral part of emotion, ***** this ***** paper constructs a Chinese emotion cause corpus as a first step towards automatic inference of cause-emotion correlation. | ||
| W18-5205 Building intelligent systems capable of discriminating useful content within ***** this ***** ocean of information is thus becoming an urgent need. | ||
| 2021.emnlp-main.714 To train most AMR parsers, one needs to segment the graph into subgraphs and align each such subgraph to a word in a sentence; ***** this ***** is normally done at preprocessing, relying on hand-crafted rules. | ||
| C16-1074 Although poetry is a literary form that makes use of standard meters usually repeated among different authors, we will see in ***** this ***** paper how performing such analyses is a difficult task in machine learning due to the unexpected deviations from such standard patterns. | ||
| physical | 13 | |
| L14-1491 We introduce a spoken language resource for the analysis of impact that *****physical***** exercising has on human speech production . | ||
| L14-1164 Digital libraries are frequently treated just as a new method of storage of digitized artifacts , with all consequences of transferring long - established ways of dealing with *****physical***** objects into the digital world . | ||
| 2021.sigdial-1.37 Intelligent agents that are confronted with novel concepts in situated environments will need to ask their human teammates questions to learn about the *****physical***** world . | ||
| 2020.sigdial-1.17 Spoken interaction with a *****physical***** robot requires a dialogue system that is modular , multimodal , distributive , incremental and temporally aligned . | ||
| W17-2807 As robots begin to cohabit with humans in semi - structured environments , the need arises to understand instructions involving rich variability ; for instance , learning to ground symbols in the *****physical***** world . | ||
| Visual Question Answering ( VQA | 13 | |
| 2020.findings-emnlp.417 We present MMFT - BERT(MultiModal FusionTransformer with BERT encodings ) , to solve *****Visual Question Answering ( VQA***** ) ensuring individual and combined processing of multiple input modalities . | ||
| 2020.emnlp-main.265 In the task of *****Visual Question Answering ( VQA***** ) , most state - of - the - art models tend to learn spurious correlations in the training set and achieve poor performance in out - of - distribution test data . | ||
| N18-1201 *****Visual Question Answering ( VQA***** ) is a well - known and challenging task that requires systems to jointly reason about natural language and vision . | ||
| D19-1596 While models for *****Visual Question Answering ( VQA***** ) have steadily improved over the years , interacting with one quickly reveals that these models lack consistency . | ||
| 2021.acl-long.317 While sophisticated neural - based models have achieved remarkable success in *****Visual Question Answering ( VQA***** ) , these models tend to answer questions only according to superficial correlations between question and answer . | ||
| Question answering ( QA ) | 13 | |
| 2020.findings-emnlp.171 *****Question answering ( QA )***** tasks have been posed using a variety of formats , such as extractive span selection , multiple choice , etc . | ||
| 2020.acl-main.653 *****Question answering ( QA )***** models have shown rapid progress enabled by the availability of large , high - quality benchmark datasets . | ||
| 2021.acl-srw.21 *****Question answering ( QA )***** models for reading comprehension have achieved human - level accuracy on in - distribution test sets . | ||
| R19-2011 *****Question answering ( QA )***** systems permit the user to ask a question using natural language , and the system provides a concise and correct answer . | ||
| D19-5827 With a large number of datasets being released and new techniques being proposed , *****Question answering ( QA )***** systems have witnessed great breakthroughs in reading comprehension ( RC ) tasks . | ||
| European | 13 | |
| W16-4619 Unlike *****European***** languages , many Asian languages like Chinese and Japanese do not have typographic boundaries in their written systems . | ||
| L14-1722 In this paper , we present a speech recording interface developed in the context of a project on automatic speech recognition for elderly native speakers of *****European***** Portuguese . | ||
| L14-1426 We present a corpus of child and child - directed speech of *****European***** Portuguese . | ||
| 1963.earlymt-1.20 The paper will investigate a few major construction types in several related *****European***** languages : relative clauses , attributive phrases , and certain instances of coordinate conjunction involving these constructions . | ||
| E17-2038 We present Arab - Acquis , a large publicly available dataset for evaluating machine translation between 22 *****European***** languages and Arabic . | ||
| natural language generation ( NLG ) | 13 | |
| 2020.eval4nlp-1.5 Evaluation is a bottleneck in the development of *****natural language generation ( NLG )***** models . | ||
| 2021.emnlp-main.53 We propose a novel framework to train models to classify acceptability of responses generated by *****natural language generation ( NLG )***** models , improving upon existing sentence transformation and model - based approaches . | ||
| P19-2032 Comments on social media are very diverse , in terms of content , style and vocabulary , which make generating comments much more challenging than other existing *****natural language generation ( NLG )***** tasks . | ||
| 2021.acl-long.468 A well - known limitation in pretrain - finetune paradigm lies in its inflexibility caused by the one - size - fits - all vocabulary . This potentially weakens the effect when applying pretrained models into *****natural language generation ( NLG )***** tasks , especially for the subword distributions between upstream and downstream tasks with significant discrepancy . | ||
| 2021.eacl-main.25 Despite growing interest in *****natural language generation ( NLG )***** models that produce diverse outputs , there is currently no principled method for evaluating the diversity of an NLG system . | ||
| meta - | 13 | |
| N18-2031 Creating accurate *****meta -***** embeddings from pre - trained source embeddings has received attention lately . | ||
| P19-1538 We present open domain dialogue generation with *****meta -***** words . | ||
| 2021.acl-long.545 In this paper we explore the improvement of intent recognition in conversational systems by the use of *****meta -***** knowledge embedded in intent identifiers . | ||
| P19-1589 In this paper we frame the task of supervised relation classification as an instance of *****meta -***** learning . | ||
| 2021.eacl-main.325 A prerequisite for the computational study of literature is the availability of properly digitized texts , ideally with reliable *****meta -***** data and ground - truth annotation . | ||
| machine reading comprehension ( MRC | 13 | |
| 2020.acl-main.361 Neural models have achieved great success on *****machine reading comprehension ( MRC***** ) , many of which typically consist of two components : an evidence extractor and an answer predictor . | ||
| 2020.acl-main.701 Many tasks aim to measure *****machine reading comprehension ( MRC***** ) , often focusing on question types presumed to be difficult . | ||
| D18-1235 The task of *****machine reading comprehension ( MRC***** ) has evolved from answering simple questions from well - edited text to answering real questions from users out of web data . | ||
| 2020.coling-main.248 Neural models have achieved great success on the task of *****machine reading comprehension ( MRC***** ) , which are typically trained on hard labels . | ||
| 2021.rocling-1.7 With the recent breakthrough of deep learning technologies , research on *****machine reading comprehension ( MRC***** ) has attracted much attention and found its versatile applications in many use cases . | ||
| cross - lingual word | 13 | |
| 2020.starsem-1.5 In this paper , we propose a novel method for learning *****cross - lingual word***** embeddings , that incorporates sub - word information during training , and is able to learn high - quality embeddings from modest amounts of monolingual data and a bilingual lexicon . | ||
| N19-1161 Recent approaches to *****cross - lingual word***** embedding have generally been based on linear transformations between the sets of embedding vectors in the two languages . | ||
| 2021.acl-long.506 Recent research on *****cross - lingual word***** embeddings has been dominated by unsupervised mapping approaches that align monolingual embeddings . | ||
| P19-1492 Recent research in *****cross - lingual word***** embeddings has almost exclusively focused on offline methods , which independently train word embeddings in different languages and map them to a shared space through linear transformations . | ||
| E17-1072 While *****cross - lingual word***** embeddings have been studied extensively in recent years , the qualitative differences between the different algorithms remain vague . | ||
| Word embedding | 13 | |
| W19-0423 *****Word embedding***** representations provide good estimates of word meaning and give state - of - the art performance in semantic tasks . | ||
| 2020.lrec-1.581 *****Word embedding***** learning is the task to map each word into a low - dimensional and continuous vector based on a large corpus . | ||
| N18-2116 *****Word embedding***** parameters often dominate overall model sizes in neural methods for natural language processing . | ||
| D18-1521 *****Word embedding***** models have become a fundamental component in a wide range of Natural Language Processing ( NLP ) applications . | ||
| P19-1162 *****Word embedding***** models have gained a lot of traction in the Natural Language Processing community , however , they suffer from unintended demographic biases . | ||
| Neural machine | 13 | |
| 2020.emnlp-main.364 *****Neural machine***** translation achieves impressive results in high - resource conditions , but performance often suffers when the input domain is low - resource . | ||
| D19-1446 *****Neural machine***** translation , which achieves near human - level performance in some languages , strongly relies on the large amounts of parallel sentences , which hinders its applicability to low - resource language pairs . | ||
| N18-3014 *****Neural machine***** translation has achieved levels of fluency and adequacy that would have been surprising a short time ago . | ||
| D18-1396 *****Neural machine***** translation usually adopts autoregressive models and suffers from exposure bias as well as the consequent error propagation problem . | ||
| W19-3821 *****Neural machine***** translation has significantly pushed forward the quality of the field . | ||
| Edited News | 13 | |
| 2020.semeval-1.136 Task 7 , Assessing the Funniness of *****Edited News***** Headlines , in the International Workshop SemEval2020 introduces two sub - tasks to predict the funniness values of edited news headlines from the Reddit website . | ||
| 2020.semeval-1.129 In this paper we describe our system submitted to SemEval 2020 Task 7 : Assessing Humor in *****Edited News***** Headlines . | ||
| 2020.semeval-1.140 We describe the UTFPR system for SemEval-2020 's Task 7 : Assessing Humor in *****Edited News***** Headlines . | ||
| 2020.semeval-1.135 This paper presents two different systems for the SemEval shared task 7 on Assessing Humor in *****Edited News***** Headlines , sub - task 1 , where the aim was to estimate the intensity of humor generated in edited headlines . | ||
| 2020.semeval-1.142 This paper describes xsysigma team 's system for SemEval 2020 Task 7 : Assessing the Funniness of *****Edited News***** Headlines . | ||
| controlled | 13 | |
| L10-1085 This paper is concerned with resources for *****controlled***** languages for alert messages and protocols in the European perspective . | ||
| W16-4710 This paper presents the construction and evaluation of Japanese and English controlled bilingual terminologies that are particularly intended for *****controlled***** authoring and machine translation with special reference to the Japanese municipal domain . | ||
| P19-2055 In this work , we consider the medical concept normalization problem , i.e. , the problem of mapping a health - related entity mention in a free - form text to a concept in a *****controlled***** vocabulary , usually to the standard thesaurus in the Unified Medical Language System ( UMLS ) . | ||
| 2003.mtsummit-papers.34 This paper presents a source language diagnostic system for *****controlled***** translation . | ||
| D19-5605 We propose a simple and effective modeling framework for *****controlled***** generation of multiple , diverse outputs . | ||
| back - | 13 | |
| 2021.vardial-1.6 The most successful approach to Neural Machine Translation ( NMT ) when only monolingual training data is available , called unsupervised machine translation , is based on *****back -***** translation where noisy translations are generated to turn the task into a supervised one . | ||
| 2021.wmt-1.11 This system paper describes an end - to - end NMT pipeline for the Japanese English news translation task as submitted to WMT 2021 , where we explore the efficacy of techniques such as tokenizing with language - independent and language - dependent tokenizers , normalizing by orthographic conversion , creating a politeness - and - formality - aware model by implementing a tagger , *****back -***** translation , model ensembling , and n - best reranking . | ||
| D19-6313 This paper presents an exploratory study that aims to evaluate the usefulness of *****back -***** translation in Natural Language Generation ( NLG ) from semantic representations for non - English languages . | ||
| W19-5206 Recent work in Neural Machine Translation ( NMT ) has shown significant quality gains from noised - beam decoding during *****back -***** translation , a method to generate synthetic parallel data . | ||
| 1995.iwpt-1.26 We also built a smaller version of the grammar based on higher frequency patterns for use as a *****back -***** up when the larger grammar is unable to produce a parse due to memory limitation . | ||
| Named entity | 13 | |
| 2020.emnlp-main.133 *****Named entity***** recognition and relation extraction are two important fundamental problems . | ||
| 2020.aacl-main.74 *****Named entity***** disambiguation is an important task that plays the role of bridge between text and knowledge . | ||
| 2021.bsnlp-1.11 *****Named entity***** recognition , in particular for morphologically rich languages , is a challenging task due to the richness of inflected forms and ambiguity . | ||
| L12-1664 *****Named entity***** recognition , which focuses on the identification of the span and type of named entity mentions in texts , has drawn the attention of the NLP community for a long time . | ||
| Q16-1026 *****Named entity***** recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance . | ||
| code - | 13 | |
| 2021.emnlp-main.656 Spoken dialogue systems need to be able to handle both multiple languages and multilinguality inside a conversation ( e.g in case of *****code -***** switching ) . | ||
| N18-2096 Social media is known for its multi - cultural and multilingual interactions , a natural product of which is *****code -***** mixing . | ||
| P18-3008 While growing code - mixed content on Online Social Networks ( OSN ) provides a fertile ground for studying various aspects of *****code -***** mixing , the lack of automated text analysis tools renders such studies challenging . | ||
| W18-3208 One language is often assumed to be dominant in *****code -***** switching but this assumption has not been empirically tested . | ||
| 2021.calcs-1.17 Usage - based analyses of teacher corpora and *****code -***** switching ( Boztepe , 2003 ) are an important next stage in understanding language acquisition . | ||
| phrase - based | 13 | |
| E17-2099 Many errors in *****phrase - based***** SMT can be attributed to problems on three linguistic levels : morphological complexity in the target language , structural differences and lexical choice . | ||
| 2012.iwslt-papers.17 We present a new approach to domain adaptation for SMT that enriches standard *****phrase - based***** models with lexicalised word and phrase pair features to help the model select appropriate translations for the target domain ( TED talks ) . | ||
| 2021.emnlp-main.64 We introduce SelfExplain , a novel self - explaining model that explains a text classifier 's predictions using *****phrase - based***** concepts . | ||
| W18-2710 Despite impressive progress in high - resource settings , Neural Machine Translation ( NMT ) still struggles in low - resource and out - of - domain scenarios , often failing to match the quality of *****phrase - based***** translation . | ||
| 2017.iwslt-1.18 Bilingual sequence models improve *****phrase - based***** translation and reordering by overcoming phrasal independence assumption and handling long range reordering . | ||
| parallelism | 12 | |
| 2020.lrec-1.683 Because we chose not to contact the original authors for our reproduction study, the uncertainty about the degree of ***** parallelism ***** that was achieved between the original study and our reproduction limits the value of our findings as an assessment of the reliability of the original results. | ||
| 1995.iwpt-1.17 All possible interpretations concerning comma usages and coordinate structure scopes are ranked by taking advantage of ***** parallelism ***** between conjoined phrases/clauses/sentences and calculating their similarity scores. | ||
| 1991.mtsummit-papers.15 The goal of our work is to develop a scalable and high-performance memory-based machine translation system which utilizes the high degree of ***** parallelism ***** provided by the SNAP machine. | ||
| 2020.wmt-1.110 The National Research Council of Canada's team submissions to the parallel corpus filtering task at the Fifth Conference on Machine Translation are based on two key components: (1) iteratively refined statistical sentence alignments for extracting sentence pairs from document pairs and (2) a crosslingual semantic textual similarity metric based on a pretrained multilingual language model, XLM-RoBERTa, with bilingual mappings learnt from a minimal amount of clean parallel data for scoring the ***** parallelism ***** of the extracted sentence pairs. | ||
| W89-0232 As of now, especially the procedural aspects have received attention: instead of having wild-running uncontrollable interactions, PEP restricts the interactions to explicit communications on a structured blackboard; the communication protocols are a compromise between maximum ***** parallelism ***** and controllability | ||
| Assuming | 12 | |
| 2021.emnlp-main.73 ***** Assuming ***** that the capacity to process information is roughly constant across human populations, we expect a surprisal–duration trade-off to arise both across and within languages. | ||
| D19-1394 ***** Assuming ***** access to unlabeled utterances from the true distribution, we combine crowdsourcing with a paraphrase model to detect correct logical forms for the unlabeled utterances. | ||
| L08-1404 ***** Assuming ***** that the right to use a particular application or resource is licensed by the rightful owner, the user is faced with the often not so easy task of interfacing it with his/her own systems. | ||
| W19-8660 ***** Assuming ***** an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as “why” questions that require the agent comparing actual behavior to counterfactual trajectories with respect to these rules. | ||
| L14-1409 ***** Assuming ***** so, we compute a representativeness score for a corpus by extracting word frequency and word association statistics from it and by comparing these statistics to the human data | ||
| composing | 12 | |
| 2021.naacl-main.381 Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across the long document, and (2) ***** composing ***** a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting these facts. | ||
| W17-4118 We present a general-purpose tagger based on convolutional neural networks (CNN), used for both ***** composing ***** word vectors and encoding context information. | ||
| P17-2059 In contrast, the higher layers compose meaningful phrases and clauses, whose lengths increase as the networks get deeper until fully ***** composing ***** the sentence. | ||
| 2021.acl-short.136 We gathered 55,608 public news articles from multiple real and satirical news sources, ***** composing ***** one of the largest corpora for satire detection regardless of language and the only one for the Romanian language. | ||
| 2021.eval4nlp-1.14 Among the different tested methods, ***** composing ***** explanations in the form of attention weights scaled by the norm of value vectors yielded the best results | ||
| appropriateness | 12 | |
| 2020.bea-1.18 Because producing sentences in this way requires the speaker to integrate syntactic and semantic knowledge in a complex manner, responses are typically evaluated on several different dimensions of ***** appropriateness ***** yielding a single composite score for each response. | ||
| L10-1071 Working within the EU funded COMPANIONS program, we investigate the use of ***** appropriateness ***** as a measure of conversation quality, the hypothesis being that good companions need to be good conversational partners. | ||
| L14-1186 The ***** appropriateness ***** of this expansion process is assessed by examining the structural coherence of the expanded set and by validating the expanded lexicon against human judgment. | ||
| 2020.acl-main.54 In the human evaluation, our dialogue system achieved the success rate of 68.32%, the language understanding score of 4.149, and the response ***** appropriateness ***** score of 4.287, which ranked the system at the top position in the end-to-end multi-domain dialogue system task in the 8th dialogue systems technology challenge (DSTC8). | ||
| I17-1072 A relation sequence model (RSM) is proposed to encode the sequence of ***** appropriateness ***** of current response with respect to the earlier utterances | ||
| sentiments | 12 | |
| L14-1194 This study seeks to examine the effectiveness of this approach by applying factuality annotations, based on FactBank, on top of the MPQA Corpus, a corpus containing news texts annotated for ***** sentiments ***** and other private states. | ||
| L16-1008 In our proposed framework, (1) we use ANEW, a lexical dictionary to identify affective emotional feelings associated to a message according to the Russell's model of affection; (2) we design a topic modeling mechanism called Sent_LDA, based on the Latent Dirichlet Allocation (LDA) generative model, which allows us to find the topic distribution in a general conversation and we associate topics with emotions; (3) we detect communities in the network according to the density and frequency of the messages among the users; and (4) we compare the ***** sentiments ***** of the communities by using the Russell's model of affect versus polarity and we measure the extent to which topic distribution strengthen likeness in the ***** sentiments ***** of the users of a community. | ||
| C16-1248 Aspect-level analysis of ***** sentiments ***** contained in a review text is important to reveal a detailed picture of consumer opinions. | ||
| 2020.lrec-1.328 We present in this paper our work on Algerian language, an under-resourced North African colloquial Arabic variety, for which we built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for ***** sentiments *****. | ||
| W19-0509 We show that the proposed method, Latent Semantic Analysis with explicit word features, finds topics with a much smaller bias for ***** sentiments ***** than other similar methods | ||
| IDs | 12 | |
| 2021.sigtyp-1.12 This memo describes NTR-TSU submission for SIGTYP 2021 Shared Task on predicting language ***** IDs ***** from speech. | ||
| N19-1252 This approach presumes a sequence of cluster ***** IDs ***** is a 'ciphertext' and seeks a POS tag-to-cluster ID mapping that will reveal the POS sequence. | ||
| 2021.acl-long.383 In these tasks, user and item ***** IDs ***** are important identifiers for personalization. | ||
| 2021.calcs-1.10 Experimental results show that including language ***** IDs ***** to the learning model significantly improves accuracy over other approaches. | ||
| E17-1072 We observe that whether or not an algorithm uses a particular feature set (sentence ***** IDs *****) accounts for a significant performance gap among these algorithms | ||
| extent | 12 | |
| 2021.blackboxnlp-1.4 We inspect to which ***** extent ***** neural language models (LMs) exhibit uncertainty over such analyses when processing temporarily ambiguous inputs, and how that uncertainty is modulated by disambiguating cues. | ||
| 2020.findings-emnlp.22 While several works have investigated the fluency and grammatical correctness of such models, it is still unclear to which ***** extent ***** the generated text is consistent with factual world knowledge. | ||
| P19-1103 Experiments on three popular datasets using convolutional as well as LSTM models show that PWWS reduces the classification accuracy to the most ***** extent *****, and keeps a very low word substitution rate. | ||
| 2020.fnp-1.31 This raises the question to which ***** extent ***** the information is actually included and if this information is at all relevant for investors. | ||
| 2021.ranlp-1.72 The Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA) pretraining model replaces the BERT pretraining method's masked language modeling with a method called replaced token detection, which improves the computational efficiency and allows the additional pretraining of the model to a practical ***** extent ***** | ||
| citations | 12 | |
| Q17-1014 We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, ***** citations *****, and topical authority in a corpus of academic papers. | ||
| 2020.acl-demos.27 The interactive visualizations presented here, and the associated dataset of papers mapped to ***** citations *****, have additional uses as well including understanding how the field is growing (both overall and across sub-areas), as well as quantifying the impact of different types of papers on subsequent publications. | ||
| L14-1027 We manually annotate ***** citations ***** in 50 patents, train a CRF classifier to find new ***** citations *****, and apply a reranker to incorporate non-local information. | ||
| 2020.acl-main.702 In this work, we examine female first author percentages and the ***** citations ***** to their papers in Natural Language Processing (1965 to 2019). | ||
| 2020.wanlp-1.12 We present our work on automatically detecting isnads, the chains of authorities for a report that serve as ***** citations ***** in hadith and other classical Arabic texts | ||
| geometry | 12 | |
| 2020.textgraphs-1.3 To merge text segments, we introduce a novel mechanism that captures both ***** geometry ***** information as well as semantic information based on pre-trained language model. | ||
| D18-1180 In this paper we match each preposition's left- and right context, and their interplay to the ***** geometry ***** of the word vectors to the left and right of the preposition. | ||
| N19-1025 Through a series of experiments and analysis over latent space, we show that our model learns latent distributions that respect latent space ***** geometry ***** and is able to generate sentences that are more diverse. | ||
| S17-1029 We collect a new dataset of demonstrative ***** geometry ***** solutions from textbooks and explore approaches that learn to interpret these demonstrations as well as to use these interpretations to solve ***** geometry ***** problems. | ||
| 2021.eacl-main.144 Whereas this strategy results in high performance, it is difficult to interpret these representations in relation to the ***** geometry ***** of the underlying tree structure | ||
| Bidirectional Encoder Representation | 12 | |
| 2021.wanlp-1.42 Our MTL model's architecture consists of a ***** Bidirectional Encoder Representation ***** from Transformers (BERT) model, a multi-task attention interaction module, and two task classifiers. | ||
| 2020.semeval-1.278 For sub-task B, we opted to fine-tune a ***** Bidirectional Encoder Representation ***** from a Transformer (BERT) to accommodate the limited data for categorizing offensive tweets. | ||
| 2020.semeval-1.293 In the common task of Offensive Language Identification in Social Media, pre-trained models such as ***** Bidirectional Encoder Representation ***** from Transformer (BERT) have achieved good results. | ||
| 2020.semeval-1.131 In the shared task of assessing the funniness of edited news headlines, which is a part of the SemEval 2020 competition, we preprocess datasets by replacing abbreviation, stemming words, then merge three models including Light Gradient Boosting Machine (LightGBM), Long Short-Term Memory (LSTM), and ***** Bidirectional Encoder Representation ***** from Transformer (BERT) by taking the average to perform the best. | ||
| S19-2011 In the shared task of identifying and categorizing offensive language in social media, we preprocess the dataset according to the language behaviors on social media, and then adapt and fine-tune the ***** Bidirectional Encoder Representation ***** from Transformer (BERT) pre-trained by Google AI Language team | ||
| comprehending | 12 | |
| 2021.acl-long.478 (Explicit guidance on how to resolve Conversational Dependency) to enhance the abilities of QA models in ***** comprehending ***** conversational context. | ||
| W16-4122 The results indicated that the enhanced version, which encompasses the ASR errors addresses most of the L2 learners' difficulties and better assists them in ***** comprehending ***** challenging video segments as compared with the baseline. | ||
| W18-3702 Among the challenges of teaching reading comprehension in K – 12 are identifying the portions of a text that are difficult for a student, ***** comprehending ***** major critical ideas, and understanding context-dependent polysemous words. | ||
| K19-1041 This paper addresses the problem of ***** comprehending ***** procedural commonsense knowledge. | ||
| 2020.emnlp-main.119 Computational and cognitive studies of event understanding suggest that identifying, ***** comprehending *****, and predicting events depend on having structured representations of a sequence of events and on conceptualizing (abstracting) its components into (soft) event categories | ||
| dimensions | 12 | |
| 2020.coling-main.402 We demonstrate the feasibility of large-scale AQ annotation, show that exploiting relations between ***** dimensions ***** yields performance improvements, and explore the synergies between theory-based prediction and practical AQ assessment. | ||
| 2020.coling-main.401 We show that two elementary ***** dimensions ***** of aspectual class, states vs. events, and telic vs. atelic events, can be modelled effectively with distributional semantics. | ||
| W18-4006 In conceptual spaces, among others, ***** dimensions ***** are interpretable and grouped into facets, and properties and concepts are explicitly modelled as (vague) regions. | ||
| I17-2006 Through our evaluations on standard word embedding evaluation tasks, we show that for ***** dimensions ***** higher than or equal to the bound, we get better results as compared to the ones below it. | ||
| W19-4734 Linguistic factors can be understood as data ***** dimensions ***** which show complex interrelationships | ||
| Gigaword corpus | 12 | |
| 2020.nuse-1.14 By augmenting GPT 2.0 with information retrieval we achieve a zero shot 15% relative reduction in perplexity on ***** Gigaword corpus ***** without any re-training. | ||
| Q16-1024 We evaluate SNM language models on two corpora: the One Billion Word Benchmark and a subset of the LDC English ***** Gigaword corpus *****. | ||
| 2013.iwslt-evaluation.19 For the Turkish-to-English direction, we use ***** Gigaword corpus ***** as an additional language model with the training data. | ||
| 2012.iwslt-evaluation.13 Part of this improvement was due to the use of an extraction from the ***** Gigaword corpus *****. | ||
| C16-1085 In this paper, we adopt the Chinese ***** Gigaword corpus ***** and HSK corpus as L1 and L2 corpora, respectively | ||
| 1a | 12 | |
| 2021.semeval-1.171 For Task ***** 1a *****, we apply an ensemble of fine-tuned pre-trained language models; for Tasks 1b, 1c, and 2a, we investigate various tree-based and linear machine learning models. | ||
| 2021.semeval-1.163 Our best model consists of an ensemble of all tested configurations, and achieves a 95.66% F1-score and 94.70% accuracy for Task ***** 1a *****, while obtaining RMSE scores of 0.6200 and 0.5318 for Tasks 1b and 2, respectively. | ||
| 2021.smm4h-1.14 Our text classification submissions have achieved competitive performance with F1-score of 0.46 and 0.90 on ADE Classification (Task ***** 1a *****) and Profession Classification (Task 7a) respectively. | ||
| 2020.sdp-1.32 In Task ***** 1a *****, we apply and compare different methods in combination with similarity scores to identify spans of the reference text for the given citance. | ||
| 2021.smm4h-1.24 We developed a system based on RoBERTa (for Task ***** 1a ***** & 4) and BioBERT (for Task 8) | ||
| enumeration | 12 | |
| 2021.acl-long.497 Towards these issues, we propose a neural transition-based model for argumentation mining, which incrementally builds an argumentation graph by generating a sequence of actions, avoiding inefficient ***** enumeration ***** operations. | ||
| L12-1618 The described datasource provides an appropriate means for modeling an individual's pursuit of power within an on-line discussion group and also allows for ***** enumeration ***** and validation of current theories on the ways in which individuals strive for power. | ||
| N19-1308 We further observe that the span ***** enumeration ***** approach is good at detecting nested span entities, with significant F1 score improvement on the ACE dataset. | ||
| W89-0230 For ***** enumeration ***** of k parses in the order of the total weight of all applied productions, the time and space complexities of our algorithm are O(n^3 + kn^2) and O(n^3 + kn), respectively. | ||
| C16-1105 In this work we propose the use of imitation learning for structured prediction which learns an incremental model that handles the large search space by avoiding explicit ***** enumeration ***** of the outputs | ||
| personas | 12 | |
| 2020.findings-emnlp.238 Still, the representation of such ***** personas ***** has thus far been limited to a fact-based representation (e.g. “I have two cats.”). | ||
| 2020.aacl-main.65 In this paper, we present a model called DAPPER that can learn to embed persona from natural language and alleviate task or domain-specific data sparsity issues related to ***** personas *****. | ||
| 2020.findings-emnlp.127 Experimental results show that FIRE outperforms previous methods by margins larger than 2.8% and 4.1% on the PERSONA-CHAT dataset with original and revised ***** personas ***** respectively, and margins larger than 3.1% on the CMU_DoG dataset in terms of top-1 accuracy. | ||
| D18-1298 (2018) showed that the engagement level of end-to-end dialogue models increases when conditioning them on text ***** personas ***** providing some personalized back-story to the model. | ||
| 2021.emnlp-main.86 To tackle this issue, we study a new task, named Speaker Persona Detection (SPD), which aims to detect speaker ***** personas ***** based on the plain conversational text | ||
| exponential | 12 | |
| E17-2028 We offer a new interpretation of skip-gram based on ***** exponential ***** family PCA-a form of matrix factorization to generalize the skip-gram model to tensor factorization. | ||
| N18-1119 However, while theoretically sound, existing approaches have computational complexities that are either linear (Hokamp and Liu, 2017) or ***** exponential ***** (Anderson et al., 2017) in the number of constraints. | ||
| 2021.wanlp-1.51 Within the last few years, the number of Arabic internet users and Arabic online content is in ***** exponential ***** growth. | ||
| W18-5414 We observe that the model requires ***** exponential ***** memory in terms of the number of characters and embedded depth, where a sub-linear memory should suffice. | ||
| 2021.naacl-main.87 We further present a practical simulated ***** exponential ***** mechanism that has efficient inference with certified robustness | ||
| GENIA | 12 | |
| 2020.findings-emnlp.114 On BioNLP 2011 ***** GENIA ***** Event Extraction task, our approach achieved 1.41% F1 and 3.19% F1 improvements on all events and complex events, respectively. | ||
| L08-1237 Inspired by the work in the ***** GENIA ***** Corpus, which is one of the very few of such corpora, extensively used in the biomedical field, and in order to fulfil the needs of our research, we have collected a Swedish medical corpus, the MEDLEX Corpus. | ||
| N18-1131 Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving 74.7% and 72.2% on ***** GENIA ***** and ACE2005 datasets, respectively, in terms of F-score. | ||
| 2021.acl-long.275 We provide extensive experimental results on ACE2004, ACE2005, and ***** GENIA ***** datasets to show the effectiveness and efficiency of our proposed method. | ||
| L08-1071 In our experiments, we built a simple machine learning-based pronoun resolution system, and evaluated the system on three different corpora: MUC, ACE, and ***** GENIA ***** | ||
| STAPLE | 12 | |
| 2020.ngt-1.18 We present our submission to the Simultaneous Translation And Paraphrase for Language Education (***** STAPLE *****) challenge. | ||
| 2020.ngt-1.13 In this paper, we introduce a system built for the Duolingo Simultaneous Translation And Paraphrase for Language Education (***** STAPLE *****) shared task at the 4th Workshop on Neural Generation and Translation (WNGT 2020). | ||
| 2020.ngt-1.19 This paper describes our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (***** STAPLE *****). | ||
| 2020.ngt-1.21 Unlike the standard machine translation task, ***** STAPLE ***** requires generating a set of outputs for a given input sequence, aiming to cover the space of translations produced by language learners. | ||
| 2020.ngt-1.22 This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (***** STAPLE *****) | ||
| dictation | 12 | |
| 2012.iwslt-papers.13 The system captures the user speech which is the ***** dictation ***** of the target language sentence. | ||
| L16-1635 This paper introduces and evaluates a corpus of more than 55 hours of English-to-Japanese user activity data that were collected within the ENJA15 project, in which translators were observed while writing and speaking translations (translation ***** dictation *****) and during machine translation post-editing. | ||
| 2008.iwslt-papers.3 We also found that ***** dictation ***** using an ASR with WER of 4 percent or less would have resulted in statistically significant (p less than 0.6) productivity gains in the order of 25.1 percent to 44.9 percent Translated Words Per Minute. | ||
| 2020.nlpmc-1.8 Experiments performed on ***** dictation ***** and conversational style corpora show that our proposed model achieves 5% absolute improvement on ground truth text and 10% improvement on ASR outputs over baseline models under F1 metric | ||
| L16-1124 In this paper the authors present a speech corpus designed and created for the development and evaluation of ***** dictation ***** systems in Latvian. | ||
| CKY | 12 | |
| 1995.iwpt-1.6 We show how the addition of `link counters' to standard parsing algorithms such as ***** CKY *****- and Earley-based methods for TAG results in a polynomial time complexity algorithm for parsing lexicalized V-TAG, a multi-component version of TAGs defined in (Rambow, 1994). | ||
| I17-2001 This paper proposes a new attention mechanism for neural machine translation (NMT) based on convolutional neural networks (CNNs), which is inspired by the ***** CKY ***** algorithm. | ||
| W03-3007 Based on an algorithm proposed by [Nederhof and Satta, 2002] for the non-probabilistic case, left-to-right strategies for the search for the best solution based on ***** CKY ***** and Earley parsers are discussed. | ||
| E17-2049 Our experimental results using the English Penn Treebank corpus show that the proposed algorithm is faster than the standard ***** CKY ***** parsing algorithm. | ||
| 2020.iwpt-1.14 Our process outperforms a ***** CKY ***** baseline and other Spanish parsers in terms of global metrics and also for some specific Spanish phenomena, such as clitics reduplication and relative referents | ||
| coded | 12 | |
| N18-1182 Automatic identification of spurious instances (those with potentially wrong labels in datasets) can improve the quality of existing language resources, especially when annotations are obtained through crowdsourcing or automatically generated based on ***** coded ***** rankings. | ||
| W17-2335 We identified a random sample of 2000 Veterans Administration patients, ***** coded ***** as current tobacco users, from 2008 to 2014. | ||
| 2020.acl-main.11 This formulation allows for a simple integration of conversational knowledge ***** coded ***** in large pretrained conversational models such as ConveRT (Henderson et al., 2019). | ||
| 2021.rocling-1.18 The interruption behavior during the conversation was also ***** coded ***** and analyzed. | ||
| 2021.emnlp-main.29 Despite much attention being paid to characterize and detect discriminatory speech, most work has focused on explicit or overt hate speech, failing to address a more pervasive form based on ***** coded ***** or indirect language | ||
| discrepancy | 12 | |
| 2021.emnlp-main.264 To alleviate the above ***** discrepancy *****, we propose scheduled sampling methods based on decoding steps, increasing the selection chance of predicted tokens with the growth of decoding steps. | ||
| D19-1429 To take the multi-level domain relevance ***** discrepancy ***** into account, in this paper, we propose a fine-grained knowledge fusion model with the domain relevance modeling scheme to control the balance between learning from the target domain data and learning from the source domain model. | ||
| 2021.acl-long.420 Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from ***** discrepancy ***** between the two stages. | ||
| 2021.semeval-1.5 The main contributions of our system are 1) revealing the performance ***** discrepancy ***** of different transformer-based pretraining models on the downstream task, 2) presentation of an efficient method to generate large task-adaptive corpora for pretraining. | ||
| 2020.emnlp-main.599 We further extend CFd to a cross-language setting, in which language ***** discrepancy ***** is studied | ||
| scenarios | 12 | |
| 2020.coling-main.606 It is even more challenging for domains where limited training data is available or ***** scenarios ***** in which the length of the summary is not known beforehand. | ||
| N18-3007 While the RF models are competitive for ***** scenarios ***** with smaller amounts of training data and somewhat more robust, they are clearly outperformed by the SN models when the amount of training data is larger. | ||
| 2020.findings-emnlp.425 It is also shown that reasonable performance is obtained when ZEN is trained on a small corpus, which is important for applying pre-training techniques to ***** scenarios ***** with limited data. | ||
| W19-8656 In our work, we rely on a newly developed prediction model, which assigns patients to ***** scenarios *****. | ||
| 2021.emnlp-main.209 This paper studies the keyphrase generation (KG) task for ***** scenarios ***** where structure plays an important role | ||
| distilled | 12 | |
| 2020.emnlp-main.29 Our implementation, along with ***** distilled ***** test suites for eleven Text-to-SQL datasets, is publicly available. | ||
| 2021.emnlp-main.829 Experiments consider classification and generation tasks, yielding among other results a pruned model that is a 2.4x faster, 74% smaller BERT on SQuAD v1, with a 1% drop on F1, competitive both with ***** distilled ***** models in speed and pruned models in size. | ||
| 2021.acl-long.408 In this work, we demonstrate that our proposed distillation method, which is a simple extension of CBOW-based training, allows to significantly improve computational efficiency of NLP applications, while outperforming the quality of existing static embeddings trained from scratch as well as those ***** distilled ***** from previously proposed methods. | ||
| 2020.blackboxnlp-1.9 Finally, we demonstrate the utility of our ***** distilled ***** representations by showing that they outperform the original contextualized representations in a few-shot parsing setting. | ||
| 2020.sustainlp-1.5 We show that classification tasks that require the capturing of general lexical semantics can be successfully ***** distilled ***** by very simple and efficient models and require relatively small amount of labeled training data | ||
| interdisciplinary | 12 | |
| 2021.acl-long.321 We argue that reliability testing — with an emphasis on ***** interdisciplinary ***** collaboration — will enable rigorous and targeted testing, and aid in the enactment and enforcement of industry standards. | ||
| 2020.isa-1.11 The purpose of this paper is to present a prospective and ***** interdisciplinary ***** research project seeking to ontologize knowledge of the domain of Outsider Art, that is, the art created outside the boundaries of official culture. | ||
| 2021.case-1.3 Many challenges remain, and these are best addressed in collaborative projects which build on ***** interdisciplinary ***** expertise. | ||
| 2020.conll-1.51 The paper presents the first dataset that aims to serve ***** interdisciplinary ***** purposes for the utility of computer vision community and sign language linguistics. | ||
| 2020.lrec-1.255 Improving access for outsiders can help ***** interdisciplinary ***** research like Nature Inspired Engineering | ||
| readable | 12 | |
| 2020.inlg-1.3 It goes without saying that ethical considerations rise with these sensitive documents made ***** readable ***** and available at scale, legitimate concerns that we address in this paper. | ||
| L14-1213 This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, ***** readable ***** pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework. | ||
| 2020.coling-main.65 We also find that BERT's capacity to encode different kind of linguistic properties has a positive influence on its predictions: the more it stores ***** readable ***** linguistic information of a sentence, the higher will be its capacity of predicting the expected label assigned to that sentence. | ||
| K18-1040 Despite being exposed to no target data, our unsupervised models learn to generate imperfect but reasonably ***** readable ***** sentence summaries. | ||
| 2020.acl-demos.38 The library utilizes batched, vectorized operations and exploits auto-differentiation to produce ***** readable *****, fast, and testable code | ||
| unnatural | 12 | |
| 2020.coling-main.434 To capture the diversity when applied to natural questions, we learn a projection model to map natural questions into their most similar ***** unnatural ***** questions for which the parser can work well. | ||
| D19-1303 However, maximizing the noisy sensationalism reward will generate ***** unnatural ***** phrases instead of sensational headlines. | ||
| 2020.emnlp-main.113 However, current augmentation approaches such as random insertion or repetition fail to resemble training corpus well and usually resulted in ***** unnatural ***** and limited types of disfluencies. | ||
| W17-4416 This paper presents an effective method of distinguishing ***** unnatural ***** language from natural language, and evaluates the impact of un-natural language detection on NLP tasks such as document clustering. | ||
| 2021.naacl-main.400 Existing techniques of generating such examples are typically driven by local heuristic rules that are agnostic to the context, often resulting in ***** unnatural ***** and ungrammatical outputs | ||
| nested NER | 12 | |
| 2020.acl-main.525 The proposed method achieves state-of-the-art F1 scores in ***** nested NER ***** on ACE-2004, ACE-2005, GENIA, and NNE, which are 80.27, 79.42, 77.78, and 93.70 with conventional embeddings, and 87.74, 86.34, 79.31, and 94.68 with pre-trained contextualized embeddings. | ||
| P19-1510 We hope the public release of this large dataset for English newswire will encourage development of new techniques for ***** nested NER *****. | ||
| N18-1131 Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on ***** nested NER *****, achieving 74.7% and 72.2% on GENIA and ACE2005 datasets, respectively, in terms of F-score. | ||
| 2020.findings-emnlp.430 Usually, some entity mentions are nested in other entities, which leads to the ***** nested NER ***** problem. | ||
| 2020.lrec-1.1 For ***** nested NER *****, the evaluation of our model on the GENIA corpora shows that our model matches or outperforms state-of-the-art models despite not being specifically designed for this task | ||
| adjacency | 12 | |
| 2020.emnlp-main.583 We point out that both graph structure and ***** adjacency ***** matrix are task-related prior knowledge, and graph-attention can be considered as a special case of self-attention. | ||
| I17-1073 We propose a novel SimCluster algorithm that extends standard K-means algorithm to simultaneously cluster user utterances and agent utterances by taking their ***** adjacency ***** information into account. | ||
| 2020.coling-main.461 However, traditional GCN simply takes word nodes and ***** adjacency ***** matrix to represent graphs, which is difficult to establish direct connections between distant entity pairs. | ||
| 2021.acl-long.303 Our neural model learns to Accept, Break, Copy or Drop elements of a graph that combines word ***** adjacency ***** and grammatical dependencies. | ||
| 2021.eacl-main.34 We explore four different discriminator objectives which each capture a different aspect of coherence, including whether salient spans of generated abstracts are hallucinated or appear in the input context, and the likelihood of sentence ***** adjacency ***** in generated abstracts | ||
| episodic | 12 | |
| P19-1258 Furthermore, we introduce a context-sensitive perceptual process for the token representations of dialog history, and then feed them into the ***** episodic ***** memory. | ||
| 2021.naacl-main.142 By learning to compute distances among the senses of a given word through ***** episodic ***** training, MetricWSD transfers knowledge (a learned metric space) from high-frequency words to infrequent ones. | ||
| D19-3025 The system can perform QA over memories by responding to user queries to recall specific attributes and associated media (e.g. photos) of past ***** episodic ***** memories. | ||
| 2020.acl-main.178 Additionally, we measure the differential recruitment of knowledge attributed to semantic memory versus ***** episodic ***** memory (Tulving, 1972) for imagined and recalled storytelling by comparing the frequency of descriptions of general commonsense events with more specific realis events. | ||
| K19-1068 We introduce Episodic Memory QA, the task of answering personal user questions grounded on memory graph (MG), where ***** episodic ***** memories and related entity nodes are connected via relational edges. | ||
| coronavirus | 12 | |
| 2020.knlp-1.1 The ***** coronavirus ***** disease (COVID-19) has claimed the lives of over one million people and infected more than thirty-five million people worldwide. | ||
| 2021.smm4h-1.25 Since the outbreak of ***** coronavirus ***** at the end of 2019, there have been numerous studies on coronavirus in the NLP arena. | ||
| 2021.acl-srw.28 The internet has actually come to be an essential resource of health knowledge for individuals around the world in the present situation of the ***** coronavirus ***** condition pandemic (COVID-19). | ||
| 2020.wnut-1.66 The proposed approach includes pre-processing techniques and pre-trained RoBERTa with suitable hyperparameters for English ***** coronavirus ***** tweet classification. | ||
| 2020.nlpcovid19-acl.1 The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical ***** coronavirus ***** research | ||
| unambiguous | 12 | |
| N19-1418 Despite these being ***** unambiguous ***** examples, the model successfully generalizes from them, leading to improved results (both overall, and especially on unseen words) in comparison to a baseline that does not use context. | ||
| L16-1029 These documents are designed to be as ***** unambiguous ***** as possible for their users. | ||
| 2020.lrec-1.439 The idea is to train recurrent neural networks on the output that the morphological analyser produces for ***** unambiguous ***** words. | ||
| L16-1463 One of the important criterions for adding terms to the lexicon, was that they be as ***** unambiguous ***** as possible | ||
| W17-1915 This paper compares two approaches to word sense disambiguation using word embeddings trained on ***** unambiguous ***** synonyms. | ||
| spectral | 12 | |
| P17-1087 We present a new signed ***** spectral ***** normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. | ||
| L10-1489 Using Nakamura's method, we use classical speech recognition features, MFCC, and try to represent the effects of the speaking styles on the ***** spectral ***** space. | ||
| L10-1317 On the other hand, two systems get significantly better results than the rest: one is based on statistical parametric synthesis and the other one is a concatenative system that makes use of a sinusoidal model to modify both prosody and smooth ***** spectral ***** joints. | ||
| 2020.emnlp-main.32 Here ***** spectral ***** impact considers the perturbation to the dominant eigenvalue of affinity matrix when dropping the summary candidate from the document cluster | ||
| 2021.nodalida-main.10 In this paper, we propose ***** spectral ***** modification by sharpening formants and by reducing the spectral tilt to recognize children's speech by automatic speech recognition (ASR) systems developed using adult speech. | ||
| example | 12 | |
| 1999.mtsummit-1.36 Our approach is an ***** example *****-based approach which relies solely on ***** example ***** translations kept in a Bilingual Knowledge Bank (BKB). | ||
| 2021.nodalida-main.25 We demonstrate the method in the Finnish language, outperforming traditional lemmatizers in ***** example ***** task of document similarity comparison, but the approach is language independent and can be trained for new languages with mild requirements. | ||
| W17-3516 Specifically, we focus on text generation of function descriptions from ***** example ***** software projects. | ||
| 2020.emnlp-main.312 The results of the user study show that the proposed agent can find out ***** example ***** sentences that help students learn more easily and efficiently. | ||
| 2021.eacl-main.36 In this paper, we introduce FEWS (Few-shot Examples of Word Senses), a new low-shot WSD dataset automatically extracted from ***** example ***** sentences in Wiktionary | ||
| predictions | 12 | |
| C16-1051 Tests with millions of products show that our first ***** predictions ***** matches 81% of merchants' assignments, when “others” categories are excluded. | ||
| D17-2016 The presented tool features a Web interface for all-word disambiguation of texts that makes the sense ***** predictions ***** human readable by providing interpretable word sense inventories, sense representations, and disambiguation results. | ||
| P19-1621 Although current evaluation of question-answering systems treats ***** predictions ***** in isolation, we need to consider the relationship between ***** predictions ***** to measure true understanding. | ||
| W19-1706 We investigate whether performing speech recognition on the speaking-side of a conversation can improve language model based ***** predictions *****. | ||
| 2021.iwpt-1.23 Our main system component is a hybrid tree-graph parser that integrates (a) ***** predictions ***** of spanning trees for the enhanced graphs with (b) additional graph edges not present in the spanning trees | ||
| unbalanced | 12 | |
| 2020.coling-main.366 We have introduced an extension to the DBSCAN algorithm and presented a density-based clustering algorithm ITER-DBSCAN for ***** unbalanced ***** data clustering. | ||
| 2021.wmt-1.97 We further improve the base classifier by (i) adding a weighted sampler to deal with ***** unbalanced ***** data and (ii) introducing feature engineering, where features related to toxicity, named-entities and sentiment, which are potentially indicative of critical errors, are extracted using existing tools and integrated to the model in different ways. | ||
| 2020.semeval-1.256 The main challenge of data preprocessing is the ***** unbalanced ***** class distribution, abbreviation, and emoji. | ||
| W19-6207 The classifiers are applied to a songtext–artist dataset which is large, ***** unbalanced ***** and noisy. | ||
| 2020.acl-srw.9 We make sure this regularizer increases the quality of topic models, trained on ***** unbalanced ***** collections | ||
| variant | 12 | |
| C16-1207 We use this corpus to develop a part-of-speech tagger and phrase table for the ***** variant ***** of English that is used and a classifier for identifying tweets that express grieving and aggression. | ||
| N18-2068 We assess its relevance with respect to a linguistic benchmark and its utility for the tasks of VMWE classification and ***** variant ***** identification on a French corpus. | ||
| 2021.acl-long.369 We also introduce an attention ***** variant ***** called leaky attention, which alleviates the problem of unexpected high cross-attention weights on special tokens such as periods. | ||
| 2020.mmw-1.4 Using as complementary resources, one dictionary and the sentiment valences of the words, we check if the word of the lexicon matches with the meaning of the synset, and if it matches we add the word as ***** variant ***** to the Basque WordNet. | ||
| W18-4805 A natural language generation environment (ivi/Vinci) embedded in a web environment (VinciLingua) makes it possible to produce, by rule, ***** variant ***** forms of indefinite complexity | ||
| activation | 12 | |
| 2021.emnlp-main.627 We show that transformers have unique quantization challenges – namely, high dynamic ***** activation ***** ranges that are difficult to represent with a low bit fixed-point format. | ||
| 2020.acl-main.573 Inspired by the mechanism in human long-term memory formation, we introduce episodic memory ***** activation ***** and reconsolidation (EMAR) to continual relation learning. | ||
| W16-4102 The composition cost of a sentence depends on the semantic coherence of the event being constructed and on the ***** activation ***** degree of the linguistic constructions. | ||
| 2020.lrec-1.197 As far as we know, most emotional databases contain static annotations in discrete categories or in dimensions such as ***** activation ***** or valence. | ||
| 2021.cl-3.16 The ratio of the ***** activation ***** outside the token and the total ***** activation ***** forms the basis of our measure | ||
| clause | 12 | |
| 2003.mtsummit-papers.29 To develop a speech-to-speech machine translation system for monologues based on the ***** clause ***** as the translation unit, we need a monologue parallel corpus with ***** clause ***** alignment. | ||
| 2020.starsem-1.7 Previous work approached this in three ways, namely (1) as text classification into an inventory of predefined possible stimuli (“Is the stimulus category A or B?”), (2) as sequence labeling of tokens (“Which tokens describe the stimulus?”), and (3) as ***** clause ***** classification (“Does this ***** clause ***** contain the emotion stimulus?”). | ||
| 1963.earlymt-1.34 Equivalently, there appear to be severe restrictions on ***** clause ***** order for any given meaning. | ||
| 2021.cl-4.30 Evaluation with manual methods shows that most of the errors made by Google NMT are located in the ***** clause ***** containing the ellipsis, the frequency of such errors is slightly more in Telugu than Hindi, and the translation adequacy shows improvement when ellipses are reconstructed with their antecedents | ||
| 2020.iwdp-1.9 NT Clause Complex Framework defines a ***** clause ***** complex as a combination of NT clauses through component sharing and logic-semantic relationship. | ||
| untrimmed | 12 | |
| D19-1518 In this paper, we focus on natural language video localization: localizing (ie, grounding) a natural language description in a long and ***** untrimmed ***** video sequence. | ||
| D18-1015 We introduce an effective and efficient method that grounds (i.e., localizes) natural sentences in long, ***** untrimmed ***** video sequences. | ||
| 2021.emnlp-main.327 Given an ***** untrimmed ***** video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by query. | ||
| D19-1157 We propose weakly supervised language localization networks (WSLLN) to detect events in long, ***** untrimmed ***** videos given language queries. | ||
| 2020.coling-main.167 Temporal sentence localization in videos aims to ground the best matched segment in an ***** untrimmed ***** video according to a given sentence query | ||
| Jupyter | 12 | |
| 2021.dash-1.13 Furthermore, CrossCheck is implemented as a ***** Jupyter ***** widget, which allows for rapid and convenient integration into existing model development workflows. | ||
| 2021.teachingnlp-1.8 MiniVQA is a ***** Jupyter ***** notebook to build a tailored VQA competition for your students. | ||
| D19-1546 Interactive programming with interleaved code snippet cells and natural language markdown is recently gaining popularity in the form of ***** Jupyter ***** notebooks, which accelerate prototyping and collaboration. | ||
| 2021.dash-1.10 We incorporated our sieve into an end-to-end system for cleaning NLP corpora, implemented as a modular collection of ***** Jupyter ***** notebooks built on extensions to the Pandas DataFrame library. | ||
| D19-3043 It supports multimodal sources and multiple text references, providing visualization in ***** Jupyter ***** notebook or a web app interface | ||
| nonverbal | 12 | |
| L16-1550 We expect this corpus to be a useful resource for researchers interested in natural language generation, intelligent virtual agents, generation of ***** nonverbal ***** behavior, and story and narrative representations. | ||
| W18-6247 Most existing works have investigated hirability from the perspective of ***** nonverbal ***** behavior, with verbal content receiving little interest. | ||
| L12-1044 This paper describes how the ISO 24617-2 annotation scheme can be used, together with the DIT++ method of multidimensional segmentation', to annotate ***** nonverbal ***** and multimodal dialogue behaviour. | ||
| L14-1177 The qualitative analysis of ***** nonverbal ***** communication is more and more relying on 3D recording technology. | ||
| L06-1278 There has been a lot of psychological researches on emotion and ***** nonverbal ***** communication | ||
| analytical | 12 | |
| 2002.jeptalnrecital-poster.13 This ATG has been developed to work with an automatic item generation system for ***** analytical ***** reasoning items for use in tests with high-stakes outcomes (such as college admissions decisions). | ||
| 2020.coling-main.342 We further study their corresponding functions through ***** analytical ***** study. | ||
| 2021.dash-1.3 Information visualization is critical to ***** analytical ***** reasoning and knowledge discovery. | ||
| 1984.bcs-1.22 For purposes of computational linguistics, this paper makes these analogies precise (on quantitative ***** analytical ***** basis), with emphasis on discrete recursive generation of larger structures, and equivalents of coding and decoding for machine translation process. | ||
| 2021.ranlp-1.183 This paper proposes AutoChart, a large dataset for the ***** analytical ***** description of charts, which aims to encourage more research into this important area | ||
| superficial | 12 | |
| 2021.acl-long.317 While sophisticated neural-based models have achieved remarkable success in Visual Question Answering (VQA), these models tend to answer questions only according to ***** superficial ***** correlations between question and answer. | ||
| 2021.naacl-main.304 Here, we propose to explicitly learn a model that does well on both the easy test set with ***** superficial ***** cues and the hard test set without ***** superficial ***** cues. | ||
| 2020.emnlp-main.616 We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically ***** superficial ***** word co-occurrences, rather than a deeper understanding of the source text. | ||
| D19-6115 Statistical natural language inference (NLI) models are susceptible to learning dataset bias: ***** superficial ***** cues that happen to associate with the label on a particular dataset, but are not useful in general, e.g., negation words indicate contradiction. | ||
| S19-2157 In the effort to tackle the challenge of Hyperpartisan News Detection, i.e., the task of deciding whether a news article is biased towards one party, faction, cause, or person, we experimented with two systems: i) a standard supervised learning approach using ***** superficial ***** text and bag-of-words features from the article title and body, and ii) a deep learning system comprising a four-layer convolutional neural network and max-pooling layers after the embedding layer, feeding the consolidated features to a bi-directional recurrent neural network. | ||
| nominal | 12 | |
| P19-1077 One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from ***** nominal ***** chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation. | ||
| L16-1325 We present the first version of an operational rule-based MAC resolution strategy for patent material that covers the three major types of MAC: (i) ***** nominal ***** MAC, (ii) MAC with personal / relative pronouns, and (iii) MAC with reflexive / reciprocal pronouns. | ||
| L08-1558 These systems do not generalize to cope with compound ***** nominal ***** classes of multi word expressions. | ||
| L12-1063 However, if verbal designations of events are well studied and easier to detect than ***** nominal ***** ones, ***** nominal ***** designations do not claim as much definition effort and resources. | ||
| W17-4913 The results suggest that the style of German literary studies is characterized by ***** nominal ***** structures and the style of linguistics by verbal ones | ||
| edge | 12 | |
| 2021.emnlp-main.665 We present two graph algorithms for ***** edge ***** prediction: one inspired by recommender systems and one based on network link prediction. | ||
| 2020.lrec-1.244 We analyze differences between five existing gold standard corpora, create a standardized benchmark corpus, and provide a strong baseline model for ***** edge ***** detection. | ||
| 2021.naacl-main.468 Our results indicate that retrieve-and-read can be a viable option even in a highly constrained serving environment such as ***** edge ***** devices, as we show that it can achieve better accuracy than a purely parametric model with comparable docker-level system size. | ||
| 2020.findings-emnlp.250 However, they consume a lot of memory which poses a challenge for ***** edge ***** deployment | ||
| 2021.eacl-main.212 Significant memory and computational requirements of large deep neural networks restrict their application on ***** edge ***** devices. | ||
| argument extraction | 12 | |
| 2020.findings-emnlp.318 Event ***** argument extraction ***** (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. | ||
| 2021.naacl-main.356 We propose a neural event coreference model in which event coreference is jointly trained with five tasks: trigger detection, entity coreference, anaphoricity determination, realis detection, and ***** argument extraction *****. | ||
| 2020.emnlp-main.128 ii) Our model is excelled in the data-scarce scenario, for example, obtaining 49.8% in F1 for event ***** argument extraction ***** with only 1% data, compared with 2.2% of the previous method. | ||
| 2021.emnlp-main.214 Implicit event ***** argument extraction ***** (EAE) is a crucial document-level information extraction task that aims to identify event arguments beyond the sentence level. | ||
| 2020.findings-emnlp.99 Our model is a sequence-labeling system with an efficient and effective ***** argument extraction ***** method. | ||
| similarity metrics | 12 | |
| S17-2024 We experimented with various dimensions of the vector and three state-of-the-art ***** similarity metrics *****. | ||
| 2003.mtsummit-papers.42 For best results, our implementation uses singular value decomposition, entropy-based weights, and second-order ***** similarity metrics *****. | ||
| C18-1058 These variations arise in complex ways that cannot be captured using textual ***** similarity metrics *****. | ||
| D18-1329 However, the evidence surrounding whether the images are useful is unconvincing due to inconsistencies between text-***** similarity metrics ***** and human judgements. | ||
| 2020.eval4nlp-1.6 In this paper, we present two techniques for improving encoding representations for ***** similarity metrics *****: a batch-mean centering strategy that improves statistical properties; and a computationally efficient tempered Word Mover Distance, for better fusion of the information in the contextualized word representations. | ||
| syntactic patterns | 12 | |
| 1998.amta-papers.36 It may in fact be superior to other approaches in that it can handle target surface-structure constraints, variation of ***** syntactic patterns *****, discourse-structure constraints, and stylistic preference. | ||
| L14-1052 This paper is a partial report of an on-going Kakenhi project which aims to improve sub-sentential alignment and release multilingual ***** syntactic patterns ***** for statistical and example-based machine translation. | ||
| L14-1451 Overall, our classifier successfully identifies very specific and not highly frequent lexical items such as complex-types with high accuracy, and distinguishes them from those instances that are not complex types by using lexico-***** syntactic patterns ***** indicative of the semantic classes corresponding to each of the individual sense components of the complex type. | ||
| C16-1130 In this paper, we study WSD with a sequence learning neural net, LSTM, to better capture the sequential and ***** syntactic patterns ***** of the text. | ||
| 2021.ranlp-1.140 Second, the transformer-based models BERT, XLM-RoBERTa, and M-BERT, known for their ability to capture semantic and ***** syntactic patterns ***** in the same representation. | ||
| spoken dialogues | 12 | |
| I17-1025 Unsupervised segmentation of phoneme sequences is an essential process to obtain unknown words during ***** spoken dialogues *****. | ||
| N19-2004 Data for human-human ***** spoken dialogues ***** for research and development are currently very limited in quantity, variety, and sources; such data are even scarcer in healthcare. | ||
| N19-2013 Successful contextual understanding of multi-turn ***** spoken dialogues ***** requires resolving referring expressions across turns and tracking the entities relevant to the conversation across turns. | ||
| L10-1176 In ***** spoken dialogues *****, if a spoken dialogue system does not respond at all during users utterances, the user might feel uneasy because the user does not know whether or not the system has recognized the utterances. | ||
| L14-1062 This paper presents the first release of the KiezDeutsch Korpus (KiDKo), a new language resource with multiparty ***** spoken dialogues ***** of Kiezdeutsch, a newly emerging language variety spoken by adolescents from multiethnic urban areas in Germany. | ||
| noisy labels | 12 | |
| 2020.coling-main.507 To ensure high-quality data, it is crucial to infer the correct labels by aggregating the ***** noisy labels *****. | ||
| P17-2077 Finally, a new simulation method is designed for validating the effectiveness of the proposed framework in aggregating ***** noisy labels *****. | ||
| D18-1230 After identifying the nature of ***** noisy labels ***** in distant supervision, we go beyond the traditional framework and propose a novel, more effective neural model AutoNER with a new Tie or Break scheme. | ||
| K19-1065 To denoise the ***** noisy labels *****, we apply a recently proposed deep probabilistic logic learning framework to incorporate both sentence-level and cross-sentence linguistic indicators for indirect supervision. | ||
| D17-1005 These annotations, referred as heterogeneous supervision, often conflict with each other, which brings a new challenge to the original relation extraction task: how to infer the true label from ***** noisy labels ***** for a given instance. | ||
| generating text | 12 | |
| 2021.naacl-main.276 Given a pre-existing model G for ***** generating text ***** from a distribution of interest, FUDGE enables conditioning on a desired attribute a (for example, formality) while requiring access only to G's output logits. | ||
| 2016.gwc-1.36 This can be a problem when ***** generating text ***** from input that does not specify the classifier, as in machine translation (MT) from English to Chinese. | ||
| 2020.emnlp-main.93 In this paper, we tackle this problem by using infilling techniques involving prediction of missing steps in a narrative while ***** generating text *****ual descriptions from a sequence of images. | ||
| D18-1436 In this paper, we introduce the task of automatically ***** generating text ***** to describe the differences between two similar images. | ||
| W19-8652 ***** generating text ***** which is unrelated to the input specification. | ||
| dependency representations | 12 | |
| L12-1412 State-of-the-art ***** dependency representations ***** such as the Stanford Typed Dependencies may represent the grammatical relations in a sentence as directed, possibly cyclic graphs. | ||
| W18-6005 On the other hand, the effects of written training data addition and speech-specific ***** dependency representations ***** largely depend on the parsing system selected. | ||
| N19-1020 The problem of right-adjunction is more resistant to solution, and has been tackled in the past using revealing-based approaches that often rely either on the higher-order unification over lambda terms (Pareschi and Steedman,1987) or heuristics over ***** dependency representations ***** that do not cover the whole CCGbank (Ambati et al., 2015). | ||
| W17-4805 In addition we enrich our system with ***** dependency representations ***** from an external parser and character representations of the source sentence. | ||
| 2020.findings-emnlp.398 In this paper, we propose a novel joint model of syntactic and semantic parsing on both span and ***** dependency representations *****, which incorporates syntactic information effectively in the encoder of neural network and benefits from two representation formalisms in a uniform way. | ||
| bidirectional recurrent neural | 12 | |
| C18-1009 In this paper, we propose new Japanese PAS analysis models that integrate the label prediction information of arguments in multiple PASs by extending the input and last layers of a standard deep ***** bidirectional recurrent neural ***** network (bi-RNN) model. | ||
| D18-1245 To alleviate this issue, we propose a novel multi-level structured (2-D matrix) self-attention mechanism for DS-RE in a multi-instance learning (MIL) framework using ***** bidirectional recurrent neural ***** networks (BiRNN). | ||
| E17-1063 Our model which we call DENSE (as shorthand for Dependency Neural Selection) produces a distribution over possible heads for each word using features obtained from a ***** bidirectional recurrent neural ***** network. | ||
| K19-1082 We examine the extent to which deep learning methods support automatic detection and identification of slang from natural sentences using a combination of ***** bidirectional recurrent neural ***** networks, conditional random field, and multilayer perceptron. | ||
| R19-1151 We compare wide range of methods including machine learning on bag-of-words representation, ***** bidirectional recurrent neural ***** networks as well as the most recent pre-trained architectures ELMO and BERT. | ||
| natural questions | 12 | |
| 2021.repl4nlp-1.24 Specially, neural semantic parsers (NSPs) effectively translate ***** natural questions ***** to logical forms, which execute on KB and give desirable answers. | ||
| D19-5803 The collect of ***** natural questions ***** is reduced to a validation/test set. | ||
| 2021.wnut-1.25 The main challenges of current QP models include lack of training data and difficulty in generating diverse and ***** natural questions *****. | ||
| 2020.emnlp-main.128 Our approach includes an unsupervised question generation process, which can transfer event schema into a set of ***** natural questions *****, followed by a BERT-based question-answering process to retrieve answers as EE results. | ||
| D18-1434 Generating ***** natural questions ***** from an image is a semantic task that requires using visual and language modality to learn multimodal representations. | ||
| author profiling | 12 | |
| 2020.bucc-1.3 This latter information can also be useful for ***** author profiling ***** tasks. | ||
| C18-1154 We evaluate the proposed model on three different tasks: natural language inference (NLI), ***** author profiling *****, and sentiment classification. | ||
| L14-1030 Over the last years, ***** author profiling ***** in general and author gender identification in particular have become a popular research area due to their potential attractive applications that range from forensic investigations to online marketing studies. | ||
| 2020.lrec-1.151 The plateform allowed the creation of the largest Algerian dialect dataset annotated for both sentiment (9,000 tweets), emotion (about 5,000 tweets) and extra-linguistic information including ***** author profiling ***** (age and gender). | ||
| L16-1541 However, to the best of our knowledge there is no such resource for ***** author profiling ***** of health forum data. | ||
| semantic spaces | 12 | |
| S18-1153 We experiment with state-of-the-art ***** semantic spaces ***** and with simple co-occurrence statistics. | ||
| L10-1206 This study examines the relationship between two kinds of ***** semantic spaces ***** ― i.e., spaces based on term frequency (tf) and word cooccurrence frequency (co) ― and four semantic relations ― i.e., synonymy, coordination, superordination, and collocation ― by comparing, for each semantic relation, the performance of two ***** semantic spaces ***** in predicting word association. | ||
| W19-6106 We show that starting from simple lists of word pairs (rather than a list of entities with directional links) it is possible to build diachronic hierarchical ***** semantic spaces ***** which allow us to model a process towards specialization for selected scientific fields. | ||
| 2020.acl-main.150 The interlingual network enables the model to learn a language-independent representation from the ***** semantic spaces ***** of different languages, while still allowing for language-specific specialization of a particular language-pair. | ||
| Q16-1020 We ground word embeddings in ***** semantic spaces ***** studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. | ||
| constraint satisfaction | 12 | |
| U18-1004 In this paper we propose an extensible and efficient framework for inducing relations via the use of ***** constraint satisfaction *****. | ||
| 2011.freeopmt-1.3 Extensible Dependency Grammar (XDG; Debusmann, 2007) is a flexible, modular dependency grammar framework in which sentence analyses consist of multigraphs and processing takes the form of ***** constraint satisfaction *****. | ||
| 2021.acl-long.9 The MF models can be trained to generate tokens in a hypothesis until all constraints are satisfied, guaranteeing high ***** constraint satisfaction *****. | ||
| W16-4120 This previous work allows us to compare the surprisal measure, which is based on ***** constraint satisfaction ***** theories of language modeling, to those previously used measures, which are more directly linked to empirical observations of processing complexity. | ||
| 2020.emnlp-demos.10 This work presents CoSaTa, an intuitive ***** constraint satisfaction ***** solver and interpreted language for knowledge bases of semi-structured tables expressed as text. | ||
| language learner | 12 | |
| P19-3034 We introduce a system aimed at improving and expanding second ***** language learner *****s' English vocabulary. | ||
| 2020.lrec-1.34 Accordingly, we also report on on-going proof-of-concept efforts aiming at developing the first prototypical implementation of the approach in order to correct and extend an LR called ConceptNet based on the input crowdsourced from ***** language learner *****s. | ||
| 2020.emnlp-main.312 Many English-as-a-second ***** language learner *****s have trouble using near-synonym words (e.g., small vs.little; briefly vs.shortly) correctly, and often look for example sentences to learn how two nearly synonymous terms differ. | ||
| W19-4407 To address this, we present and release an annotated data set of 6,121 spelling errors in context, based on a corpus of essays written by English ***** language learner *****s. | ||
| 2020.inlg-1.31 Japanese sentence-ending predicates intricately combine content words and functional elements, such as aspect, modality, and honorifics; this can often hinder the understanding of ***** language learner *****s and children. | ||
| user feedback | 12 | |
| 2020.emnlp-main.568 Aspect-based sentiment analysis of review texts is of great value for understanding ***** user feedback ***** in a fine-grained manner. | ||
| P18-2052 We demonstrate how the common machine translation problem of domain mismatch between training and deployment can be reduced solely based on chunk-level ***** user feedback *****. | ||
| L06-1023 This paper presents the main features of the computer-aided summarisation environment and explains the changes introduced to it as a result of ***** user feedback *****. | ||
| W19-1601 Data analysis shows that multimodality is key to successful interaction, measured both quantitatively and qualitatively via ***** user feedback *****. | ||
| 2020.emnlp-main.191 Specifically, we split the document into clause-like elementary discourse units (EDU) using a pre-trained discourse segmentation model, and we train our model in a weakly-supervised manner to predict whether each EDU is entailed by the ***** user feedback ***** in a conversation. | ||
| representations of text | 12 | |
| C18-1315 We study the problem of grounding distributional ***** representations of text *****s on the visual domain, namely visual-semantic embeddings (VSE for short). | ||
| P17-1032 Kernel methods enable the direct usage of structured ***** representations of text *****ual data during language learning and inference tasks. | ||
| D19-1112 Learning general ***** representations of text ***** is a fundamental problem for many natural language understanding (NLU) tasks. | ||
| P18-1216 Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the ***** representations of text ***** sequences. | ||
| D17-1114 For vision information, we learn joint ***** representations of text *****s and images using a neural network. | ||
| sequence labelling | 12 | |
| 2021.emnlp-main.93 We propose the first model for the resulting graph extension problem based on autoregressive ***** sequence labelling *****. | ||
| P19-1529 We evaluate three different ways of encoding syntactic parses and three different ways of injecting them into a state-of-the-art neural ELMo-based SRL ***** sequence labelling ***** model. | ||
| W19-5408 We start with a model similar to Shef-bRNN, which we modify by using conditional random fields for ***** sequence labelling *****. | ||
| W18-2404 We propose that an attention-based RNN architecture can be used to simulate semantic priming for ***** sequence labelling *****. | ||
| 2020.acl-main.189 We apply LEAQI to three ***** sequence labelling ***** tasks, demonstrating significantly fewer queries to the expert and comparable (or better) accuracies over a passive approach. | ||
| limited data | 12 | |
| P18-4021 The library prioritises efficiency, modularity, and extensibility with the goal to make it easier to develop dialogue systems from scratch and with ***** limited data ***** available. | ||
| 2020.acl-main.343 Previous studies in multimodal sentiment analysis have used ***** limited data *****sets, which only contain unified multimodal annotations. | ||
| 2020.sigmorphon-1.1 Non-neural learners and manually designed grammars showed competitive and even superior performance on some languages (such as Ingrian, Tajik, Tagalog, Zarma, Lingala), especially with very ***** limited data *****. | ||
| W17-2710 To maximize our use of ***** limited data *****, we reverse the typical schema induction steps and introduce new similarity measures, building an intuitive process for inducing the structure of unknown events. | ||
| 2021.acl-long.14 Under different ***** limited data ***** settings, both automatic and human evaluations demonstrate that the proposed model outperforms strong baselines in response quality and persona consistency. | ||
| language sentences | 12 | |
| 2012.amta-monomt.1 In detail, this approach first translates into Spanish simplified forms and then predicts the final inflected forms through a morphology generation step based on shallow and deep-projected linguistic information available from both the source and target-***** language sentences *****. | ||
| P19-1581 Using all five of the proposed target language words as queries we mine target-***** language sentences *****. | ||
| R19-1107 One of these methods is back-translation, which consists on generating synthetic sentences by translating a set of monolingual, target-***** language sentences ***** using a Machine Translation (MT) model. | ||
| 2020.acl-main.22 Paraphrasing natural ***** language sentences ***** is a multifaceted process: it might involve replacing individual words or short phrases, local rearrangement of content, or high-level restructuring like topicalization or passivization. | ||
| C18-2005 The increased demand for structured knowledge has created considerable interest in knowledge extraction from natural ***** language sentences *****. | ||
| issues | 12 | |
| L12-1590 This paper addresses theoretical and practical ***** issues ***** experienced in the construction of Turkish National Corpus (TNC). | ||
| 2001.mtsummit-papers.2 Section 5 addresses generation ***** issues ***** outside of MT. | ||
| 2020.emnlp-main.40 These reproducibility ***** issues ***** are also present for other tasks with different pre-trained embeddings (e.g., MLQA with XLM-R). | ||
| D17-1174 One of the most pressing ***** issues ***** in discontinuous constituency transition-based parsing is that the relevant information for parsing decisions could be located in any part of the stack or the buffer. | ||
| D18-1513 Although there is some correlation with the human judgements, a range of ***** issues ***** limit the performance of the automated metrics. | ||
| entity relation extraction | 12 | |
| P19-1131 We develop a new paradigm for the task of joint ***** entity relation extraction *****. | ||
| 2021.emnlp-main.218 In this paper, we adapt the popular dependency parsing model, the biaffine parser, to this ***** entity relation extraction ***** task. | ||
| 2020.emnlp-main.132 In this paper, we integrate span-related information into pre-trained encoder for ***** entity relation extraction ***** task. | ||
| D18-1249 We investigate the task of joint ***** entity relation extraction *****. | ||
| 2021.acl-long.19 Many joint ***** entity relation extraction ***** models setup two separated label spaces for the two sub-tasks (i.e., entity detection and relation classification). | ||
| grammatical gender | 12 | |
| 2020.emnlp-main.456 A ***** grammatical gender ***** system divides a lexicon into a small number of relatively fixed grammatical categories. | ||
| 2020.acl-main.597 We know that form and meaning are often also indicative of ***** grammatical gender *****—which, as we quantitatively verify, can itself share information with declension class—so we also control for gender. | ||
| W19-3622 Many natural languages assign ***** grammatical gender ***** also to inanimate nouns in the language. | ||
| 2020.acl-main.690 In Neural Machine Translation (NMT) gender bias has been shown to reduce translation quality, particularly when the target language has ***** grammatical gender *****. | ||
| K19-1043 Many natural languages assign ***** grammatical gender ***** also to inanimate nouns in the language. | ||
| correct answer | 12 | |
| 2020.emnlp-main.189 However, they sometimes result in predicting the ***** correct answer ***** text but in a context irrelevant to the given question. | ||
| K17-1009 Here we propose an approach that uses answer ranking as distant supervision for learning how to select informative justifications, where justifications serve as inferential connections between the question and the ***** correct answer ***** while often containing little lexical overlap with either. | ||
| D17-1215 Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the ***** correct answer ***** or misleading humans. | ||
| 2020.coling-main.222 CommonsenseQA is a task in which a ***** correct answer ***** is predicted through commonsense reasoning with pre-defined knowledge. | ||
| 2020.scai-1.2 The dependency between an adequate question formulation and ***** correct answer ***** selection is a very intriguing but still underexplored area. | ||
| engineering | 12 | |
| 2005.mtsummit-swtmt.1 The bottleneck has been the ***** engineering ***** of sufficiently comprehensive bodies of relevant knowledge The Semantic Web offers opportunities for the gradual evolution of a global heterogeneous knowledge base. | ||
| 2021.eacl-main.55 Non-neural approaches to argument mining (AM) are often pipelined and require heavy feature-***** engineering *****. | ||
| L12-1406 The annotation tool is implemented as a component of the Ellogon language ***** engineering ***** platform, exploiting its extensive annotation engine, its cross-platform abilities and its linguistic processing components, if such a need arises. | ||
| W18-0540 We developed solutions following three approaches: (i) a feature ***** engineering ***** method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large text corpus to produce a contextualized word vector. | ||
| 2020.coling-main.535 Conventional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks (DNNs) to obviate the need for feature ***** engineering *****. | ||
| automatically detecting | 12 | |
| 2005.jeptalnrecital-recitalcourt.12 In this paper we present an experiment that explores the possibility of ***** automatically detecting ***** the emerging textual patterns that are slowly taking shape on the Web. | ||
| 2020.acl-main.403 Beyond research in linguistics and political communication, accurately and ***** automatically detecting ***** parody is important to improving fact checking for journalists and analytics such as sentiment analysis through filtering out parodical utterances. | ||
| 2021.wanlp-1.34 Dialect identification is the task of ***** automatically detecting ***** the source variety of a given text or speech segment. | ||
| N19-1219 In this work, we study the content and structure of peer reviews under the argument mining framework, through ***** automatically detecting ***** (1) the argumentative propositions put forward by reviewers, and (2) their types (e.g., evaluating the work or making suggestions for improvement). | ||
| 2021.acl-short.101 We benchmark the novel task of ***** automatically detecting ***** those needs on short posts in English, by modelling it as a ternary classification task, and as three binary classification tasks. | ||
| language variation | 12 | |
| W17-1202 In this work, we propose an information-theoretic approach to geographic ***** language variation ***** using a corpus based on Twitter. | ||
| 2021.acl-long.341 RADDLE also includes a diagnostic checklist that facilitates detailed robustness analysis in aspects such as ***** language variation *****s, speech errors, unseen entities, and out-of-domain utterances. | ||
| P16-5005 We describe a recent trend in NMT, that is to translate at the sub-word level (Chung et al., 2016; Luong and Manning, 2016; Sennrich et al., 2016), so that ***** language variation *****s can be effectively handled. | ||
| 2020.lrec-1.434 The dataset consist of 4,262 unique sentences with average length of 10 words, illustrating 15 types of modifications such as simplification, generalization, or formal and informal ***** language variation *****. | ||
| Q17-1021 In this paper, we show how to exploit social networks to make sentiment analysis more robust to social ***** language variation *****. | ||
| support | 12 | |
| 2021.naacl-main.258 We provide ***** support *****ive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. | ||
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: improving training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and ***** support *****ing systematic counterfactual error analysis by revealing behaviors easily missed by human experts. | ||
| L10-1105 We describe our computer-***** support *****ed framework to overcome the rule of metadata schism. | ||
| 2020.winlp-1.17 In the following, we present a system for assisted typing in LS whose accuracy and speed is largely due to the deployment of real time natural-language processing enabling efficient prediction and context-sensitive grammar ***** support *****. | ||
| L08-1262 We present an experiment in extracting collocations from the FrameNet corpus, specifically, ***** support ***** verbs such as direct in Environmentalists directed strong criticism at world leaders. | ||
| political ideology | 12 | |
| 2020.emnlp-main.620 We suggest to break the broad policy frames suggested by Boydstun et al., 2014 into fine-grained subframes which can capture differences in ***** political ideology ***** in a better way. | ||
| 2020.coling-main.428 Finally, as a way to mitigate the bias, we propose to learn a text representation that is invariant to ***** political ideology ***** while still judging topic relevance. | ||
| 2021.eacl-main.152 In this paper, we challenge the assumption that ***** political ideology ***** is inherently built into text by presenting an investigation into the impact of experiential factors on annotator perceptions of ***** political ideology *****. | ||
| P17-1068 This study examines users' ***** political ideology ***** using a seven-point scale which enables us to identify politically moderate and neutral users – groups which are of particular interest to political scientists and pollsters. | ||
| N19-1216 In the context of fake news, bias, and propaganda, we study two important but relatively under-explored problems: (i) trustworthiness estimation (on a 3-point scale) and (ii) *****political ideology***** detection (left/right bias on a 7-point scale) of entire news outlets, as opposed to evaluating individual articles. | ||
| hierarchical attention | 12 | |
| N18-2098 To this end, we propose a mixed ***** hierarchical attention ***** based encoder-decoder model which is able to leverage the structure in addition to the content of the tables. | ||
| D19-1045 In this work, we propose a ***** hierarchical attention ***** prototypical networks (HAPN) for few-shot text classification. | ||
| 2021.acl-srw.9 For spatial features, we propose a ***** hierarchical attention ***** network to model the spatial information from object-level to video-level. | ||
| S18-1042 Our system consists of three main modules: preprocessing module, stacking module to solve the intensity prediction of emotion and sentiment, LSTM network module to solve multi-label classification, and the ***** hierarchical attention ***** network module for solving emotion and sentiment classification problem. | ||
| K18-1018 Therefore, we propose a ***** hierarchical attention ***** based position-aware network (HAPN), which introduces position embeddings to learn the position-aware representations of sentences and further generate the target-specific representations of contextual words. | ||
| linguistically motivated | 12 | |
| 2020.iwpt-1.15 Experiments in this work using a labeled evaluation metric, RH, show that ***** linguistically motivated ***** predictions about grammar sparsity and use of categories can only be revealed through labeled evaluation. | ||
| L10-1413 In this paper we use statistical machine translation and morphology information from two different morphological analyzers to try to improve translation quality by ***** linguistically motivated ***** segmentation. | ||
| E17-2099 We explore combinations of ***** linguistically motivated ***** approaches to address these problems in English-to-German SMT and show that they are complementary to one another, but also that the popular verbal pre-ordering can cause problems on the morphological and lexical level. | ||
| P17-1005 The induced predicate-argument structures shed light on the types of representations useful for semantic parsing and how these are different from ***** linguistically motivated ***** ones. | ||
| W18-2901 This work introduces a novel, *****linguistically motivated***** architecture for composing morphemes to derive word embeddings. | ||
| discourse annotation | 12 | |
| R17-1054 The naive approach to annotation projection is not effective to project ***** discourse annotation *****s from one language to another because implicit relations are often changed to explicit ones and vice-versa in the translation. | ||
| 2020.lrec-1.854 Apart from the project it was originally designed for, in which hundreds of texts were annotated by three annotators, TIARA has already been adopted by a second ***** discourse annotation ***** study, which uses it in the teaching of argumentation. | ||
| L16-1165 In this paper, we describe our work on annotating two spoken domains from the SPICE Ireland corpus (telephone conversations and broadcast interviews) according to different ***** discourse annotation ***** schemes, PDTB 3.0 and CCR. | ||
| W19-4014 We present a *****discourse annotation***** study, in which an annotation method based on Questions under Discussion (QuD) is applied to Italian data. | ||
| W17-0803 Traditional *****discourse annotation***** tasks are considered costly and time-consuming, and the reliability and validity of these tasks is in question. | ||
| small data | 12 | |
| N19-3007 We demonstrate the competitiveness of BoC by comparing with methods of higher complexity, and explore its effectiveness on this ***** small data *****set. | ||
| 2021.acl-long.241 In direct comparison on ***** small data *****bases, our approach increases overall answer accuracy from 85% to 90%. | ||
| S18-1021 The major issue was to apply a state-of-the-art system despite the ***** small data *****set provided: the system would quickly overfit. | ||
| 2020.wanlp-1.33 Unfortunately, deep learning training on ***** small data ***** sets is not the best option because most of the time traditional machine learning algorithms could get better scores. | ||
| 2021.clpsych-1.17 We compare and contrast different types of linguistic features as well as different classification algorithms and explore the limitations of applying these techniques on a ***** small data *****set. | ||
| text complexity | 12 | |
| L12-1172 We examined several factors of ***** text complexity ***** (average sentence length, Automated Readability Index, sentence complexity and passive voice) in the 20th century for two main English language varieties - British and American, using the `Brown family' of corpora. | ||
| 2020.lrec-1.887 The task of automatic assessment of conceptual ***** text complexity *****, important for maintaining reader's interest and text adaptation for struggling readers, has only been proposed recently. | ||
| 2020.readi-1.2 We evaluate the resulting corpus using a set of features that has proven to predict ***** text complexity ***** of Swedish texts. | ||
| 2020.lrec-1.177 Recently, we introduced the task of automatic assessment of conceptual ***** text complexity *****, proposing a set of graph-based deep semantic features using DBpedia as a proxy to human knowledge. | ||
| 2021.semeval-1.90 Beyond encoding out-of-context information about the lemma, we implemented features based on pre-trained language models to model the target word's in-con***** text complexity *****. | ||
| scientific paper | 12 | |
| D19-1236 The review and selection process for ***** scientific paper ***** publication is essential for the quality of scholarly publications in a scientific field. | ||
| D19-5220 This year we participate in ***** scientific paper ***** tasks and focus on the language pair between English and Japanese. | ||
| 2020.coling-main.468 The experimental results demonstrate that our model not only substantially achieves state-of-the-art results on CNN/DM and NYT datasets but also considerably outperforms existing approaches on ***** scientific paper ***** datasets consisting of much longer documents, indicating its better robustness in document genres and lengths. | ||
| 2021.acl-long.115 In summary, our contributions are (1) a new dataset for numerical table-to-text generation using pairs of a table and a paragraph of a table description with richer inference from ***** scientific paper *****s, and (2) a table-to-text generation framework enriched with numerical reasoning. | ||
| L16-1350 It consists of a Japanese-English ***** scientific paper ***** abstract corpus of approximately 3 million parallel sentences (ASPEC-JE) and a Chinese-Japanese ***** scientific paper ***** excerpt corpus of approximately 0.68 million parallel sentences (ASPEC-JC). | ||
| position | 12 | |
| W18-4106 Sentences with presup***** position *****s are often treated as uninterpretable or unvalued (neither true nor false) if their presup***** position *****s are not satisfied. | ||
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-com***** position *****ality and lexico-syntactic fixedness. | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual com***** position *****), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and com***** position *****al operations) of the precise meaning components responsible for Levin's classification. | ||
| 2021.emnlp-main.237 Relative position embedding (RPE) is a successful method to explicitly and efficaciously encode *****position***** information into Transformer models. | ||
| 2020.lrec-1.643 That is, our approach addresses the parsing of dependency trees with a sequence model implemented with a bidirectional LSTM over BERT embeddings, where the tag to be predicted at each token position is the relative *****position***** of the corresponding head. | ||
| sentence representation | 12 | |
| S19-2048 Our model extends the Recurrent Convolutional Neural Network (RCNN) by using external fine-tuned word representations and DeepMoji ***** sentence representation *****s. | ||
| 2021.nlp4convai-1.18 In this work, we aim to construct a robust ***** sentence representation ***** learning model, that is specifically designed for dialogue response generation, with Transformer-based encoder-decoder structure. | ||
| 2020.emnlp-main.225 We find that (i) sentence positional encoding can lead to a large improvement for identifying discourse elements; (ii) a structural relative positional encoding of sentences shows to be most effective; (iii) inter-sentence attention vectors are useful as a kind of ***** sentence representation *****s for identifying discourse elements. | ||
| N19-1351 In addition, we show that even though ***** sentence representation ***** learning through prediction of discourse marker yields state of the art results across different transfer tasks, it's not clear that our models made use of the semantic relation between sentences, thus leaving room for further improvements. | ||
| 2020.aacl-main.9 In this paper, we propose a ***** sentence representation ***** approximating oriented distillation framework that can distill the pre-trained BERT into a simple LSTM based model without specifying tasks. | ||
| hierarchical neural | 12 | |
| 2020.findings-emnlp.18 Syntax has been shown useful for various NLP tasks, while existing work mostly encodes singleton syntactic tree using one ***** hierarchical neural ***** network. | ||
| D17-1219 We propose a ***** hierarchical neural ***** sentence-level sequence tagging model for this task, which existing approaches to question generation have ignored. | ||
| W19-4208 We approach this task with a ***** hierarchical attention ***** conditional random field (CRF) model which predicts each coarse-grained feature (e.g. | ||
| C18-1149 In this paper, we develop a multi-attention-based neural network (MANN) with well-designed optimizations, like Highway Network, and concatenated features with embedding representations into the ***** hierarchical neural ***** network model. | ||
| K19-1093 In particular, we construct a ***** hierarchical neural ***** network that leverages valuable information from a person's past expressions, and offer a better understanding of the sentiment from the expresser's perspective. | ||
| single language | 12 | |
| L10-1119 These issues are of a more cultural nature, and may even come into play when several documents in a ***** single language ***** are involved. | ||
| D19-1450 Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a ***** single language ***** but also across different languages. | ||
| 2018.iwslt-1.8 The parameter transfer mechanism is evaluated in two scenarios: i) to adapt a trained ***** single language ***** NMT system to work with a new language pair and ii) to continuously add new language pairs to grow to a multilingual NMT system. | ||
| C16-1294 Original experiments justifying the design of HTER, as opposed to other possible formulations, were limited to a small sample of translations and a ***** single language ***** pair, however, and this motivates our re-evaluation of a range of human-targeted metrics on a substantially larger scale. | ||
| E17-1083 A well-established technique for automatically extracting paraphrases leverages bilingual corpora to find meaning-equivalent phrases in a ***** single language ***** by “pivoting” over a shared translation in another language. | ||
| translating natural language | 12 | |
| P18-5006 Semantic parsing, the study of ***** translating natural language ***** utterances into machine-executable programs, is a well-established research area and has applications in question answering, instruction following, voice assistants, and code generation. | ||
| 2021.naacl-main.219 Semantic parsing aims at ***** translating natural language ***** (NL) utterances onto machine-interpretable programs, which can be executed against a real-world environment. | ||
| L10-1502 Automatically ***** translating natural language ***** into machine-readable instructions is one of major interesting and challenging tasks in Natural Language (NL) Processing. | ||
| 2020.coling-main.226 Semantic parsing is the task of ***** translating natural language ***** utterances into machine-readable meaning representations. | ||
| Q14-1042 Semantic parsing is the task of *****translating natural language***** utterances into a machine-interpretable meaning representation. | ||
| lexicalized grammar | 12 | |
| P17-1195 We chose a hybrid approach combining a shallow syntactic analyzer and a manually-developed ***** lexicalized grammar *****. | ||
| 1997.iwpt-1.22 In previous work we introduced the idea of supertagging as a means of improving the efficiency of a ***** lexicalized grammar ***** parser. | ||
| P17-1122 Our evaluation shows that generation with Relational-Realizational (Tsarfaty and Sima'an, 2008) inspired grammar gets better language model scores than ***** lexicalized grammar *****s `a la Collins (2003), and that the latter gets better human-evaluation scores. | ||
| L08-1321 Combinatorial Category Grammar (CCG) is a ***** lexicalized grammar ***** formalism which is expressed by syntactic category, a logical form representation. | ||
| 1997.iwpt-1.6 We address the issue of how to associate frequency information with *****lexicalized grammar***** formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. | ||
| perception | 12 | |
| W19-2912 Inspired by the literature on multisensory integration, we develop a computational model to ground quantifiers in ***** perception *****. | ||
| D18-1289 In this paper, we present a crowdsourcing-based approach to model the human ***** perception ***** of sentence complexity. | ||
| L10-1223 LIPS provides an attractive alternative to costly multi-participant ***** perception ***** experiments by automatically computing IPs for arbitrary words. | ||
| W17-2811 However, very little research relates a combination of multimodal social signals and language features detected during spoken face-to-face human-robot interaction to the resulting user ***** perception ***** of a robot. | ||
| D19-1408 Existing methods based on supervised learning require a large amount of well-labelled training data, which is difficult to obtain due to inconsistent ***** perception ***** of fine-grained emotion intensity. | ||
| message | 12 | |
| L16-1200 The problem of understanding the stream of ***** message *****s exchanged on social media such as Facebook and Twitter is becoming a major challenge for automated systems. | ||
| P19-1239 It is insufficient to detect sarcasm from multi-model ***** message *****s based only on texts. | ||
| 2020.emnlp-main.512 Conversation disentanglement aims to separate intermingled ***** message *****s into detached conversations. | ||
| 2016.lilt-14.7 Moreover, it can be a disruptive factor in sentiment analysis and opinion mining, because it changes the polarity of a ***** message ***** implicitly. | ||
| 2020.emnlp-main.22 Linguistic steganography studies how to hide secret ***** message *****s in natural language cover texts. | ||
| summary evaluation | 12 | |
| 2021.newsum-1.6 For two different evaluation scenarios – evaluation against gold summaries and system output ratings – we show that ***** summary evaluation ***** is sensitive to protected attributes. | ||
| 2021.ranlp-1.98 Results show that, in most cases, GeSERA achieves higher correlations with manual evaluation methods than SERA, while it reduces its gap with ROUGE for general-domain ***** summary evaluation *****. | ||
| C18-1077 We present a new ***** summary evaluation ***** approach that does not require human model summaries. | ||
| C16-1024 In our experimental evaluation, we investigate the optimization of two information-theoretic ***** summary evaluation ***** metrics and find that our framework yields competitive results compared to several strong summarization baselines. | ||
| W17-1007 The present paper introduces a new MultiLing text ***** summary evaluation ***** method. | ||
| suicide risk | 12 | |
| W19-3022 This research, motivated by the CLPsych 2019 shared task, developed neural network-based methods for analyzing posts in one or more Reddit forums to assess the subject's ***** suicide risk *****. | ||
| W19-3024 It contributes to Shared Task A in the 2019 CLPsych workshop by predicting users' ***** suicide risk ***** given posts in the Reddit subforum r/SuicideWatch. | ||
| W18-0603 We report on the creation of a dataset for studying assessment of ***** suicide risk ***** via online postings in Reddit. | ||
| W19-3025 We focused primarily on Task A, which aimed to predict ***** suicide risk *****, as rated by a team of expert clinicians (Shing et al., 2018), based on language used in SuicideWatch posts on Reddit. | ||
| W19-3018 This paper describes our system submission for the CLPsych 2019 shared task B on *****suicide risk***** assessment. | ||
| category | 12 | |
| L06-1322 From our experimental results, we found that the correspondence between a group of adjectives and their ***** category ***** name was more suitable in our method than in the EDR lexicon. | ||
| 2020.semeval-1.159 To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for positive, negative and neutral ***** category ***** predictions. | ||
| 2021.smm4h-1.22 In our system, we use a transformer-based language model fine-tuning approach to automatically identify tweets in the self-reports ***** category *****. | ||
| W16-5322 Previous studies have shown that some pairs of antonyms are perceived to be better examples of opposition than others, and are so considered representative of the whole ***** category ***** (e.g., Deese, 1964; Murphy, 2003; Paradis et al., 2009). | ||
| 2021.sigdial-1.41 To clarify the boundaries of “openness”, we conduct two studies: First, we classify the types of “speech events” encountered in a chatbot evaluation data set (i.e., Meena by Google) and find that these conversations mainly cover the “small talk” ***** category ***** and exclude the other speech event categories encountered in real life human-human communication. | ||
| online media | 12 | |
| D19-1675 Controversial claims are abundant in ***** online media ***** and discussion forums. | ||
| D19-5316 The rising growth of fake news and misleading information through ***** online media ***** outlets demands an automatic method for detecting such news articles. | ||
| 2020.acl-main.50 In this paper, we propose a cascaded method that uses unsupervised learning to ascertain the stance of Twitter users with respect to a polarizing topic by leveraging their retweet behavior; then, it uses supervised learning based on user labels to characterize both the general political leaning of ***** online media ***** and of popular Twitter users, as well as their stance with respect to the target polarizing topic. | ||
| 2021.wassa-1.4 In this work, we focus on detecting sarcasm in textual conversations, written in English, from various social networking platforms and ***** online media *****. | ||
| 2021.ltedi-1.16 Due to the development of modern computer technology and the increase in the number of ***** online media ***** users, we can see all kinds of posts and comments everywhere on the internet. | ||
| usage | 12 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their ***** usage ***** between them arises often in digital humanities and computational social science. | ||
| 2020.wnut-1.52 Increasing ***** usage ***** of social media presents new non-traditional avenues for monitoring disease outbreaks, virus transmissions and disease progressions through user posts describing test results or disease symptoms. | ||
| L06-1227 The main ***** usage ***** scenario is the design of multilingual corpora. | ||
| C16-2013 However, the ***** usage ***** of established NLP frameworks is often hampered for several reasons: in most cases, they require basic to sophisticated programming skills, interfere with interoperability due to using non-standard I/O-formats and often lack tools for visualizing computational results. | ||
| L16-1449 On native test data the models perform very well, showing that we can model preposition ***** usage ***** appropriately. | ||
| semantic frame | 12 | |
| C16-1121 We present a successful collaboration of word embeddings and co-training to tackle in the most difficult test case of semantic role labeling: predicting out-of-domain and unseen ***** semantic frame *****s. | ||
| 2020.findings-emnlp.100 By collecting comparative adjectives from existing dictionaries and utilizing a ***** semantic frame *****work to catch comparative quantifiers, the semantics of clues concerning comparison structures are better understood, ensuring conversion to correct logic representation. | ||
| L08-1231 This paper reports on the design and construction of a bio-event annotated corpus which was developed with a specific view to the acquisition of ***** semantic frame *****s from biomedical corpora. | ||
| L10-1490 one tenth of them aligned with appropriate ***** semantic frame *****s, supports XML import and export and will be accessible, i.e., displayed and queried via the web. | ||
| 2021.acl-short.102 Recent studies on *****semantic frame***** induction show that relatively high performance has been achieved by using clustering-based methods with contextualized word embeddings. | ||
| distributional word representation | 12 | |
| E17-2082 *****Distributional word representations***** are widely used in NLP tasks. | ||
| I17-1024 To enhance the expression ability of *****distributional word representation***** learning model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. | ||
| C18-1020 *****Distributional word representations***** (often referred to as word embeddings) are omnipresent in modern NLP. | ||
| W17-2810 *****Distributional word representation***** methods exploit word co-occurrences to build compact vector encodings of words. | ||
| W17-1322 *****Distributional word representations***** such as word embeddings learned over large corpora have been shown to capture syntactic and semantic word relationships. | ||
| multi - modal neural machine translation | 12 | |
| 2020.acl-main.273 *****Multi-modal neural machine translation***** (NMT) aims to translate source sentences into a target language paired with images. | ||
| P19-1642 In this work, we propose to model the interaction between visual and textual features for *****multi-modal neural machine translation***** (MMT) through a latent variable model. | ||
| P17-1175 We introduce a *****Multi-modal Neural Machine Translation***** model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. | ||
| 2020.wat-1.11 For the challenge test data, our *****multi-modal neural machine translation***** system achieves Bilingual Evaluation Understudy (BLEU) score of 33.57, Rank-based Intuitive Bilingual Evaluation Score (RIBES) 0.754141, Adequacy-Fluency Metrics (AMFM) score 0.787320 and for evaluation test data, BLEU, RIBES, and AMFM scores of 40.51, 0.803208, and 0.820980 for English to Hindi translation respectively. | ||
| W17-2004 We conduct a human evaluation where we assess how a *****multi-modal neural machine translation***** (NMT) model compares to two text-only approaches: a conventional state-of-the-art attention-based NMT and a phrase-based statistical machine translation (PBSMT) model. | ||
| inductive transfer | 12 | |
| 2021.emnlp-main.740 *****Inductive transfer***** learning has taken the entire NLP field by storm, with models such as BERT and BART setting new state of the art on countless NLU tasks. | ||
| P18-1031 *****Inductive transfer***** learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. | ||
| P19-1060 We propose a hierarchical neural network trained in a multi-task fashion that learns to predict a document-level coherence score (at the network's top layers) along with word-level grammatical roles (at the bottom layers), taking advantage of *****inductive transfer***** between the two tasks. | ||
| W19-4307 The method is complementary to recent state-of-the-art approaches to *****inductive transfer***** via fine-tuning, and forgoes costly model architectures and annotation. | ||
| 2020.acl-main.206 Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of *****inductive transfer***** between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language identification (L1). | ||
| arrau corpus | 12 | |
| D17-1021 We also report first benchmark results on an abstract anaphora subset of the *****ARRAU corpus*****. | ||
| W18-0703 We present two systems for bridging resolution, which we submitted to the CRAC shared task on bridging anaphora resolution in the *****ARRAU corpus***** (track 2): a rule-based approach following Hou et al. 2014 and a learning-based approach. | ||
| 2020.coling-main.538 Evaluation on the gold annotated *****ARRAU corpus***** shows that our best model, which uses a combination of three auxiliary corpora, achieved F1 scores of 70% and 43.6% when evaluated in a lenient and strict setting, respectively, i.e., 11 and 21 percentage points gain when compared with our baseline. | ||
| 2020.crac-1.5 We test it on the *****ARRAU corpus*****, where we get 65.6 F1 CoNLL. | ||
| W18-0702 The *****ARRAU corpus***** is an anaphorically annotated corpus of English providing rich linguistic information about anaphora resolution. | ||
| multilingual grammar | 12 | |
| D19-1576 The key to *****multilingual grammar***** induction is to couple grammar parameters of different languages together by exploiting the similarity between languages. | ||
| 2020.lrec-1.347 We used Grammatical Framework (GF), a *****multilingual grammar***** formalism and a special- purpose functional programming language to formalise the descriptive grammar of these languages. | ||
| P19-1235 In order to find a better indicator for quality of induced grammars, this paper correlates several linguistically- and psycholinguistically-motivated predictors to parsing accuracy on a large *****multilingual grammar***** induction evaluation data set. | ||
| L12-1582 This paper describes an open-source Latvian resource grammar implemented in Grammatical Framework (GF), a programming language for *****multilingual grammar***** applications. | ||
| D19-1148 Unlike previous work on *****multilingual grammar***** induction, our approach does not rely on any external resource, such as parallel corpora, word alignments or linguistic phylogenetic trees. | ||
| Neural Machine Translation | 12 | |
| W19-5316 This paper describes the *****Neural Machine Translation***** system of IIIT-Hyderabad for the Gujarati-English news translation shared task of WMT19. | ||
| W17-3201 Recently, the attention mechanism plays a key role to achieve high performance for *****Neural Machine Translation***** models. | ||
| D19-5215 This paper describes the *****Neural Machine Translation***** systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks part of WAT-2019. | ||
| W19-5419 Transfer Learning and Selective data training are two of the many approaches being extensively investigated to improve the quality of *****Neural Machine Translation***** systems. | ||
| D19-5216 This paper describes the *****Neural Machine Translation***** systems of IIIT-Hyderabad (LTRC-MT) for the WAT 2019 Hindi-English shared task. | ||
| Asian | 12 | |
| W16-5409 This paper describes various Indonesian language resources that the Agency for the Assessment and Application of Technology (BPPT) has developed and collected since the mid 80's when we joined MMTS (Multilingual Machine Translation System), an international project coordinated by CICC-Japan to develop a machine translation system for five *****Asian***** languages (Bahasa Indonesia, Malay, Thai, Japanese, and Chinese). | ||
| W16-4619 Unlike European languages, many *****Asian***** languages like Chinese and Japanese do not have typographic boundaries in written system. | ||
| 2021.wat-1.13 In this paper, we introduce our TMU Neural Machine Translation (NMT) system submitted for the Patent task (Korean-Japanese and English-Japanese) of the 8th Workshop on *****Asian***** Translation (Nakazawa et al., 2021). | ||
| 2021.wat-1.14 In this paper, we describe our participation in the 2021 Workshop on *****Asian***** Translation (team ID: tpt_wat). | ||
| 2020.wat-1.9 In this paper we describe our team's (NICT-5) Neural Machine Translation (NMT) models whose translations were submitted to shared tasks of the 7th Workshop on *****Asian***** Translation. | ||
| Deep | 12 | |
| W17-2703 Recent methods for Event Detection focus on *****Deep***** Learning for automatic feature generation and feature ranking. | ||
| I17-5006 A coming tutorial on *****Deep***** Learning for Semantic Composition will be given in ACL2017. | ||
| W19-4306 Reduction of the number of parameters is one of the most important goals in *****Deep***** Learning. | ||
| D18-1215 *****Deep***** learning has emerged as a versatile tool for a wide range of NLP tasks, due to its superior capacity in representation learning. | ||
| 2021.naacl-tutorials.4 The advent of *****Deep***** Learning and the availability of large scale datasets has accelerated research on Natural Language Generation with a focus on newer tasks and better models. | ||
| hidden | 12 | |
| P17-2033 We propose a simple yet effective text-based user geolocation model based on a neural network with one *****hidden***** layer, which achieves state of the art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. | ||
| P19-1324 In neural network models of language, words are commonly represented using context-invariant representations (word embeddings) which are then put in context in the *****hidden***** layers. | ||
| 2020.acl-main.682 Approaches to Grounded Language Learning are commonly focused on a single task-based final performance measure which may not depend on desirable properties of the learned *****hidden***** representations, such as their ability to predict object attributes or generalize to unseen situations. | ||
| P18-2002 Increasing the capacity of recurrent neural networks (RNN) usually involves augmenting the size of the *****hidden***** layer, with significant increase of computational cost. | ||
| 2020.emnlp-main.717 Stance detection is an important component of understanding *****hidden***** influences in everyday life. | ||
| decision | 12 | |
| W17-6318 To improve grammatical function labelling for German , we augment the labelling component of a neural dependency parser with a *****decision***** history . | ||
| N18-1061 Temporal orientation refers to an individual 's tendency to connect to the psychological concepts of past , present or future , and it affects personality , motivation , emotion , *****decision***** making and stress coping processes . | ||
| D19-5602 Data scarcity is a long - standing and crucial challenge that hinders quick development of task - oriented dialogue systems across multiple domains : task - oriented dialogue models are expected to learn grammar , syntax , dialogue reasoning , *****decision***** making , and language generation from absurdly small amounts of task - specific data . | ||
| W18-5418 In this work , we analyze and interpret the cumulative nature of RNN via a proposed technique named as Layer - wIse - Semantic - Accumulation ( LISA ) for explaining decisions and detecting the most likely ( i.e. , saliency ) patterns that the network relies on while *****decision***** making . | ||
| D18-1128 A rich line of research attempts to make deep neural networks more transparent by generating human - interpretable ` explanations ' of their *****decision***** process , especially for interactive tasks like Visual Question Answering ( VQA ) . | ||
| Automatic Speech Recognition ( ASR ) | 12 | |
| 2020.wnut-1.18 Punctuation restoration is a common post - processing problem for *****Automatic Speech Recognition ( ASR )***** systems . | ||
| 2020.lrec-1.793 Low - resource languages suffer from lower performance of *****Automatic Speech Recognition ( ASR )***** system due to the lack of data . | ||
| 2021.rocling-1.9 *****Automatic Speech Recognition ( ASR )***** technology presents the possibility for medical professionals to document patient record , diagnosis , postoperative care , patrol records , and etc . | ||
| 2021.wnut-1.19 *****Automatic Speech Recognition ( ASR )***** systems generally do not produce punctuated transcripts . | ||
| 2020.emnlp-main.206 The cascade approach to Speech Translation ( ST ) is based on a pipeline that concatenates an *****Automatic Speech Recognition ( ASR )***** system followed by a Machine Translation ( MT ) system . | ||
| Word Sense | 12 | |
| 2020.lrec-1.706 Large sense - annotated datasets are increasingly necessary for training deep supervised systems in *****Word Sense***** Disambiguation . | ||
| R19-1061 *****Word Sense***** Disambiguation remains a challenging NLP task . | ||
| 2021.eacl-main.140 We present WiC - TSV , a new multi - domain evaluation benchmark for *****Word Sense***** Disambiguation . | ||
| D17-1008 Annotating large numbers of sentences with senses is the heaviest requirement of current *****Word Sense***** Disambiguation . | ||
| 2021.blackboxnlp-1.19 Tsetlin Machine ( TM ) is an interpretable pattern recognition algorithm based on propositional logic , which has demonstrated competitive performance in many Natural Language Processing ( NLP ) tasks , including sentiment analysis , text classification , and *****Word Sense***** Disambiguation . | ||
| Portuguese | 12 | |
| L16-1698 This paper presents some work on direct and indirect speech in Portuguese using corpus - based methods : we report on a study whose aim was to identify ( i ) *****Portuguese***** verbs used to introduce reported speech and ( ii ) syntactic patterns used to convey reported speech , in order to enhance the performance of a quotation extraction system , dubbed QUEMDISSE ? . | ||
| L12-1244 This paper presents CINTIL - QATreebank , a treebank composed of *****Portuguese***** sentences that can be used to support the development of Question Answering systems . | ||
| L08-1299 A frequent problem in automatic categorization applications involving *****Portuguese***** language is the absence of large corpora of previously classified documents , which permit the validation of experiments carried out . | ||
| W18-0534 In this paper we present NLI - PT , the first *****Portuguese***** dataset compiled for Native Language Identification ( NLI ) , the task of identifying an author 's first language based on their second language writing . | ||
| L14-1025 This paper presents NomLex - PT , a lexical resource describing *****Portuguese***** nominalizations . | ||
| Abstract Meaning Representation ( AMR ) | 12 | |
| D19-1548 Recent studies on AMR - to - text generation often formalize the task as a sequence - to - sequence ( seq2seq ) learning problem by converting an *****Abstract Meaning Representation ( AMR )***** graph into word sequences . | ||
| 2021.acl-long.73 Due to the scarcity of annotated data , *****Abstract Meaning Representation ( AMR )***** research is relatively limited and challenging for languages other than English . | ||
| W18-4912 Although English grammar encodes a number of semantic contrasts with tense and aspect marking , these semantics are currently ignored by *****Abstract Meaning Representation ( AMR )***** annotations . | ||
| 2021.acl-long.257 We present algorithms for aligning components of *****Abstract Meaning Representation ( AMR )***** graphs to spans in English sentences . | ||
| 2020.findings-emnlp.199 *****Abstract Meaning Representation ( AMR )***** parsing aims at converting sentences into AMR representations . | ||
| Natural Language Understanding ( NLU ) | 12 | |
| 2021.nlp4convai-1.12 User intent discovery is a key step in developing a *****Natural Language Understanding ( NLU )***** module at the core of any modern Conversational AI system . | ||
| 2020.acl-main.186 The *****Natural Language Understanding ( NLU )***** component in task oriented dialog systems processes a user 's request and converts it into structured information that can be consumed by downstream components such as the Dialog State Tracker ( DST ) . | ||
| N18-3017 This paper investigates the use of Machine Translation ( MT ) to bootstrap a *****Natural Language Understanding ( NLU )***** system for a new language for the use case of a large - scale voice - controlled device . | ||
| 2020.nlp4convai-1.1 One of the core components of voice assistants is the *****Natural Language Understanding ( NLU )***** model . | ||
| 2021.naacl-industry.39 This paper presents a production Semi - Supervised Learning ( SSL ) pipeline based on the student - teacher framework , which leverages millions of unlabeled examples to improve *****Natural Language Understanding ( NLU )***** tasks . | ||
| natural language ( NL ) | 12 | |
| 2020.acl-main.538 Open - domain code generation aims to generate code in a general - purpose programming language ( such as Python ) from *****natural language ( NL )***** intents . | ||
| 2020.coling-main.260 In Text - to - SQL semantic parsing , selecting the correct entities ( tables and columns ) for the generated SQL query is both crucial and challenging ; the parser is required to connect the *****natural language ( NL )***** question and the SQL query to the structured knowledge in the database . | ||
| 2014.lilt-9.11 The role of inference as it relates to *****natural language ( NL )***** semantics has often been neglected . | ||
| C16-2059 Words to express relations in *****natural language ( NL )***** statements may be different from those to represent properties in knowledge bases ( KB ) . | ||
| P19-1447 Semantic parsing considers the task of transducing *****natural language ( NL )***** utterances into machine executable meaning representations ( MRs ) . | ||
| transition | 12 | |
| N18-2066 Because the most common *****transition***** systems are projective , training a transition - based dependency parser often implies to either ignore or rewrite the non - projective training examples , which has an adverse impact on accuracy . | ||
| N19-1018 We introduce a novel *****transition***** system for discontinuous constituency parsing . | ||
| W18-6021 We present a general approach with reinforcement learning ( RL ) to approximate dynamic oracles for *****transition***** systems where exact dynamic oracles are difficult to derive . | ||
| D18-1264 In this paper , we propose a new rich resource enhanced AMR aligner which produces multiple alignments and a new *****transition***** system for AMR parsing along with its oracle parser . | ||
| L14-1609 Flag diacritics , which are special multi - character symbols executed at runtime , enable optimising finite - state networks by combining identical sub - graphs of its *****transition***** graph . | ||
| dialogue state tracking ( DST | 12 | |
| 2020.acl-main.563 Recent studies in *****dialogue state tracking ( DST***** ) leverage historical information to determine states which are generally represented as slot - value pairs . | ||
| P18-1134 We highlight a practical yet rarely discussed problem in *****dialogue state tracking ( DST***** ) , namely handling unknown slot values . | ||
| 2020.emnlp-main.243 Incompleteness of domain ontology and unavailability of some values are two inevitable problems of *****dialogue state tracking ( DST***** ) . | ||
| 2021.acl-long.135 This paper is concerned with *****dialogue state tracking ( DST***** ) in a task - oriented dialogue system . | ||
| 2021.emnlp-main.622 Zero - shot transfer learning for *****dialogue state tracking ( DST***** ) enables us to handle a variety of task - oriented dialogue domains without the expense of collecting in - domain data . | ||
| gold | 12 | |
| W18-6001 Detection and correction of errors and inconsistencies in *****gold***** treebanks are becoming more and more central topics of corpus annotation . | ||
| 2021.nodalida-main.31 We present an error analysis of neural UPOS taggers to evaluate why using *****gold***** tags has such a large positive contribution to parsing performance while using predicted UPOS either harms performance or offers a negligible improvement . | ||
| D19-6108 We present a novel framework to deal with relation extraction tasks in cases where there is complete lack of supervision , either in the form of *****gold***** annotations , or relations from a knowledge base . | ||
| P18-2058 Recent BIO - tagging - based neural semantic role labeling models are very high performing , but assume *****gold***** predicates as part of the input and can not incorporate span - level features . | ||
| R17-1104 Due to the lack of CWI datasets , previous works largely depend on Simple English Wikipedia and edit histories for obtaining ` *****gold***** standard ' annotations , which are of doubtable quality , and limited only to English . | ||
| direct | 12 | |
| 2020.lt4hala-1.15 Fictional prose can be broadly divided into narrative and discursive forms with *****direct***** speech being central to any discourse representation ( alongside indirect reported speech and free indirect discourse ) . | ||
| L16-1168 We propose a scheme for annotating *****direct***** speech in literary texts , based on the Text Encoding Initiative ( TEI ) and the coreference annotation guidelines from the Message Understanding Conference ( MUC ) . | ||
| C16-1202 collaborative filtering and matrix completion , are not designed to exploit the key information hidden in the text comments , while existing opinion mining methods do not provide *****direct***** support to recommendation systems with useful features on users and items . | ||
| 2021.emnlp-main.124 We present a simple but effective approach for leveraging Wikipedia for neural machine translation as well as cross - lingual tasks of image captioning and dependency parsing without using any *****direct***** supervision from external parallel data or supervised models in the target language . | ||
| 2021.acl-long.224 Five years after the first published proofs of concept , *****direct***** approaches to speech translation ( ST ) are now competing with traditional cascade solutions . | ||
| Semantic role labeling ( SRL | 12 | |
| N19-1340 *****Semantic role labeling ( SRL***** ) is a task to recognize all the predicate - argument pairs of a sentence , which has been in a performance improvement bottleneck after a series of latest works were presented . | ||
| 2020.emnlp-main.322 *****Semantic role labeling ( SRL***** ) is the task of identifying predicates and labeling argument spans with semantic roles . | ||
| C18-1233 *****Semantic role labeling ( SRL***** ) is to recognize the predicate - argument structure of a sentence , including subtasks of predicate disambiguation and argument labeling . | ||
| 2020.findings-emnlp.279 *****Semantic role labeling ( SRL***** ) identifies predicate - argument structure(s ) in a given sentence . | ||
| R19-1005 *****Semantic role labeling ( SRL***** ) is an important task for understanding natural languages , where the objective is to analyse propositions expressed by the verb and to identify each word that bears a semantic role . | ||
| different | 12 | |
| 2021.acl-demo.43 Various attack models have been proposed , which are enormously distinct and implemented with *****different***** programming frameworks and settings . | ||
| S17-2176 Clinical TempEval 2017 addressed the problem of temporal reasoning in the clinical domain by providing annotated clinical notes , pathology and radiology reports in line with Clinical TempEval challenges 2015/16 , across two *****different***** evaluation phases focusing on cross domain adaptation . | ||
| 2021.wmt-1.97 We further improve the base classifier by ( i ) adding a weighted sampler to deal with unbalanced data and ( ii ) introducing feature engineering , where features related to toxicity , named - entities and sentiment , which are potentially indicative of critical errors , are extracted using existing tools and integrated to the model in *****different***** ways . | ||
| L16-1687 To handle word senses as fuzzy objects , we exploit the graph structure of synonymy pairs acquired from *****different***** sources to discover synsets where words have different membership degrees that reflect confidence . | ||
| L14-1422 First , it requires a representation framework making it possible to compare , and eventually merge , *****different***** annotation schema . | ||
| Glove | 11 | |
| 2020.semeval-1.254 We also experimented with three types of static word embeddings: word2vec, FastText, and ***** Glove *****, in addition to emoji embeddings, and compared the performance of the different deep learning models on the dataset provided by this task. | ||
| 2020.lrec-1.231 Three standard word embedding models, namely, Word2Vec (both Skipgram and CBOW), FastText, and ***** Glove ***** are evaluated under two types of evaluation methods: intrinsic evaluation and extrinsic evaluation. | ||
| 2021.naacl-main.332 This work presents a novel neural topic modeling framework using multi-view embedding spaces: (1) pretrained topic-embeddings, and (2) pretrained word-embeddings (context-insensitive from ***** Glove ***** and context-sensitive from BERT models) jointly from one or many sources to improve topic quality and better deal with polysemy. | ||
| 2021.acl-long.198 Our framework achieves state-of-the-art results (62.8% with ***** Glove *****, 72.0% with Electra) on the cross-domain text-to-SQL benchmark Spider at the time of writing. | ||
| S18-1093 We proposed a Siamese neural network for irony detection, which consists of two subnetworks, each containing a long short term memory layer (LSTM) and an embedding layer initialized with vectors from ***** Glove ***** word embedding. | ||
| usefulness | 11 | |
| L06-1307 The final aim of the paper is contributing to the debate about ***** usefulness ***** of computational lexicons in NLP, by providing evidence from the point of view of a particular application. | ||
| 2008.amta-govandcom.19 At the request of the USG National Virtual Translation Center, the University of Maryland Center for Advanced Study of Language conducted a study that assessed the role of several factors mediating transcript ***** usefulness ***** during translation tasks. | ||
| 2020.eval4nlp-1.16 Our results provide a basis for best practices for crowd-based summarization evaluation regarding major influential factors such as the best annotation aggregation method, the influence of readability and reading effort on summarization evaluation, and the optimal number of crowd workers to achieve comparable results to experts, especially when determining factors such as overall quality, grammaticality, referential clarity, focus, structure & coherence, summary ***** usefulness *****, and summary informativeness. | ||
| 2006.amta-papers.2 The model's ***** usefulness ***** is, however, limited by the computational complexity of estimating parameters at the phrase level. | ||
| 2020.lrec-1.31 Correlating results of crowd and laboratory ratings reveals high applicability of crowdsourcing for the factors overall quality, grammaticality, non-redundancy, referential clarity, focus, structure & coherence, summary ***** usefulness *****, and summary informativeness | ||
| reliably | 11 | |
| P19-1654 Even if these human reference descriptions are not available, VIFIDEL can still ***** reliably ***** evaluate system descriptions. | ||
| L08-1575 We show that textual connectors that link such textual units ***** reliably ***** predict different types of texts, such as information and opinion: using only textual connectors as features, an SVM classifier achieves an F-score of between 0.85 and 0.93 for predicting these classes. | ||
| 2020.smm4h-1.14 However, methods for automatic extraction of Adverse Drug Reactions from social media platforms such as Twitter still need further development before they can be included ***** reliably ***** in routine pharmacovigilance practices. | ||
| 2020.lrec-1.132 People can extract precise, complex logical meanings from text in documents such as tax forms and game rules, but language processing systems lack adequate training and evaluation resources to do these kinds of tasks ***** reliably *****. | ||
| 2020.bea-1.4 With this shift to automated approaches it is important that systems ***** reliably ***** assess all aspects of a candidate's responses | ||
| predictability | 11 | |
| W19-4713 The method utilizes the principles of coarticulation, local ***** predictability ***** and statistical phonological constraints to predict phonetic features by the features of their immediate phonetic environment. | ||
| 2020.lrec-1.141 We summarize and explore the current dataset, illustrate its potential by providing new evidence for the relation between ***** predictability ***** and implicitness – capitalizing on the already existing PDTB-style annotations for the texts we use – and outline its potential for future research. | ||
| N19-1413 Results show significant effects of word frequency and ***** predictability ***** in isolation but no effect of frequency over and above ***** predictability *****, and thus do not provide evidence of distinct mechanisms. | ||
| 2021.cmcl-1.28 Expectation-based theories of sentence processing posit that processing difficulty is determined by ***** predictability ***** in context. | ||
| 2020.lrec-1.180 To examine factors that can cause biases, we take an empirical analysis of demographic ***** predictability ***** on the English corpus | ||
| porting | 11 | |
| R19-1027 We describe work consisting in ***** porting ***** various morphological resources to the OntoLex-Lemon model. | ||
| 2007.iwslt-1.14 During this evaluation our efforts focused on the rapid ***** porting ***** of our SMT system to a new language (Arabic) and novel approaches to translation from speech input. | ||
| L14-1060 The aim of this research is to make existing framenets computationally accessible for multilingual natural language applications via a common semantic grammar API, and to facilitate the ***** porting ***** of such grammar to other languages. | ||
| L08-1225 Such mismatches highlight differences in the definitions of tags which are crucial when ***** porting ***** technology from one annotation scheme to another. | ||
| W19-5104 We describe work consisting in ***** porting ***** two large German lexical resources into the OntoLex-Lemon model in order to establish complementary interlinkings between them | ||
| f1 | 11 | |
| 2021.smm4h-1.16 Our approach performed well on the normalization task, achieving an above average ***** f1 ***** score of 24%, but less so on classification and extraction, with ***** f1 ***** scores of 22% and 37% respectively. | ||
| S19-2141 Our best classifier for identifying offensive tweets for SubTask A (Classifying offensive vs. nonoffensive) has an accuracy of 83.14% and a ***** f1 *****-score of 0.7565 on the actual test data. | ||
| 2020.trac-1.14 We obtained ***** f1 ***** score of 43.10%, 59.45% and 44.84% respectively for English, Hindi and Bengali. | ||
| S19-2037 We propose a three layer model with a generic, multi-purpose approach that without any task specific optimizations achieve competitive results (***** f1 ***** score of 0.7096) in the EmoContext task. | ||
| R17-1092 We report results of 96% ***** f1 ***** score in predicting a case ruling, 90% ***** f1 ***** score in predicting the law area of a case, and 75.9% ***** f1 ***** score in estimating the time span when a ruling has been issued using a linear Support Vector Machine (SVM) classifier trained on lexical features | ||
| replication | 11 | |
| 2020.findings-emnlp.132 In this paper, we propose two weakly supervised learning approaches that use automatically extracted text information of research papers to improve the prediction accuracy of research ***** replication ***** using both labeled and unlabeled datasets. | ||
| 2021.inlg-1.32 Then, we document how we approached our ***** replication ***** of the paper's human evaluation. | ||
| 2021.eval4nlp-1.5 We describe SeqScore, which addresses many of the issues that cause ***** replication ***** failures. | ||
| P19-1267 We conduct ***** replication ***** and reproduction experiments with nine part-of-speech taggers published between 2000 and 2018, each of which claimed state-of-the-art performance on a widely-used “standard split”. | ||
| 2020.lrec-1.173 We also propose, in addition to this corpus, a complete benchmarking platform to stimulate and fairly compare scientific works around the problem of content abuse detection, trying to avoid the recurring problem of result ***** replication ***** | ||
| concretely | 11 | |
| P19-1578 More ***** concretely *****, our approach utilizes the off-the-shelf confusionset for guiding the character generation. | ||
| 2017.iwslt-1.18 More ***** concretely ***** we train neuralized versions of lexicalized reordering [1] and the operation sequence models [2] using feed-forward neural network. | ||
| 2021.naacl-main.1 More ***** concretely *****, we hypothesize KG entities may be more complex than we think, i.e., an entity may wear many hats and relational triplets may form due to more than a single reason. | ||
| 2021.ranlp-1.128 More ***** concretely *****, we study the task of polarity detection for the Czech language on three sentiment polarity datasets. | ||
| 2020.coling-main.232 More ***** concretely *****, we first introduce a novel graph-based iterative knowledge retrieval module, which iteratively retrieves concepts and entities related to the given question and its choices from multiple knowledge sources | ||
| SubTask | 11 | |
| S19-2141 Our best classifier for identifying offensive tweets for ***** SubTask ***** A (Classifying offensive vs. nonoffensive) has an accuracy of 83.14% and a f1-score of 0.7565 on the actual test data. | ||
| S18-1114 ***** SubTask ***** 1 classifies if a sentence is useful for inferring malware actions and capabilities, and ***** SubTask ***** 2 predicts token labels (“Action”, “Entity”, “Modifier” and “Others”) for a given malware-related sentence. | ||
| 2020.semeval-1.207 Our best model, an average ensemble of four different Bert models, achieved 11th place out of 82 participants with a macro F1 score of 0.91344 in the English ***** SubTask ***** A. | ||
| S19-2212 The final submission achieves 14th place in Task 9, ***** SubTask ***** A with the accuracy of 0.6776. | ||
| S19-2218 This paper describes the suggestion miner system that participates in SemEval 2019 Task 9 - ***** SubTask ***** A - Suggestion Mining from Online Reviews and Forums | ||
| exhaustive | 11 | |
| W18-4517 Resources compiling names of persons can be available, but no ***** exhaustive ***** lists exist. | ||
| C18-1236 Instead of being ***** exhaustive *****, we show selected key challenges were a successful application of NLP techniques would facilitate the automation of particular tasks that nowadays require a significant effort to accomplish. | ||
| 2020.bionlp-1.9 Recent advances in embedding methods have shown promising results for several clinical tasks, yet there is no ***** exhaustive ***** comparison of such approaches with other commonly used word representations and classification models. | ||
| 2020.wnut-1.38 We present a neural ***** exhaustive ***** approach that addresses named entity recognition (NER) and relation recognition (RE), for the entity and relation recognition over the wet-lab protocols shared task. | ||
| 2020.acl-main.722 However, designing such features for low-resource languages is challenging, because ***** exhaustive ***** entity gazetteers do not exist in these languages | ||
| bounded | 11 | |
| S17-2013 We used random forest ensemble learning to map an expandable set of extracted pairwise features into a semantic similarity estimated value ***** bounded ***** between 0 and 5. | ||
| 2021.clpsych-1.2 However, progress in the domain remains ***** bounded ***** by the availability of adequate data. | ||
| 2020.acl-main.198 We present numerical, theoretical and empirical analyses which show that words on the interior of the convex hull in the embedding space have their probability ***** bounded ***** by the probabilities of the words on the hull. | ||
| 2020.emnlp-main.685 We argue that keeping all entities in memory is unnecessary, and we propose a memory-augmented neural network that tracks only a small ***** bounded ***** number of entities at a time, thus guaranteeing a linear runtime in length of document. | ||
| D18-1292 Moreover, parsing results on English, Chinese and German show that this ***** bounded ***** model is able to produce parse trees more accurately than or competitively with state-of-the-art constituency grammar induction models | ||
| PDF | 11 | |
| 2021.acl-demo.31 This presents a challenge to NLP practitioners who wish to use the information contained within ***** PDF ***** documents for training models or data analysis, because annotating these documents is difficult. | ||
| L16-1426 The dictionary was converted from ***** PDF ***** into XML and senses were automatically identified and annotated. | ||
| C16-2029 Our system provides ways to extract natural language sentences from ***** PDF ***** files together with their logical structures, and also to map arbitrary textual spans to their corresponding regions on page images. | ||
| 2021.conll-1.22 We provide six benchmarks that cover three use cases (OCR errors, text extraction from ***** PDF *****, human errors) and the cases of partially correct space information and all spaces missing | ||
| R19-1001 In this paper, we present a relationship extraction based methodology for table structure recognition in *****PDF***** documents. | ||
| neologism | 11 | |
| I17-1058 The problem of blend formation in generative linguistics is interesting in the context of ***** neologism *****, their quick adoption in modern life and the creative generative process guiding their formation. | ||
| W17-2301 We then score this lattice according to various features, and attempt to determine whether the anomalous production represented a phonemic error or a genuine ***** neologism *****. | ||
| L12-1653 This article describes a modified POS tagger that explicitly considers new tags for known words, hence making it better fit for ***** neologism ***** research. | ||
| L14-1260 In this paper we present a statistical machine learning approach to formal ***** neologism ***** detection going some way beyond the use of exclusion lists. | ||
| C18-1017 Our models capture patterns of ***** neologism ***** usage over time to date texts, provide insights into temporal locality of word usage over a span of 150 years, and generalize to various domains like News, Fiction, and Non-Fiction with competitive performance | ||
| avatar | 11 | |
| 2021.mtsummit-at4ssl.8 The joint coordinates of hands and arms are imported as landmarks to control the skeleton of our ***** avatar *****. | ||
| 2020.signlang-1.10 It continues to show that with a small number of services in a signing ***** avatar *****, these descriptions can be synthesized in a natural way that captures the essential gestural actions while also including the subtleties of human motion that make the signing legible. | ||
| 2020.acl-demos.1 This paper proposes the building of Xiaomingbot, an intelligent, multilingual and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading and ***** avatar ***** animation. | ||
| 2020.sigdial-1.16 In this work we tackle spatial question answering in a holistic way, using a vision system, speech input and output mediated by an animated ***** avatar *****, a dialogue system that robustly interprets spatial queries, and a constraint solver that derives answers based on 3-D spatial modeling. | ||
| 2021.nlp4posimpact-1.16 In this paper we seek to address this limitation via a conversational agent that adopts one aspect of in-person doctor-patient interactions: A human ***** avatar ***** to facilitate medical grounded question answering | ||
| REST | 11 | |
| 2020.ldl-1.8 In this paper, we propose a ***** REST ***** API that enables the participation of downstream alignment services in the process orchestrated by MAPLE, helping them self-adapt in order to handle heterogeneous alignment tasks and scenarios. | ||
| 2021.hackashop-1.5 The best performing model (offensive content classifier) is available online as a ***** REST ***** API. | ||
| L14-1159 The resulting resource is available online for lookup, editing, download and remote programming via a ***** REST ***** API on a Jibiki platform. | ||
| D19-3020 The system can be easily integrated as a service into existing tools and platforms used by journalists using a ***** REST ***** API | ||
| W18-6560 We present a readily available API that solves the morphology component for surface realizers in 10 languages (e.g., English, German and Finnish) for any topic and is available as *****REST***** API. | ||
| recursion | 11 | |
| 1993.iwpt-1.16 In this paper we describe a phenomenon present in some context-free grammars, called hidden left ***** recursion *****. | ||
| W18-6429 Our systems are based on attentional sequence-to-sequence models with some form of ***** recursion ***** and self-attention. | ||
| 2020.findings-emnlp.384 Surprisingly, the performance of SA networks is at par with LSTMs, which provides evidence on the ability of SA to learn hierarchies without ***** recursion *****. | ||
| C16-1212 However, there remains a major limitation: this line of work completely ignores syntax and ***** recursion *****, which is helpful in many traditional efforts. | ||
| 2021.acl-long.292 This suggested that natural language can be approximated well with models that are too weak for formal languages, or that the role of hierarchy and ***** recursion ***** in natural language might be limited | ||
| dialogic | 11 | |
| P19-1460 Existing methods focus mainly on conversational settings, where ***** dialogic ***** features are used for (dis)agreement inference. | ||
| D18-1252 In this paper, we address this collaborative nature to improve ***** dialogic ***** reference resolution in two ways: First, we trained a words-as-classifiers logistic regression model of word semantics and incrementally adapt the model to idiosyncratic language between dyad partners during evaluation of the dialog. | ||
| L16-1504 It can also be used for studies of entrainment in dialogue, and the form and style of pedestrian instruction dialogues, as well as the effect of friendship on ***** dialogic ***** behaviors. | ||
| W19-2504 According to the literary theory of Mikhail Bakhtin, a ***** dialogic ***** novel is one in which characters speak in their own distinct voices, rather than serving as mouthpieces for their authors. | ||
| P18-1140 Therefore, we propose to include user sentiment obtained through multimodal information (acoustic, ***** dialogic ***** and textual), in the end-to-end learning framework to make systems more user-adaptive and effective | ||
| chronological | 11 | |
| 2020.louhi-1.11 We proposed a systematic methodology for learning from ***** chronological ***** events available in clinical notes. | ||
| 2021.lchange-1.1 We observe that this helps to bridge the linguistic gap as ***** chronological ***** context is also used as auxiliary information. | ||
| 2020.sdp-1.3 We further analyze ***** chronological ***** trends of acknowledgement entities in CORD-19 papers. | ||
| 2021.hackashop-1.12 However, many comment sections present ***** chronological ***** ranking to all users. | ||
| 2021.emnlp-main.683 Correctly ordering the sentences requires an understanding of coherence with respect to the ***** chronological ***** sequence of events described in the text | ||
| verifies | 11 | |
| 2021.emnlp-main.663 Extensive analysis ***** verifies ***** that the document graph is beneficial for capturing discourse phenomena. | ||
| 2021.acl-long.268 The human evaluation further ***** verifies ***** that our approaches improve translation adequacy as well as fluency. | ||
| W18-6113 Empirical analysis ***** verifies ***** that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and very sparse models. | ||
| S18-2003 We provide a broad qualitative description of the dataset, and a series of standard classification experiments ***** verifies ***** the quantitative reliability of the presented resource. | ||
| 2020.nlpmc-1.9 Our preliminary results ***** verifies ***** the utility of the proposed conversation-based measures in distinguishing MCI from healthy controls | ||
| corrupted | 11 | |
| 2021.wanlp-1.20 Which, instead of training a model to recover masked tokens, it trains a discriminator model to distinguish true input tokens from ***** corrupted ***** tokens that were replaced by a generator network. | ||
| 2021.nodalida-main.28 If model accuracy on the ***** corrupted ***** data remains high, then the dataset is likely to contain statistical biases and artefacts that guide prediction. | ||
| D19-1430 Next, a model is trained on a noised version of the concatenated synthetic bitext where each source sequence is randomly ***** corrupted *****. | ||
| P19-1561 Trained to recognize words ***** corrupted ***** by random adds, drops, swaps, and keyboard mistakes, our method achieves 32% relative (and 3.3% absolute) error reduction over the vanilla semi-character model. | ||
| E17-2004 In this work, we propose a linguistically-motivated approach for training robust models based on exposing the model to ***** corrupted ***** text examples at training time | ||
| preceding | 11 | |
| 1994.bcs-1.8 Probably the most important aspect in successfully analysing multisentential source texts is the capacity to establish the anaphoric references to ***** preceding ***** discourse entities. | ||
| C16-1177 Our approach resembles how human beings process the task, i.e., decide the information status of the current discourse entity based on its ***** preceding ***** context. | ||
| 2021.naacl-main.106 For the learning process, the system should incrementally learn new classes round by round without re-training on the examples of ***** preceding ***** classes; (ii) For the performance, the system should perform well on new classes without much loss on ***** preceding ***** classes. | ||
| L10-1264 Inter-vocalic duration profile results show long inter-vocalic duration between determiner vowel and ***** preceding ***** word vowel. | ||
| L16-1359 Automatic Term Extraction (ATE) or Recognition (ATR) is a fundamental processing step ***** preceding ***** many complex knowledge engineering tasks | ||
| distinct | 11 | |
| 2021.woah-1.21 The task include two subtasks relating to ***** distinct ***** challenges in the fine-grained detection of hateful memes: (1) the protected category attacked by the meme and (2) the attack type. | ||
| D19-5622 In this paper, we argue that using the same global attention in multiple heads limits multi-head self-attention's capacity for learning ***** distinct ***** features. | ||
| 2014.lilt-11.4 One compelling kind of evidence for the autonomy of a language's morphology is the incidence of inflectional polyfunctionality, the systematic use of the same morphology to express ***** distinct ***** but related morphosyntactic content. | ||
| 1963.earlymt-1.29 Employment of the second technique, which is based on periodic comparison of the current prediction pool with pools formed on earlier productive paths, eliminates repeated analysis of identical right-hand segments which belong to ***** distinct ***** paths | ||
| W19-5421 We approached this using transfer learning to obtain a series of strong neural models on ***** distinct ***** domains, and combining them into multi-domain ensembles. | ||
| clusters | 11 | |
| 2019.gwc-1.50 We investigate how vertical polysemy forms polysemy structures (or sense ***** clusters *****) in semantic hierarchies of the wordnets. | ||
| P19-1245 We train a neural network model to make predictions about which ***** clusters ***** contain activities that were performed by a given user based on the text of their previous posts and self-description. | ||
| 2020.nlp4convai-1.9 We compare this method to a random baseline that randomly assigns templates to ***** clusters ***** as well as a strong baseline that performs the sentence encoding and the utterance clustering sequentially. | ||
| P17-1087 We validate our ***** clusters ***** using datasets containing human judgments of word pair similarities and show the benefit of using our word ***** clusters ***** for sentiment prediction. | ||
| P19-1276 We consider open domain event extraction, the task of extracting unconstraint types of events from news ***** clusters ***** | ||
| glyph | 11 | |
| 2021.acl-long.161 Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: ***** glyph ***** and pinyin, which carry significant syntax and semantic information for language understanding. | ||
| D17-1025 The character ***** glyph ***** features are directly learned from the bitmaps of characters by convolutional auto-encoder(convAE), and the ***** glyph ***** features improve Chinese word representations which are already enhanced by character embeddings. | ||
| D19-1225 The underlying generative model combines these factors through an asymmetric transpose convolutional process to generate the image of the ***** glyph ***** itself. | ||
| D19-1640 The task of Chinese text spam detection is very challenging due to both ***** glyph ***** and phonetic variations of Chinese characters. | ||
| 2020.acl-main.266 We propose a deep and interpretable probabilistic generative model to analyze ***** glyph ***** shapes in printed Early Modern documents. | ||
| consequently | 11 | |
| K18-1004 However, entities in real-world are often involved in many different relationships, ***** consequently ***** entity relations are very dynamic over time. | ||
| 2020.coling-main.609 Due to the compelling improvements brought by BERT, many recent representation models adopted the Transformer architecture as their main building block, ***** consequently ***** inheriting the wordpiece tokenization system despite it not being intrinsically linked to the notion of Transformers. | ||
| D18-1132 We find that each model has its own specialty in solving problems, ***** consequently ***** an ensemble model is then proposed to combine their advantages. | ||
| 2021.bucc-1.5 Moreover, our approach is relatively language independent and can ***** consequently ***** be ported quickly (and hence cost-effectively) from one language to another, requiring only minor language-specific tailoring. | ||
| W18-6548 When a set of consumer products is large and varied, it can be difficult for a consumer to understand how the products in the set differ; ***** consequently *****, it can be challenging to choose the most suitable product from the set | ||
| mediated | 11 | |
| L08-1198 Three of the genres were computer ***** mediated *****: email, blog, and chat, and three non-computer-***** mediated *****: essay, interview, and discussion. | ||
| L12-1477 Machine translation ***** mediated ***** communication plays a more and more important role in international collaboration. | ||
| L16-1096 Although there are studies that used replications of the map task to investigate communication in computer ***** mediated ***** tasks, this ILMT-s2s corpus is, to the best of our knowledge, the first investigation of communicative behaviour in the presence of three additional “filters”: Automatic Speech Recognition (ASR), Machine Translation (MT) and Text To Speech (TTS) synthesis, where the instruction giver and the instruction follower speak different languages. | ||
| P18-1164 Source and target words are at the two ends of a long information processing procedure, ***** mediated ***** by hidden states at both the source encoding and the target decoding phases. | ||
| 2020.sigdial-1.16 In this work we tackle spatial question answering in a holistic way, using a vision system, speech input and output ***** mediated ***** by an animated avatar, a dialogue system that robustly interprets spatial queries, and a constraint solver that derives answers based on 3-D spatial modeling | ||
| selecting | 11 | |
| E17-1063 Instead, we formalize dependency parsing as the problem of independently ***** selecting ***** the head of each word in a sentence. | ||
| 2020.acl-main.86 We show that a simple dynamic sampling strategy, ***** selecting ***** instances for training proportional to the multi-task model's current performance on a dataset relative to its single task performance, gives substantive gains over prior multi-task sampling strategies, mitigating the catastrophic forgetting that is common in multi-task learning. | ||
| N19-1340 Furthermore, we compare several associated sentences ***** selecting ***** strategies and label merging methods in AMN to find and utilize the label of associated sentences while attending them. | ||
| C16-1163 In particular, we use the attention weights for both ***** selecting ***** entire sentences and their subparts, i.e., word/chunk, from shallow syntactic trees. | ||
| C16-1054 We propose different approaches to incorporate information from public posts, including using frequency information from the posts to re-estimate bigram weights in the ILP-based summarization model and to re-weight a dependency tree edge's importance for sentence compression, directly ***** selecting ***** sentences from posts as the final summary, and finally a strategy to combine the summarization results generated from news articles and posts | ||
| interdependent | 11 | |
| W16-3902 These tasks often require structured learning models, which make predictions on multiple ***** interdependent ***** variables. | ||
| D18-1158 However, events in one sentence are usually ***** interdependent ***** and sentence-level information is often insufficient to resolve ambiguities for some types of events. | ||
| 2021.conll-1.33 Semantics, morphology and syntax are strongly ***** interdependent *****. | ||
| 2020.acl-main.438 Various natural language processing tasks are structured prediction problems where outputs are constructed with multiple ***** interdependent ***** decisions. | ||
| Q16-1019 These ***** interdependent ***** sentence pair representations are more powerful than isolated sentence representations | ||
| max | 11 | |
| 2020.acl-main.267 In conventional pooling methods such as average, ***** max ***** and attentive pooling, text representations are weighted summations of the L1 or L norm of input features. | ||
| N19-1184 In addition, we find that sigmoidal attention weights with ***** max ***** pooling achieves better performance over the commonly used weighted average attention in this setup. | ||
| 2020.wnut-1.46 Our approach is based on exploiting semantic information from both ***** max ***** pooling and average pooling, to this end we propose two models. | ||
| C18-1154 We propose vector-based multi-head attention that includes the widely used ***** max ***** pooling, mean pooling, and scalar self-attention as special cases. | ||
| P18-4022 Promising preliminary results on ***** max ***** | ||
| troll | 11 | |
| L10-1506 The corpus is categorized in a community-driven process according to the following tags: funny, informative, insightful, offtopic, flamebait, interesting and ***** troll *****. | ||
| 2021.dravidianlangtech-1.24 The dataset consists of ***** troll ***** and non-***** troll ***** images with their captions as texts. | ||
| D19-5003 Social media has reportedly been (ab)used by Russian ***** troll ***** farms to promote political agendas. | ||
| 2021.dravidianlangtech-1.16 Amongst all the forms of ***** troll ***** content, memes are most prevalent due to their popularity and ability to propagate across cultures | ||
| 2020.lrec-1.766 During the past several years, a large amount of ***** troll ***** accounts has emerged with efforts to manipulate public opinion on social network sites. | ||
| solving | 11 | |
| 2020.emnlp-main.624 We take a novel perspective of IF game ***** solving ***** and re-formulate it as Multi-Passage Reading Comprehension (MPRC) tasks. | ||
| 2020.acl-main.502 The blanks require joint ***** solving ***** and significantly impair each other's context. | ||
| E17-1073 Specifically, this algorithm learns a word embedding matrix in tandem with the classifier parameters in an online fashion, ***** solving ***** a bi-convex constrained optimization at each iteration. | ||
| E17-2017 Our experimental results show that our neural model outperforms a baseline as well as humans ***** solving ***** the same task, suggesting that computational models are able to better capture the underlying semantics of emojis. | ||
| 2020.findings-emnlp.115 Our approach is highly flexible, requires no task-specific train- ing, and leverages efficient constraint satisfaction ***** solving ***** techniques | ||
| uncorrelated | 11 | |
| 2020.semeval-1.102 Our quantitative results also showed that images and text were ***** uncorrelated *****. | ||
| 2016.iwslt-1.8 We experiment with three different scenarios using, i) French, as a source language ***** uncorrelated ***** to the target language, ii) Ukrainian, as a source language correlated to the target one and finally iii) English as a source language ***** uncorrelated ***** to the target language using a relatively large amount of data in respect to the other two scenarios. | ||
| 2020.cogalex-1.1 To test whether individual corpora can make better predictions for a cognitive task of long-term memory retrieval, we generated stimulus materials consisting of 134 sentences with ***** uncorrelated ***** individual and norm-based word probabilities. | ||
| N18-1146 We formalize this need as a new task: inducing a lexicon that is predictive of a set of target variables yet ***** uncorrelated ***** to a set of confounding variables. | ||
| C18-1211 We show that, if the word embed- dings are standardised and ***** uncorrelated *****, such an operator will be independent of bilinear terms, and can be simplified to a linear form, where PairDiff is a special case | ||
| Implicit Emotion Shared | 11 | |
| W18-6234 EmotiKLUE is a submission to the ***** Implicit Emotion Shared ***** Task. | ||
| W18-6230 This approach is ranked 8th in ***** Implicit Emotion Shared ***** Task (IEST) at WASSA-2018. | ||
| W18-6233 ***** Implicit Emotion Shared ***** Task. | ||
| W18-6227 ***** Implicit Emotion Shared ***** Task (IEST 2018). | ||
| W18-6232 ***** Implicit Emotion Shared ***** Task | ||
| UD annotation | 11 | |
| 2021.americasnlp-1.14 Implications for ***** UD annotation ***** of other polysynthetic languages are discussed. | ||
| W17-1407 The paper documents the procedure of building a new Universal Dependencies (UDv2) treebank for Serbian starting from an existing Croatian UDv1 treebank and taking into account the other Slavic ***** UD annotation ***** guidelines. | ||
| 2020.framenet-1.9 The PropBank annotation layer of such a multi-layer corpus can be semi-automatically derived from the existing FrameNet and ***** UD annotation ***** layers, by providing a mapping configuration from lexical units in [a non-English language] FrameNet to [English language] PropBank predicates, and a mapping configuration from FrameNet frame elements to PropBank semantic arguments for the given pair of a FrameNet frame and a PropBank predicate. | ||
| W18-6022 We discuss the composition of the corpus, challenges in adapting the ***** UD annotation ***** scheme to existing conventions for annotating Coptic, and evaluate inter-annotator agreement on ***** UD annotation ***** for the language. | ||
| W18-6002 We show that some of those differences that arise can be diminished by using parallel treebanks and, more importantly from the practical point of view, by harmonizing the language-specific solutions in the ***** UD annotation ***** | ||
| romanized | 11 | |
| 2020.lrec-1.294 We additionally provide baseline results on several tasks made possible by the dataset, including single word transliteration, full sentence transliteration, and language modeling of native script and ***** romanized ***** text. | ||
| 2021.dravidianlangtech-1.3 In this paper, we explored the zero-shot learning and few-shot learning paradigms based on multilingual language models for offensive speech detection in code-mixed and ***** romanized ***** variants of three Dravidian languages - Malayalam, Tamil, and Kannada. | ||
| L08-1315 This paper describes a syllabification based conversion method for converting ***** romanized ***** Persian text to the traditional Arabic-based writing system. | ||
| 2014.amta-researchers.25 We present a machine translation engine that can translate ***** romanized ***** Arabic, often known as Arabizi, into English. | ||
| 2021.sigmorphon-1.22 We analyze the distributions of different error classes using two unsupervised tasks as testbeds: converting informally ***** romanized ***** text into the native script of its language (for Russian, Arabic, and Kannada) and translating between a pair of closely related languages (Serbian and Bosnian) | ||
| branching | 11 | |
| I17-1095 In this paper, we model the document revision detection problem as a minimum cost ***** branching ***** problem that relies on computing document distances. | ||
| W89-0210 In a natural language processing system, a large amount of ambiguity and a large ***** branching ***** factor are hindering factors in obtaining the desired analysis for a given sentence in a short time. | ||
| W19-2901 This paper presents a formal, sound and complete parser for Minimalist Grammars whose search space contains ***** branching ***** points that we can identify as the locus of the decision to perform this kind of active gap-finding. | ||
| 2020.findings-emnlp.401 Experiments show that several existing works exhibit ***** branching ***** biases, and some implementations of these three factors can introduce the ***** branching ***** bias. | ||
| L12-1572 Here, we focus specifically on the construction of advanced workflows, involving multiple ***** branching ***** and merging points, to facilitate various comparative evalutions | ||
| singletons | 11 | |
| 2020.crac-1.5 This approach is simple and universal, compatible with any language or dataset (regardless of ***** singletons *****) and easier to integrate with current language models architectures. | ||
| L14-1074 By preserving ***** singletons *****, we were able to use Kneser-Ney smoothing to build large language models. | ||
| 2010.iwslt-papers.6 We implemented and tested various improvements aimed at i) converting German texts to the new orthographic conventions; ii) performing a new tokenization for German; iii) normalizing lexical redundancy with the help of POS tagging and morphological analysis; iv) splitting German compound words with frequency based algorithm and; v) reducing ***** singletons ***** and out-of-vocabulary words. | ||
| N19-1176 A second distinctive feature is its rich annotation scheme, covering ***** singletons *****, expletives, and split-antecedent plurals. | ||
| 2020.fnp-1.33 The parser uses a similarity measure (Generalized Dice Coefficient) between listed terms and unlisted term candidates to (i) determine term status, (ii) serve putative terms to the parser, (iii) decrease parsing complexity by glomming multi-tokens as lexical ***** singletons *****, and (iv) automatically augment the terminology after parsing of an utterance completes. | ||
| SI | 11 | |
| 2020.semeval-1.240 Our system ranked 20th out of 36 teams with 0.398 F1 in the ***** SI ***** task and 14th out of 31 teams with 0.556 F1 in the TC task. | ||
| 2020.semeval-1.228 Finally, the ensemble model was ranked 1st amongst 35 teams for ***** SI ***** and 3rd amongst 31 teams for TC. | ||
| 2020.lrec-1.293 The related experiment results indicate that the proposed ***** SI ***** can improve the performance of the Chinese Pre-trained models significantly. | ||
| 2021.iwslt-1.27 A portion of the corpus contains ***** SI ***** data from three interpreters with different amounts of experience. | ||
| 1998.amta-papers.19 I conclude by noting further questions which must be answered before we can fully understand ***** SI *****, and how it might help MT | ||
| parallelizable | 11 | |
| 2021.wnut-1.46 Recently, per-word classification of correction edits has proven an efficient, ***** parallelizable ***** alternative to current encoder-decoder GEC systems. | ||
| D18-1408 Fully attention-based models have recently attracted enormous interest due to their highly ***** parallelizable ***** computation and significantly less training time. | ||
| N19-1127 MTSA 1) captures both pairwise (token2token) and global (source2token) dependencies by a novel compatibility function composed of dot-product and additive attentions, 2) uses a tensor to represent the feature-wise alignment scores for better expressive power but only requires ***** parallelizable ***** matrix multiplications, and 3) combines multi-head with multi-dimensional attentions, and applies a distinct positional mask to each head (subspace), so the memory and computation can be distributed to multiple heads, each with sequential information encoded independently. | ||
| 2020.emnlp-main.485 In this paper, we propose to employ a ***** parallelizable ***** approximate variational inference algorithm for the CRF model | ||
| 2021.emnlp-main.260 Multi-head self-attention recently attracts enormous interest owing to its specialized functions, significant ***** parallelizable ***** computation, and flexible extensibility. | ||
| Catalan | 11 | |
| L10-1600 We report on a series of corpus-based experiments run with Linguistica in Romance languages (***** Catalan *****, French, Italian, Portuguese, and Spanish), Germanic languages (Dutch, English and German), and Slavic language Polish. | ||
| L06-1028 This paper describes an acceptance test procedure for evaluating a spoken language translation system between ***** Catalan ***** and Spanish. | ||
| N18-2022 We also find that ***** Catalan ***** is used more often in referendum-related discourse than in other contexts, contrary to prior findings on language variation. | ||
| 2020.wmt-1.50 The second edition of this shared task featured parallel data from pairs/groups of similar languages from three different language families: Indo-Aryan languages (Hindi and Marathi), Romance languages (***** Catalan *****, Portuguese, and Spanish), and South Slavic Languages (Croatian, Serbian, and Slovene) | ||
| L06-1423 This paper describes a joint initiative of the Catalan and Spanish Government to produce Language Resources for the ***** Catalan ***** language. | ||
| Customer | 11 | |
| L16-1319 On the overall only 4% of the observed words are misspelled but 26% of the messages contain at least one erroneous word (rising to 40% when focused on ***** Customer ***** messages). | ||
| I17-4031 Our empirical analysis shows that our models perform well in all the four languages on the setups of IJCNLP Shared Task on ***** Customer ***** Feedback Analysis. | ||
| I17-4004 Shared Task on ***** Customer ***** Feedback Analysis. | ||
| I17-4026 The IJCNLP 2017 shared task on ***** Customer ***** Feedback Analysis focuses on classifying customer feedback into one of a predefined set of categories or classes | ||
| 2021.hcinlp-1.9 ***** Customer ***** reviews are useful in providing an indirect, secondhand experience of a product. | ||
| multilingual lexical | 11 | |
| L10-1484 Mutual hyperlinks among these databases and the bilingual search mode make it easy to compare semantic structures of corresponding lexical units between these languages, and it could be useful for building ***** multilingual lexical ***** resources. | ||
| L14-1282 This paper discusses the multiple approaches to collaboration that the Kamusi Project is employing in the creation of a massively ***** multilingual lexical ***** resource. | ||
| 2021.gwc-1.4 We present novel methods for this task that leverage information from ***** multilingual lexical ***** resources. | ||
| L08-1416 In particular, the paper focuses on i) lexical specification and data categories relevant for building ***** multilingual lexical ***** resources for Asian languages; ii) a core upper-layer ontology needed for ensuring multilingual interoperability and iii) the evaluation platform used to test the entire architectural framework. | ||
| 2020.cl-2.3 We present LESSLEX, a novel ***** multilingual lexical ***** resource | ||
| Semantic annotation | 11 | |
| L16-1699 ***** Semantic annotation ***** on the English section was performed manually; for the annotation in Italian, Spanish, and (partially) Dutch, a procedure was devised to automatically project the annotations on the English texts onto the translated texts, based on the manual alignment of the annotated elements; this enabled us not only to speed up the annotation process but also provided cross-lingual coreference. | ||
| 2021.isa-1.1 The novelty of this scheme is the harmonization of parts 1, 4 and 9 of the ISO 24617 Language resource management - ***** Semantic annotation ***** framework. | ||
| L08-1164 ***** Semantic annotation ***** of text requires the dynamic merging of linguistically structured information and a world model, usually represented as a domain-specific ontology | ||
| 2020.coling-main.422 ***** Semantic annotation ***** tasks contain ambiguity and vagueness and require varying degrees of world knowledge. | ||
| L12-1296 This paper summarizes the latest, final version of ISO standard 24617-2 ***** Semantic annotation ***** framework, Part 2: Dialogue acts. | ||
| constructed | 11 | |
| 2020.lrec-1.762 More specifically, we propose a modified version of RankClus algorithm to extract trends from the ***** constructed ***** tweets graph. | ||
| 2021.naacl-demos.8 We then exploit the ***** constructed ***** multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. | ||
| 2021.emnlp-main.16 After that, we construct a logic-level graph to capture the logical relations between entities and functions in the retrieved evidence, and design a graph-based verification network to perform logic-level graph-based reasoning based on the ***** constructed ***** graph to classify the final entailment relation. | ||
| C18-2004 T-Know is a knowledge service system based on the ***** constructed ***** knowledge graph of Traditional Chinese Medicine (TCM). | ||
| L10-1477 Such aspects as processing large volumes of data, asynchronous mode of processing and scalability of the architecture to large number of users got especial attention in the ***** constructed ***** prototype of the Web Service for morpho-syntactic processing of Polish called TaKIPI-WS (http://plwordnet.pwr.wroc.pl/clarin/ws/takipi/) | ||
| incorporate | 11 | |
| 2020.aacl-main.59 We present two content-inducing approaches to effectively ***** incorporate ***** this additional information. | ||
| 2021.mmsr-1.5 Our visualisations of masked self-attention demonstrate that (i) it can learn general linguistic knowledge of the textual input, and (ii) its attention patterns ***** incorporate ***** artefacts from visual modality even though it has never accessed it directly. | ||
| 2020.emnlp-main.147 There are two challenges for such systems: one is how to effectively ***** incorporate ***** external knowledge bases (KBs) into the learning framework; the other is how to accurately capture the semantics of dialogue history. | ||
| 2021.naacl-main.368 Current approaches ***** incorporate ***** the strengths of structured knowledge and unstructured text, assuming text corpora is semi-structured. | ||
| 2020.emnlp-main.444 Thus they cannot ***** incorporate ***** visual information when encoding plain text alone | ||
| epidemiological | 11 | |
| 2021.smm4h-1.1 Prior work on CoViD-19 NPI sentiment analysis by the ***** epidemiological ***** community has proceeded without a method for properly attributing sentiment changes to events, an ability to distinguish the influence of various events across time, a coherent model for predicting the public's opinion of future events of the same sort, nor even a means of conducting significance tests. | ||
| 2021.ranlp-1.138 Our idea is to propose a system that uses an ontology which includes information in different languages and covers specific ***** epidemiological ***** concepts, it is also based on the multilingual open information extraction for the relation extraction step to reduce the expert intervention and to restrict the content for each text. | ||
| W18-5908 Through a semi-automatic analysis of tweets, we show that Twitter users not only express Medication Non-Adherence (MNA) in social media but also their reasons for not complying; further research is necessary to fully extract automatically and analyze this information, in order to facilitate the use of this data in ***** epidemiological ***** studies. | ||
| 2020.coling-main.543 Our findings indicate that the performance of the models based on fine-tuned language models exceeds by more than 50% the chosen baseline models that include a specialized ***** epidemiological ***** news surveillance system and several machine learning models. | ||
| 2020.multilingualbio-1.6 The tool automatically collects news via customised multilingual queries, classifies them and extracts ***** epidemiological ***** information | ||
| created | 11 | |
| N18-2046 Our analysis shows that the ***** created ***** systems are closer to reaching human-level performance than any other GEC system reported so far. | ||
| L12-1408 We describe the creation of in-domain parallel and monolingual corpora, the development of a domain specific translation system with the ***** created ***** resources, and its adaptation using monolingual resources only. | ||
| R19-1084 To better judge our choice of creating an n-class word model, we compared the ***** created ***** model with the 3-gram type model on the same test corpus of evaluation. | ||
| 2021.eacl-main.22 Similarly, the contribution of salience versus diversity components on the ***** created ***** summary is not studied well. | ||
| L10-1207 This example-based approach not only produces draft transcriptions that just need to be corrected instead of ***** created ***** from scratch but also provides a validation mechanism for ensuring consistency within the corpus | ||
| incomplete | 11 | |
| 2021.naacl-main.416 However, existing report generation systems, despite achieving high performances on natural language generation metrics such as CIDEr or BLEU, still suffer from ***** incomplete ***** and inconsistent generations. | ||
| 2021.naacl-main.69 We argue that this setting does not match human informative seeking behavior and leads to ***** incomplete ***** and uninformative extraction results. | ||
| 2021.naacl-industry.34 One issue with distant supervision is that it leads to ***** incomplete ***** training annotation due to missing attribute values while matching. | ||
| C18-1183 However, this kind of auto-generated data suffers from two main problems: ***** incomplete ***** and noisy annotations, which affect the performance of NER models. | ||
| L10-1398 Even if the classifier detected all of the task ***** incomplete ***** dialog correctly, our proposed method achieved the false detection rate of only 6% | ||
| tectogrammatical | 11 | |
| L04-1194 The annotation of the Prague Dependency Treebank (PDT) is conceived of as a multilayered scenario that comprises also dependency representations (***** tectogrammatical ***** tree structures, TGTS's) of the underlying structure of the sentences. | ||
| L10-1342 This paper investigates the mapping between two semantic formalisms, namely the ***** tectogrammatical ***** layer of the Prague Dependency Treebank 2.0 (PDT) and (Robust) Minimal Recursion Semantics ((R)MRS). | ||
| L08-1172 The proposed valency lexicon will be exploited in particular during further ***** tectogrammatical ***** annotations of PADT and might serve for enriching the expected second edition of the corpus-based Arabic-Czech Dictionary. | ||
| L10-1526 The annotation is carried out on dependency trees (on the ***** tectogrammatical ***** layer), this approach is quite novel and it brings us some advantages when interpreting the syntactic structure of the discourse units. | ||
| L10-1266 At first, the basic principles of the treebank annotation project are introduced (division to three layers: morphological, analytical and ***** tectogrammatical *****) | ||
| bibliographic | 11 | |
| L16-1395 To better connect to the library world, and to allow librarians to enter metadata for linguistic resources into their catalogues, a crosswalk from CMDI-based formats to ***** bibliographic ***** standards is required. | ||
| 2021.wanlp-1.23 In this paper, we present the first reported results on the task of automatic Romanization of undiacritized Arabic ***** bibliographic ***** entries. | ||
| 2020.acl-main.447 The corpus consists of rich metadata, paper abstracts, resolved ***** bibliographic ***** references, as well as structured full text for 8.1M open access papers. | ||
| L16-1576 The previous description is one of BILBO's features, which is an open source software for automatic annotation of ***** bibliographic ***** reference. | ||
| P17-2065 Compositor attribution, the clustering of pages in a historical printed document by the individual who set the type, is a ***** bibliographic ***** task that relies on analysis of orthographic variation and inspection of visual details of the printed page | ||
| read | 11 | |
| L16-1121 It contains 8.2 hours of ***** read ***** speech based on phonetically balanced sentences, commands, and digits. | ||
| L16-1316 This paper investigates the behavior of an automatic phone-based anomaly detection system when applied on ***** read ***** and spontaneous French dysarthric speech. | ||
| L14-1341 It contains approximately 1900 minutes of (***** read ***** and spontaneous) speech produced by 38 speakers. | ||
| 2020.lrec-1.782 It is dramatically different from ***** read ***** speech, where the words are authored as text before they are spoken. | ||
| L04-1176 Utterances will be recorded directly from calls made either from fixed or cellular telephones and are composed by ***** read ***** text and answers to specific questions | ||
| substitutions | 11 | |
| L08-1384 In addition to gauging performance at a finer level of granularity, BLEU+ also allows the computation of various upper bound oracle scores: comparing all tokens considering only the roots allows us to get an upper bound when all errors due to morphological structure are fixed, while comparing tokens in an error-tolerant way considering minor morpheme edit operations, allows us to get a (more realistic) upper bound when tokens that differ in morpheme insertions/deletions and ***** substitutions ***** are fixed. | ||
| 2020.winlp-1.6 As a result, SIMPLEX-PB 2.0 features much more reliable and numerous candidate ***** substitutions ***** to complex words, as well as word complexity rankings produced by a group of underprivileged children. | ||
| 2021.acl-long.229 When the human rationales are not available, we propose exploiting unsupervised generated rationales as ***** substitutions *****. | ||
| I17-1030 Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or ***** substitutions *****, on existing data. | ||
| 2020.loresmt-1.11 Finally, we analyze performance differences between the LSTM and Transformer encoders when using a Transformer decoder and find that the Transformer encoder is better able to handle insertions and ***** substitutions ***** when transliterating | ||
| partition | 11 | |
| 2021.trustnlp-1.5 ing Problem (CPP), which is an Integer Program (IP) to formulate ER as a graph ***** partition *****ing | ||
| 2000.iwpt-1.26 In this paper, we report on significant progress, i.e., (1) developing guidelines for the grammar ***** partition ***** through a set of heuristics, (2) devising a new mix-strategy composition algorithms for any rule-based grammar ***** partition ***** in a lattice framework, and 3) initial but encouraging parsing results for Chinese and English queries from an Air Travel Information System (ATIS) corpus. | ||
| 2021.emnlp-main.17 We propose a ***** partition ***** filter network to model two-way interaction between tasks properly, where feature encoding is decomposed into two steps: ***** partition ***** and filter. | ||
| L12-1027 In this paper we discuss the data collection and parallel corpus compilation for training SMT systems, which includes several procedures such as data ***** partition *****, conversion, formatting, normalization and alignment | ||
| D18-1405 Noise Contrastive Estimation (NCE) is a powerful parameter estimation method for log-linear models, which avoids calculation of the ***** partition ***** function or its derivatives at each training step, a computationally demanding step in many cases. | ||
| embodied | 11 | |
| L12-1120 We developed a dialogue-based tutoring system for teaching English to Japanese students and plan to transfer the current software tutoring agent into an ***** embodied ***** robot in the hope that the robot will enrich conversation by allowing more natural interactions in small group learning situations. | ||
| L16-1552 We present a corpus of 44 human-agent verbal and gestural story retellings designed to explore whether humans would gesturally entrain to an ***** embodied ***** intelligent virtual agent. | ||
| W18-5011 In this paper, we introduce a computational model for speech overlap resolution in ***** embodied ***** artificial agents. | ||
| 2020.emnlp-main.356 The size, scope and detail of RxR dramatically expands the frontier for research on ***** embodied ***** language agents in photorealistic simulated environments | ||
| 2020.findings-emnlp.348 For ***** embodied ***** agents, navigation is an important ability but not an isolated goal. | ||
| observational | 11 | |
| L12-1441 Short spans account for canonical entity mentions (e.g., standardized disease names), while long spans cover descriptive text snippets which contain entity-specific elaborations (e.g., anatomical locations, ***** observational ***** details, etc.). | ||
| 2021.cinlp-1.8 We use propensity score stratification, a causal inference method for ***** observational ***** data, and estimate whether the amount of comments —as a measure of social support— increases or decreases the likelihood of posting again on SW. | ||
| 2020.emnlp-main.590 Experiments across three datasets show that our method improves the generalization ability of models under limited ***** observational ***** examples. | ||
| 2021.naacl-main.323 We consider the problem of using ***** observational ***** data to estimate the causal effects of linguistic properties. | ||
| 2020.findings-emnlp.11 Automatic, human ***** observational *****, and interactive evaluation shows that our method is able to select knowledge more accurately and generate more informative responses, significantly outperforming the state-of-the-art baselines | ||
| font | 11 | |
| 2020.semeval-1.214 Text emphasis in visual media is generally done by using different colors, backgrounds, or ***** font ***** for the text; it helps in conveying the actual meaning of the message to the readers. | ||
| 2020.acl-main.162 We seek to extend these works by examining whether or not document level predictions are effective, given additional information such as subject matter, ***** font ***** characteristics, and readability metrics. | ||
| L14-1122 These resources were combined to fulfil the requirements of a well-tested statistical parameter synthesis model, leading to an intelligible voice ***** font *****. | ||
| 2021.emnlp-main.244 We evaluate on the task of ***** font ***** reconstruction over various datasets representing character types of many languages, and compare favorably to modern style transfer systems according to both automatic and manually-evaluated metrics. | ||
| 2020.acl-main.721 In many documents, such as semi-structured webpages, textual semantics are augmented with additional information conveyed using visual elements including layout, ***** font ***** size, and color. | ||
| Intelligent | 11 | |
| W19-4442 We believe that our system can help medical students grasp the curriculum better, within classroom as well as in ***** Intelligent ***** Tutoring Systems (ITS) settings. | ||
| 2020.sigdial-1.11 We investigate differences in user communication with live chat agents versus a commercial ***** Intelligent ***** Virtual Agent (IVA) | ||
| 2020.acl-main.767 ***** Intelligent ***** features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks. | ||
| 2021.sigdial-1.37 ***** Intelligent ***** agents that are confronted with novel concepts in situated environments will need to ask their human teammates questions to learn about the physical world. | ||
| P18-2102 ***** Intelligent ***** systems require common sense, but automatically extracting this knowledge from text can be difficult. | ||
| TMU | 11 | |
| 2020.wat-1.7 We introduce our ***** TMU ***** system submitted to the Japanese-English Multimodal Task (constrained) for WAT 2020 (Nakazawa et al., 2020). | ||
| 2020.ngt-1.15 We introduce our ***** TMU ***** system that is submitted to The 4th Workshop on Neural Generation and Translation (WNGT2020) to English-to-Japanese (En→Ja) track on Simultaneous Translation And Paraphrase for Language Education (STAPLE) shared task. | ||
| 2020.wat-1.15 In this paper, we describe our ***** TMU ***** neural machine translation (NMT) system submitted for the Patent task (Korean→Japanese) of the 7th Workshop on Asian Translation (WAT 2020, Nakazawa et al., 2020). | ||
| W18-0544 We introduce the ***** TMU ***** systems for the second language acquisition modeling shared task 2018 (Settles et al., 2018) | ||
| W18-0521 We introduce the ***** TMU ***** systems for the Complex Word Identification (CWI) Shared Task 2018. | ||
| exploratory | 11 | |
| D17-2004 While experimental evidence that they are indeed helpful exists for some of them, it is largely unknown which type of graph is most helpful for a specific ***** exploratory ***** task. | ||
| D19-6309 We describe our ***** exploratory ***** system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on serialized trees. | ||
| S19-2202 To determine the essential features, we were aided by ***** exploratory ***** data analysis and visualizations. | ||
| W17-3102 This paper is largely ***** exploratory ***** in nature to lay the groundwork for subsequent research in mental health, rather than optimizing a particular text classification task | ||
| L14-1168 We describe an under-studied problem in language resource management: that of providing automatic assistance to annotators working in ***** exploratory ***** settings. | ||
| hateful | 11 | |
| 2020.restup-1.1 The target of these bots were pro-independence influencers that were sent negative, emotional and aggressive ***** hateful ***** tweets with hashtags such as #sonunesbesties (i.e. #theyareanimals). | ||
| 2021.acl-long.210 On social media platforms, ***** hateful ***** and offensive language negatively impact the mental well-being of users and the participation of people from diverse backgrounds. | ||
| 2019.jeptalnrecital-court.21 We focus in particular on ***** hateful ***** messages towards two different targets (immigrants and women) in English tweets, as well as sexist messages in both English and French. | ||
| S19-2007 The task is organized in two related classification subtasks: a main binary subtask for detecting the presence of hate speech, and a finer-grained one devoted to identifying further features in ***** hateful ***** contents such as the aggressive attitude and the target harassed, to distinguish if the incitement is against an individual rather than a group. | ||
| 2021.wassa-1.18 We investigate in this paper if this manifests also in online communication of the supporters of the candidates Biden and Trump, by uttering ***** hateful ***** and offensive communication | ||
| Emphasis | 11 | |
| 2020.semeval-1.214 ***** Emphasis ***** selection is the task of choosing candidate words for emphasis, it helps in automatically designing posters and other media contents with written text. | ||
| W19-8631 ***** Emphasis ***** will be placed on adapting NLG methodologies to the political domain, which entails special attention to affect, discursive variety, and rhetorical strategies that align a speaker with their interlocutor, even in cases of policy disagreement | ||
| 2020.semeval-1.190 This paper describes the system designed by ERNIE Team which achieved the first place in SemEval-2020 Task 10: ***** Emphasis ***** Selection For Written Text in Visual Media. | ||
| 2021.ranlp-1.175 ***** Emphasis ***** Selection is a newly proposed task which focuses on choosing words for emphasis in short sentences. | ||
| 2020.semeval-1.216 This paper shows our system for SemEval-2020 task 10, ***** Emphasis ***** Selection for Written Text in Visual Media. | ||
| Computer | 11 | |
| W16-4914 We present a novel approach to ***** Computer ***** Assisted Language Learning (CALL), using deep syntactic parsers and semantic based machine translation (MT) in diagnosing and providing explicit feedback on language learners' errors. | ||
| N19-5001 Adversarial learning is a game-theoretic learning paradigm, which has achieved huge successes in the field of ***** Computer ***** Vision recently. | ||
| W19-2908 This paper presents the first results of a multidisciplinary project, the Evolex project, gathering researchers in Psycholinguistics, Neuropsychology, ***** Computer ***** Science, Natural Language Processing and Linguistics. | ||
| J78-3001 ACL: Minutes Of the 16th Annual Business Meeting; ACL Secretary-Treasurer's Report; ACL Officers For 1979; ACL Officers 1963-1979; NSF: Support for Computational Linguistics (Paul G. Chapin); News: Short Notes; News: ARIST Reprint Request (Martha E. Williams); News: Summer Linguistics at Texas; PhD Programs in Computational Linguistics; Journal: Computational Linguistics and ***** Computer ***** Languages (T. Frey; T. Vamos); Journal: Discourse Processes (Roy D. Freedle); Book Notices (Mel'cuk R. Ravic); Yale AI Project Research Reports Available; Summary of Research on Computational Aspects of Evolution Theories (Raymond D. Gumb); Taxonomy: Information Sciences (Editors of Information Systems); Machine Aids to Translation: A Concise State of the Art Bibliography (Wayne Zachary); | ||
| L10-1638 This paper presents the results of a terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of ***** Computer ***** Science. | ||
| linguistic cues | 11 | |
| 2020.cl-2.4 We show that our tests can be used to explore word embeddings or black-box neural models for ***** linguistic cues ***** in a multilingual setting. | ||
| 2020.nuse-1.2 Toward that goal, in this work, we present a method to evaluate the quality of a screenplay based on ***** linguistic cues *****. | ||
| W17-8005 We assume that unknown words with internal structure (affixed words or compounds) can provide speakers with ***** linguistic cues ***** as for their meaning, and thus help their decoding and understanding. | ||
| P17-1093 Implicit discourse relation classification is of great challenge due to the lack of connectives as strong ***** linguistic cues *****, which motivates the use of annotated implicit connectives to improve the recognition. | ||
| L10-1163 Four of them that are possibly related to opinions are also annotated in the constructed corpus to provide the ***** linguistic cues *****. | ||
| verification | 11 | |
| 2020.coling-main.165 However, the challenging problem of fake news detection has not benefited from the improvement of fact ***** verification ***** models, which is closely related to fake news detection. | ||
| D19-1258 The system is evaluated on both fact ***** verification ***** and open-domain multihop QA, achieving state-of-the-art results on the leaderboard test sets of both FEVER and HOTPOTQA. | ||
| 2021.acl-short.51 This work explores a framework for fact ***** verification ***** that leverages pretrained sequence-to-sequence transformer models for sentence selection and label prediction, two key sub-tasks in fact ***** verification *****. | ||
| 2020.findings-emnlp.216 We come up with SciKGAT to combine the advantages of open-domain literature search, state-of-the-art fact ***** verification ***** systems and in-domain medical knowledge through language modeling. | ||
| N19-2017 Goals of use-case reviews and analyses include their correctness, completeness, detection of ambiguities, prototyping, ***** verification *****, test case generation and traceability. | ||
| extracting keyphrases | 11 | |
| W18-2304 RAKE and CRF, on the task of ***** extracting keyphrases ***** from Indonesian health forum posts. | ||
| 2021.sdp-1.6 Automatically ***** extracting keyphrases ***** from scholarly documents leads to a valuable concise representation that humans can understand and machines can process for tasks, such as information retrieval, article clustering and article classification. | ||
| N19-1292 Besides ***** extracting keyphrases *****, the output of the extractive model is also employed to rectify the copy probability distribution of the generative model, such that the generative model can better identify important contents from the given document. | ||
| P17-1054 Empirical analysis on six datasets demonstrates that our proposed model not only achieves a significant performance boost on ***** extracting keyphrases ***** that appear in the source text, but also can generate absent keyphrases based on the semantic meaning of the text. | ||
| 2020.coling-main.469 Given the huge rate at which scientific papers are published today, it is important to have effective ways of automatically ***** extracting keyphrases ***** from a research paper. | ||
| parsing algorithms | 11 | |
| 1993.iwpt-1.13 Advantages of the bunch concept are illustrated by using it in descriptions of a formal semantics for context-free grammars and of functional ***** parsing algorithms *****. | ||
| 2021.cmcl-1.3 CCG has well-defined incremental ***** parsing algorithms *****, surface compositional semantics, and can explain long-range dependencies as well as complicated cases of coordination. | ||
| 1991.iwpt-1.7 The focus of this paper is investigation of linguistic data base design in conjugation with ***** parsing algorithms *****. | ||
| 2000.iwpt-1.24 We propose an algebraic method for the design of tabular ***** parsing algorithms ***** which uses parsing schemata [7]. | ||
| P19-1230 Then two ***** parsing algorithms ***** are respectively proposed for two converted tree representations, division span and joint span. | ||
| semantic matching | 11 | |
| 2021.ranlp-srw.25 While many recent papers focus on ***** semantic matching ***** capabilities of TMs, this planned study will address how these tools perform when dealing with longer segments and whether this could be a cause of lower match scores. | ||
| 2020.aacl-main.74 Considering the problem of information ambiguity and incompleteness for short text, two kinds of knowledge, factual knowledge graph and conceptual knowledge graph, are introduced to provide additional knowledge for the ***** semantic matching ***** between candidate entity and mention context. | ||
| 2021.emnlp-main.78 The ***** semantic matching ***** capabilities of neural information retrieval can ameliorate synonymy and polysemy problems of symbolic approaches. | ||
| 2021.emnlp-main.86 This is a many-to-many ***** semantic matching ***** task because both contexts and personas in SPD are composed of multiple sentences. | ||
| D19-1540 On the other hand, many NLP problems, such as question answering and paraphrase identification, can be considered variants of ***** semantic matching *****, which is to measure the semantic distance between two pieces of short texts. | ||
| contextual representations | 11 | |
| D18-1005 Our system features the use of domain-specific resources automatically derived from a large unlabeled corpus, and ***** contextual representations ***** of the emotional and semantic content of the user's recent tweets as well as their interactions with other users. | ||
| 2021.naacl-main.108 However, ***** contextual representations ***** from pre-trained models contain entangled semantic and syntactic information, and therefore cannot be directly used to derive useful semantic sentence embeddings for some tasks. | ||
| P19-1485 We analyze the factors that contribute to generalization, and show that training on a source RC dataset and transferring to a target dataset substantially improves performance, even in the presence of powerful ***** contextual representations ***** from BERT (Devlin et al., 2019). | ||
| 2020.semeval-1.238 Our model has the added advantage that it combines the power of ***** contextual representations ***** from BERT with simple span-based and article-based global features. | ||
| 2020.semeval-1.221 The system aims to learn the emphasis selection distribution using ***** contextual representations ***** extracted from pre-trained language models and a two-staged ranking model. | ||
| sentence extraction | 11 | |
| P18-1188 We propose to use external information to improve document modeling for problems that can be framed as ***** sentence extraction *****. | ||
| I17-2060 We explore a ***** sentence extraction ***** framework based on diversified lexical chains to capture coherence and richness. | ||
| D19-5729 For BioNLP-OST 2019, we introduced a new mental health informatics task called “RDoC Task”, which is composed of two subtasks: information retrieval and ***** sentence extraction ***** through National Institutes of Mental Health's Research Domain Criteria framework. | ||
| L04-1237 Multi-document summaries produced via ***** sentence extraction ***** often suffer from a number of cohesion problems, including dangling anaphora, sudden shifts in topic and incorrect or awkward chronological ordering. | ||
| L06-1300 A centroid-based ***** sentence extraction ***** system has been developed which decides the content of the summary using texts in different languages and uses sentences from English sources alone to create the final output. | ||
| hierarchical structure | 11 | |
| D19-1660 Therefore, we propose a method that can consider the ***** hierarchical structure ***** of labels and label texts themselves. | ||
| 2020.coling-main.129 Given that CFLs are believed to capture important phenomena such as ***** hierarchical structure ***** in natural languages, this discrepancy in performance calls for an explanation. | ||
| W17-4509 We found flat and ***** hierarchical structure *****s of two levels plus the root offer stable centroid models, but ***** hierarchical structure *****s of three levels plus the root didn't seem stable enough for use in hierarchical summarization. | ||
| 2020.emnlp-main.161 HERO encodes multimodal inputs in a ***** hierarchical structure *****, where local context of a video frame is captured by a Cross-modal Transformer via multimodal fusion, and global video context is captured by a Temporal Transformer. | ||
| W19-6106 Reproducing high-quality ***** hierarchical structure *****s such as WordNet on a diachronic scale is a very difficult task. | ||
| technique | 11 | |
| D18-1207 We propose two ***** technique *****s to improve the level of abstraction of generated summaries. | ||
| L12-1283 This work is part of a project for MWE extraction and characterization using different ***** technique *****s aiming at measuring the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| L06-1015 That is, the retrieved documents from both systems are shown to the judges without any information about the search ***** technique *****s. | ||
| 2020.wanlp-1.32 In this paper, several ***** technique *****s with multiple algorithms are applied for Arabic dialects identification starting from removing noise till classification task using all Arabic countries as 21 classes. | ||
| 2020.gebnlp-1.6 Furthermore, we analyze the effect of the debiasing ***** technique *****s on downstream tasks which show a negligible impact on traditional embeddings and a 2% decrease in performance in contextualized embeddings. | ||
| entity representations | 11 | |
| D19-5302 In this study, we focus on relation prediction and propose a method to learn ***** entity representations ***** via a graph structure that uses Seen-entities, Unseen-entities and words as nodes created from the descriptions of all entities. | ||
| 2020.acl-main.578 Such strategies allow NMN to effectively construct matching-oriented ***** entity representations ***** while ignoring noisy neighbors that have a negative impact on the alignment task. | ||
| 2021.eacl-main.217 This paper explores learning rich self-supervised ***** entity representations ***** from large amounts of associated text. | ||
| 2020.findings-emnlp.54 In this paper, we present an approach to creating ***** entity representations ***** that are human readable and achieve high performance on entity-related tasks out of the box. | ||
| 2020.repl4nlp-1.24 We evaluate named ***** entity representations ***** of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. | ||
| translated texts | 11 | |
| 2020.acl-main.532 In this paper, we show that neural machine translation (NMT) systems trained on large back-translated data overfit some of the characteristics of machine-***** translated texts *****. | ||
| W17-7904 Special emphasis will be placed on the labelling of metadata that precisely describe the relations between ***** translated texts ***** and their originals. | ||
| 2020.eamt-1.39 Many studies have confirmed that ***** translated texts ***** exhibit different features than texts originally written in the given language. | ||
| 2021.triton-1.20 These observations can be related to the differences between second language learners of various levels and between translated and un***** translated texts *****. | ||
| 2001.mtsummit-papers.51 It is therefore surprising that only little attention – both in theory and in practice - has been given to the task of post-editing machine ***** translated texts *****. | ||
| lexical chains | 11 | |
| 2018.gwc-1.11 This paper explores the degree of cohesion among a document's words using ***** lexical chains ***** as a semantic representation of its meaning. | ||
| R17-1087 The results show that, on the one hand, the incorporation of many-relation ***** lexical chains ***** improves results, but on the other hand, unrestricted-length chains remain difficult to handle with respect to their huge quantity. | ||
| D19-5723 In this paper, we propose a novel approach for dependency graph construction based on ***** lexical chains *****, so one dependency graph can represent one or multiple sentences. | ||
| I17-2060 We explore a sentence extraction framework based on diversified ***** lexical chains ***** to capture coherence and richness. | ||
| L14-1665 The main difference of thematic chains in comparison with ***** lexical chains ***** is the basic principle of their construction: thematic chains are intended to model different participants (concrete or abstract) of the situation described in the analyzed texts, what means that elements of the same thematic chain cannot often co-occur in the same sentences of the texts under consideration. | ||
| text translation | 11 | |
| L12-1665 That IWSLT 2011 evaluation focused on the automatic translation of public talks and included tracks for speech recognition, speech translation, ***** text translation *****, and system combination. | ||
| 2011.iwslt-evaluation.1 This year, the IWSLT evaluation focused on the automatic translation of public talks and included tracks for speech recognition, speech translation, ***** text translation *****, and system combination. | ||
| 2013.iwslt-evaluation.16 We present the methods and techniques to achieve high translation quality for ***** text translation ***** of talks which are applied at RWTH Aachen University, the University of Edinburgh, Karlsruhe Institute of Technology, and Fondazione Bruno Kessler. | ||
| 2020.iwslt-1.34 Some translationese features tend to appear in simultaneous interpreting with higher frequency than in human ***** text translation *****, but the reasons for this are unclear. | ||
| 2020.aacl-main.58 We investigate how to adapt simultaneous ***** text translation ***** methods such as wait-k and monotonic multihead attention to end-to-end simultaneous speech translation by introducing a pre-decision module. | ||
| gated recurrent neural | 11 | |
| C16-1124 We present a model of visually-grounded language learning based on stacked ***** gated recurrent neural ***** networks which learns to predict visual features given an image description in the form of a sequence of phonemes. | ||
| C16-1014 In particular, given a document, the model learns sentence representations with a convolutional neural network, which are combined using a ***** gated recurrent neural ***** network with attention mechanism to model discourse information and yield a document vector. | ||
| C16-1231 In particular, we use a bi-directional ***** gated recurrent neural ***** network to capture syntactic and semantic information over tweets locally, and a pooling neural network to extract contextual features automatically from history tweets. | ||
| S17-2119 We first trained a ***** gated recurrent neural ***** network using pre-trained word embeddings, then we extracted features from GRU layer and input these features into support vector machine to fulfill both the classification and quantification subtasks. | ||
| C16-1085 We explore ***** gated recurrent neural ***** network model (GRU), and an ensemble of GRU model and maximum entropy language model (GRU-ME) to select the best preposition from 43 candidates for each test sentence. | ||
| statistical parser | 11 | |
| W18-6016 The Hebrew treebank (HTB), consisting of 6221 morpho-syntactically annotated newspaper sentences, has been the only resource for training and validating ***** statistical parser *****s and taggers for Hebrew, for almost two decades now. | ||
| L08-1022 Modern ***** statistical parser *****s are trained on large annotated corpora (treebanks). | ||
| L12-1415 Freely available ***** statistical parser *****s often require careful optimization to produce state-of-the-art results, which can be a non-trivial task especially for application developers who are not interested in parsing research for its own sake. | ||
| L08-1536 We compare and contrast the Morpheme-Based and Word-Based annotation strategies of pronominal clitics in Modern Hebrew and we show that the Word-Based strategy is more adequate for the purpose of training ***** statistical parser *****s as it provides a better PP-attachment disambiguation capacity and a better alignment with initial surface forms. | ||
| L14-1633 This is done by comparing the performance of a ***** statistical parser ***** (DeSR) trained on a simpler resource (the augmented version of the Merged Italian Dependency Treebank or MIDT+) and whose output was automatically converted to SD, with the results of the parser directly trained on ISDT. | ||
| natural language query | 11 | |
| 2021.acl-long.176 The goal of database question answering is to enable ***** natural language query *****ing of real-life relational databases in diverse application domains. | ||
| 2005.mtsummit-swtmt.2 In this paper we present a new architecture aiming to bring together the advantages of ***** natural language query *****ing and the power of semantic Web. | ||
| 2021.repl4nlp-1.24 We consider the conversational question answering settings, where a ***** natural language query *****, its context and its final answers are available at training. | ||
| L16-1113 In this paper we propose a system, by means of which we will develop a search engine able to process online documents, starting from a ***** natural language query *****, and to return information to users. | ||
| 2021.acl-long.442 Finding codes given ***** natural language query ***** is beneficial to the productivity of software developers. | ||
| dialogue processing | 11 | |
| W19-4109 The uncertainties of language and the complexity of dialogue contexts make accurate dialogue state tracking one of the more challenging aspects of ***** dialogue processing *****. | ||
| 2021.mmsr-1.8 We offer a fine-grained information state annotation scheme that follows directly from the Incremental Unit abstract model of ***** dialogue processing ***** when used within a multimodal, co-located, interactive setting. | ||
| L06-1393 Speech interfaces and ***** dialogue processing ***** abilities have promise for improving the utility of open-domain question answering (QA). We propose a novel method of resolving disambiguation problems arising in those speech- and dialogue-enhanced QA tasks. | ||
| 2020.coling-main.43 We present a multi-task learning framework to enable the training of one universal incremental ***** dialogue processing ***** model with four tasks of disfluency detection, language modelling, part-of-speech tagging and utterance segmentation in a simple deep recurrent setting. | ||
| W17-3509 We implement this generator in an incremental ***** dialogue processing ***** framework such that we can exploit an existing interface to incremental text-to-speech synthesis. | ||
| pivot language | 11 | |
| L16-1524 Based on the assumption, we propose a constraint-based bilingual lexicon induction for closely related languages by extending constraints and translation pair candidates from recent ***** pivot language ***** approach. | ||
| 2008.iwslt-papers.1 Translation with ***** pivot language *****s has recently gained attention as a means to circumvent the data bottleneck of statistical machine translation (SMT). | ||
| 2014.amta-researchers.24 We improve translation quality by adding data using ***** pivot language *****s and experimentally compare previously proposed triangulation design options. | ||
| 2010.iwslt-papers.12 The principal idea is to generate intermediate translations in several ***** pivot language *****s, translate them separately into the target language, and generate a consensus translation out of these using MT system combination techniques. | ||
| L08-1130 This paper proposes a method of increasing the size of a bilingual lexicon obtained from two other bilingual lexicons via a ***** pivot language *****. | ||
| programming language | 11 | |
| 2020.acl-main.538 Motivated by the intuition that developers usually retrieve resources on the web when writing code, we explore the effectiveness of incorporating two varieties of external knowledge into NL-to-code generation: automatically mined NL-code pairs from the online programming QA forum StackOverflow and ***** programming language ***** API documentation. | ||
| 2020.findings-emnlp.361 Code retrieval is a key task aiming to match natural and ***** programming language *****s. | ||
| 2021.nlp4prog-1.3 To this end, we release 345K datasets consisting of code modification and commit messages in six ***** programming language *****s (Python, PHP, Go, Java, JavaScript, and Ruby). | ||
| 1997.iwpt-1.24 Disambiguation methods for context-free grammars enable concise specification of ***** programming language *****s by ambiguous grammars. | ||
| 2021.naacl-main.211 Experiments on code summarization in the English language, code generation, and code translation in seven ***** programming language *****s show that PLBART outperforms or rivals state-of-the-art models. | ||
| discourse context | 11 | |
| E17-1062 This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the ***** discourse context ***** and variation. | ||
| D19-1568 Among these characteristics of persuasive arguments, prior work in NLP does not explicitly investigate the effect of the pragmatic and ***** discourse context ***** when determining argument quality. | ||
| L10-1397 This paper draws a distinction between ***** discourse context ***** ―other entities that have been mentioned in the dialogue― and visual context ―visually available objects near the intended referent. | ||
| 2020.aacl-main.66 We present two extensions to a state-of-theart joint model for event coreference resolution, which involve incorporating (1) a supervised topic model for improving trigger detection by providing global context, and (2) a preprocessing module that seeks to improve event coreference by discarding unlikely candidate antecedents of an event mention using ***** discourse context *****s computed based on salient entities. | ||
| 2020.acl-srw.17 As a result, these models may sufficiently account for ***** discourse context ***** of task-oriented but not social conversations. | ||
| lyrics | 11 | |
| 2020.lrec-1.105 The resource contains three types of data for the investigation and evaluation of quite distinct phenomena: TEI-compliant song ***** lyrics ***** as primary data, linguistically and literary motivated annotations, and extralinguistic metadata. | ||
| 2020.lrec-1.262 The creation of the resource is still ongoing: so far, the corpus contains 1.73M songs with ***** lyrics ***** (1.41M unique ***** lyrics *****) annotated at different levels with the output of the above mentioned methods. | ||
| N18-1015 Experimental results show that the proposed model generates fluent ***** lyrics ***** while maintaining the compatibility between boundaries of ***** lyrics ***** and melody structures. | ||
| L12-1425 We first describe the corpus, consisting of 100 popular songs, each of them including a music component, provided in the MIDI format, as well as a ***** lyrics ***** component, made available as raw text. | ||
| 2020.emnlp-demos.12 Recently, a variety of neural models have been proposed for ***** lyrics ***** generation. | ||
| information system | 11 | |
| U19-1017 The first step towards designing an ***** information system ***** is conceptual modelling where domain experts and knowledge engineers identify the necessary information together to build an ***** information system *****. | ||
| 1963.earlymt-1.18 Such an ***** information system ***** has been designed at the Linguistics Research Center of The University of Texas. | ||
| W19-0420 The embedding of words and documents in compact, semantically meaningful vector spaces is a crucial part of modern ***** information system *****s. | ||
| W17-2306 The goal of the BioASQ challenge is to engage researchers into creating cuttingedge biomedical ***** information system *****s. | ||
| 2000.iwpt-1.44 In an ***** information system ***** indexing can be accomplished by creating a citation based on context-free parses, and matching becomes a natural mechanism to extract patterns. | ||
| interface | 11 | |
| 2003.mtsummit-systems.4 Its Web-based ***** interface ***** and multi-user architecture enable a centralized and efficient work environment for local and geographically disbursed individual users and teams. | ||
| 1999.mtsummit-1.88 A multi-user, networkable application, Logos 8 allows Internet or Intranet use of its applications with client ***** interface *****s that communicate with dictionaries and translation servers through a common gateway. | ||
| L12-1309 The RIDIRE-CPI user-friendly ***** interface ***** is specifically intended for allowing collaborative work performance by users with low skills in web technology and text processing. | ||
| L14-1630 We indicate how different user profiles determined different crucial ***** interface ***** design options. | ||
| 2021.eacl-demos.14 The suite is made available through a web API and a web ***** interface ***** where users can enter text or upload files. | ||
| functions | 11 | |
| 2021.emnlp-main.643 Here, we introduce the application of balancing loss ***** functions ***** for multi-label text classification. | ||
| S19-1018 In doing so, it provides a reformalisation (in TTR) of enthymemes and topoi as networks rather than ***** functions *****, and information state update rules for conditionals. | ||
| 1963.earlymt-1.30 Formats and ***** functions ***** dealing with set-relations, part-whole and numeric relations, and left-toright spatial relations have been included in the system, which is being expanded to handle other types of relations. | ||
| 2020.acl-main.186 We extend local tree-based loss ***** functions ***** with terms that provide global supervision and show how to optimize them end-to-end. | ||
| 2021.eacl-main.117 We do this by showing how the concept of nucleus can be defined in the framework of Universal Dependencies and how we can use composition ***** functions ***** to make a transition-based dependency parser aware of this concept. | ||
| computational social science | 11 | |
| 2020.acl-main.51 The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and ***** computational social science *****. | ||
| P18-1066 The framework could be useful for machine translation applications and research in ***** computational social science *****. | ||
| 2021.emnlp-main.788 Understanding differences of viewpoints across corpora is a fundamental task for ***** computational social science *****s. | ||
| D19-1661 However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within ***** computational social science ***** and digital humanities. | ||
| W18-4512 We present a simple approach to the generation and labeling of extraction patterns for coding political event data, an important task in ***** computational social science *****. | ||
| hybrid system | 11 | |
| 2020.nlptea-1.14 We present a ***** hybrid system ***** that utilizes both detection and correction stages. | ||
| L10-1259 To illustrate the approach, we apply the proposed methodology to the Afazio RTE system (a ***** hybrid system ***** focusing on syntactic entailment) and show how it permits identifying the most likely sources of errors made by this system on a testsuite of 10 000 (non-)entailment pairs which is balanced in terms of (non-)entailment and in terms of syntactic annotations. | ||
| 2020.coling-main.459 Thus, we propose Hy-NLI, a ***** hybrid system ***** that learns to identify an NLI pair as linguistically challenging or not. | ||
| L14-1707 DisMo is a ***** hybrid system ***** that uses a combination of lexical resources, rules, and statistical models based on Conditional Random Fields (CRF). | ||
| 1994.bcs-1.16 In providing connections between the terms (lexical entries) and the knowledge base our approach will be compared to terminological knowledge bases (TKBs) which are ***** hybrid system *****s between concept-oriented term banks and knowledge bases. | ||
| multiword expression | 11 | |
| W17-1716 All ***** multiword expression *****s are a great challenge for natural language processing, but the verbal ones are particularly interesting for tasks such as parsing, as the verb is the central element in the syntactic organization of a sentence. | ||
| W17-1727 As ***** multiword expression *****s (MWEs) exhibit a range of idiosyncrasies, their automatic detection warrants the use of many different features. | ||
| L12-1517 Light verb constructions (LVCs), such as take a walk and make a decision, are a common subclass of ***** multiword expression *****s (MWEs), whose distinct syntactic and semantic properties call for a special treatment within a computational system. | ||
| P19-1316 The compositionality degree of ***** multiword expression *****s indicates to what extent the meaning of a phrase can be derived from the meaning of its constituents and their grammatical relations. | ||
| U18-1009 In this paper, we perform a comparative evaluation of off-the-shelf embedding models over the task of compositionality prediction of ***** multiword expression *****s(“MWEs”). | ||
| forums | 11 | |
| C16-1163 In this paper, we apply Long Short-Term Memory networks with an attention mechanism, which can select important parts of text for the task of similar question retrieval from community Question Answering (cQA) ***** forums *****. | ||
| 2021.louhi-1.3 In online ***** forums ***** focused on health and wellbeing, individuals tend to seek and give the following social support: emotional and informational support. | ||
| W19-3022 This research, motivated by the CLPsych 2019 shared task, developed neural network-based methods for analyzing posts in one or more Reddit ***** forums ***** to assess the subject's suicide risk. | ||
| W18-5109 We observe that the purpose of conversations in online ***** forums ***** tend to be more constructive and informative than those in Wikipedia page edit comments which are geared more towards adversarial interactions, and that this may explain the lower levels of abuse found in our forum data than in Wikipedia comments. | ||
| S19-2215 As online customer ***** forums ***** and product comparison sites increase their societal influence, users are actively expressing their opinions and posting their recommendations on their fellow customers online. | ||
| informal text | 11 | |
| N18-1151 Existing keyphrase extraction methods suffer from data sparsity problem when they are conducted on short and ***** informal text *****s, especially microblog messages. | ||
| 2021.emnlp-main.780 We instead propose a language-independent approach to build large datasets of pairs of ***** informal text *****s weakly similar, without manual human effort, exploiting Twitter's intrinsic powerful signals of relatedness: replies and quotes of tweets. | ||
| 2021.ltedi-1.19 This paper describes approaches to identify Hope Speech in short, ***** informal text *****s in English, Malayalam and Tamil using different machine learning techniques. | ||
| 2021.mtsummit-research.16 social media posts, comments, and reviews—has motivated the development of NLP applications tailored to these types of ***** informal text *****s. | ||
| L16-1087 Recently, due to the increasing popularity of social media, the necessity for extracting information from ***** informal text ***** types, such as microblog texts, has gained significant attention. | ||
| dense retrieval | 11 | |
| 2021.repl4nlp-1.17 Experiments on the MS MARCO passage and document ranking tasks and data from the TREC 2019 Deep Learning Track demonstrate that our approach helps models learn robust representations for ***** dense retrieval ***** effectively and efficiently. | ||
| 2021.emnlp-main.496 Open-domain question answering has exploded in popularity recently due to the success of ***** dense retrieval ***** models, which have surpassed sparse models using only a few supervised training examples. | ||
| 2021.mrqa-1.16 One such model, REALM, (Guu et al., 2020) is an end-to-end ***** dense retrieval ***** system that uses MLM based pretraining for improved downstream QA performance. | ||
| 2021.naacl-main.368 Building on ***** dense retrieval ***** methods, we propose a new multi-step retrieval approach (BeamDR) that iteratively forms an evidence chain through beam search in dense representations. | ||
| 2021.eacl-main.244 Across three open-domain QA datasets, our method consistently outperforms a strong ***** dense retrieval ***** baseline that uses 6 times more computation for training. | ||
| integer linear | 11 | |
| 2020.coling-main.418 We then introduce a total optimization method using ***** integer linear ***** programming to prevent span overlapping and obtain non-monotonic alignments. | ||
| E17-1108 We employ a structured perceptron, together with ***** integer linear ***** programming constraints for document-level inference during training and prediction to exploit relational properties of temporality, together with global learning of the relations at the document level. | ||
| D19-1398 However, it is nontrivial to make use of ***** integer linear ***** programming as a blackbox solver for RE. | ||
| Q15-1003 The algorithm tractably captures a majority of the structural constraints examined by prior work in this area, which has resorted to either approximate methods or off-the-shelf ***** integer linear ***** programming solvers. | ||
| P18-1212 Specifically, we formulate the joint problem as an ***** integer linear ***** programming (ILP) problem, enforcing constraints that are inherent in the nature of time and causality. | ||
| automatic question answering | 11 | |
| L10-1162 Creating more fine-grained annotated data than previously relevant document sets is important for evaluating individual components in ***** automatic question answering ***** systems. | ||
| R19-1049 We present a novel approach to ***** automatic question answering ***** that does not depend on the performance of an information retrieval (IR) system and does not require that the training data come from the same source as the questions. | ||
| C16-1185 This problem has recently attracted a lot of attention as it is an important sub-part of an ***** automatic question answering ***** system, which is currently in great demand. | ||
| 2020.acl-main.454 Next, we propose an ***** automatic question answering ***** (QA) based metric for faithfulness, FEQA, which leverages recent advances in reading comprehension. | ||
| P16-5007 On the other hand, many applications, including search engines, ads, ***** automatic question answering *****, online advertising, recommendation systems, etc., rely on short text understanding. | ||
| data annotation | 11 | |
| 2021.naacl-tutorials.6 In this tutorial, we present a portion of unique industry experience in efficient natural language ***** data annotation ***** via crowdsourcing shared by both leading researchers and engineers from Yandex. | ||
| 2021.naacl-main.269 However, most existing works assume clean ***** data annotation *****, while real-world scenarios typically involve a large amount of noises from a variety of sources (e.g., pseudo, weak, or distant annotations). | ||
| N19-1227 Our findings may provide important insights into structured ***** data annotation ***** schemes and could support progress in learning protocols for structured tasks. | ||
| C18-1059 This work addresses challenges arising from extracting entities from textual data, including the high cost of ***** data annotation *****, model accuracy, selecting appropriate evaluation criteria, and the overall quality of annotation. | ||
| D19-5006 Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets, or used proxys for ***** data annotation *****. | ||
| linear time | 11 | |
| 1995.iwpt-1.10 The chunking and raising actions can be done in ***** linear time *****. | ||
| D19-1074 In this study, we first investigate a novel capsule network with dynamic routing for ***** linear time ***** Neural Machine Translation (NMT), referred as CapsNMT. | ||
| 2020.emnlp-main.289 Accordingly, an end-to-end model is presented to process the input texts from left to right, always with ***** linear time ***** complexity, leading to a speed up. | ||
| 1991.iwpt-1.4 In this paper, we explain why the valid prefix property is expensive to maintain for TAGs and we introduce a predictive left to right parser for TAGs that does not maintain the valid prefix property but that achieves an O(n^6)-time worst case behavior, O(n^4)-time for unambiguous grammars and ***** linear time ***** for a large class of grammars. | ||
| 2021.acl-long.379 To scale up our approach, we also introduce an efficient pruning and growing algorithm to reduce the time complexity and enable encoding in ***** linear time *****. | ||
| sentence summarization | 11 | |
| D19-1301 Experiments on benchmark datasets show that, the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the abstractive ***** sentence summarization ***** task. | ||
| 2020.coling-main.497 We conduct experiments in two widely-used ***** sentence summarization ***** datasets and experimental results show that our model outperforms the state-of-the-art methods in both automatic evaluation scores and informativeness metrics. | ||
| P17-1101 We evaluate our model on the English Gigaword, DUC 2004 and MSR abstractive ***** sentence summarization ***** datasets. | ||
| 2020.coling-main.496 Experimental results on a public multimodal ***** sentence summarization ***** dataset demonstrate the advantage of our models over baselines. | ||
| C18-1121 In this paper, we investigate the ***** sentence summarization ***** task that produces a summary from a source sentence. | ||
| patients | 11 | |
| W19-3015 Speech samples were obtained from healthy controls and ***** patients ***** with a diagnosis of schizophrenia or schizoaffective disorder and different severity of positive formal thought disorder. | ||
| 2020.splu-1.6 Radiology reports contain important clinical information about ***** patients ***** which are often tied through spatial expressions. | ||
| N18-2110 More importantly, we next interpret what these neural models have learned about the linguistic characteristics of AD ***** patients *****, via analysis based on activation clustering and first-derivative saliency techniques. | ||
| 2020.findings-emnlp.336 Previous works were mainly based on either medical domain-specific knowledge, or ***** patients *****' prior diagnoses and clinical encounters. | ||
| 2020.coling-main.63 In a special task-oriented scenario, namely medical conversations between patients and doctors, the symptoms, diagnoses, and treatments could be highly important because the nature of such conversation is to find a medical solution to the problem proposed by the ***** patients *****. | ||
| document understanding | 11 | |
| D18-1476 Based on this representation, we present a generic ***** document understanding ***** pipeline for structured documents. | ||
| N19-2005 In VRDs, visual and layout information is critical for ***** document understanding *****, and texts in such documents cannot be serialized into the one-dimensional sequence without losing information. | ||
| 2020.textgraphs-1.3 Named entity recognition (NER) from visual documents, such as invoices, receipts or business cards, is a critical task for visual ***** document understanding *****. | ||
| 2020.emnlp-main.35 Abstractive document summarization is a comprehensive task including ***** document understanding ***** and summary generation, in which area Transformer-based models have achieved the state-of-the-art performance. | ||
| 2020.findings-emnlp.191 Understanding the relationship between figures and text is key to scientific ***** document understanding *****. | ||
| online learning | 11 | |
| Q13-1017 As a result, this training process is as efficient as existing ***** online learning ***** methods, and yet derives consistently better models, as evaluated on four benchmark NLP datasets for part-of-speech tagging, named-entity recognition and dependency parsing. | ||
| D17-1234 Since human teaching is expensive, we compared various teaching schemes answering the question how and when to teach, to economically utilize the teaching budget, so as to make the ***** online learning ***** process affordable. | ||
| 2020.lrec-1.612 In this work we introduce Aspect On, an interactive solution based on ***** online learning ***** that allows users to post-edit the aspect extraction with little effort. | ||
| P19-2028 We also demonstrate our approach with an application that uses ***** online learning *****. | ||
| 2020.bea-1.13 With the widespread adoption of the Next Generation Science Standards (NGSS), science teachers and ***** online learning ***** environments face the challenge of evaluating students' integration of different dimensions of science learning. | ||
| semantic search | 11 | |
| C18-1227 For this reason, we tackle the task of ***** semantic search *****es of FE dictionaries. | ||
| 2020.lrec-1.562 This dataset was created as there is a dire need for ***** semantic search ***** within archaeology, in order to allow archaeologists to find structured information in collections of Dutch excavation reports, currently totalling around 60,000 (658 million words) and growing rapidly. | ||
| L08-1548 This is considered the preparatory phase for the integration of a ***** semantic search ***** facility in Learning Management Systems. | ||
| 2021.emnlp-main.502 10% improvement upon baseline models on cross-lingual ***** semantic search *****. | ||
| L08-1281 Automatic Term recognition (ATR) is a fundamental processing step preceding more complex tasks such as ***** semantic search ***** and ontology learning. | ||
| design | 11 | |
| 2021.naacl-demos.12 It is ***** design *****ed to be a general-purpose tool with a wide variety of use cases. | ||
| C16-1154 This model ***** design ***** greatly mitigates the lack of data for the minor class. | ||
| 2020.trac-1.4 The contribution of this paper is the ***** design ***** of binary classification and regression-based approaches aiming to predict whether a comment is toxic or not. | ||
| L08-1482 The idea is that the time consuming ***** design ***** of such a tool can be avoided by using the provided architecture. | ||
| E17-4005 It employs a specially ***** design *****ed corpus of truthful and deceptive texts on the same topic from each respondent, N = 113. | ||
| sequence classification | 11 | |
| W19-3819 An ensemble of QA and BERT-based multiple choice and ***** sequence classification ***** models further improves the F1 (23.3% absolute improvement upon the baseline). | ||
| 2021.wmt-1.97 Our approach builds on cross-lingual pre-trained representations in a ***** sequence classification ***** model. | ||
| 2021.semeval-1.143 In order to assign class labels to the given memes, we opted for RoBERTa (A Robustly Optimized BERT Pretraining Approach) as a neural network architecture for token and ***** sequence classification *****. | ||
| 2020.udw-1.9 We evaluate the methods on morphological ***** sequence classification *****, the task of predicting grammatical features of a word. | ||
| C16-1260 Several tasks in argumentation mining and debating, question-answering, and natural language inference involve classifying a sequence in the context of another sequence (referred as bi-***** sequence classification *****). | ||
| word recognition | 11 | |
| I17-1019 Boundary features are widely used in traditional Chinese Word Segmentation (CWS) methods as they can utilize unlabeled data to help improve the Out-of-Vocabulary (OOV) ***** word recognition ***** performance. | ||
| W16-4107 Lexical complexity plays a central role in readability, particularly for dyslexic children and poor readers because of their slow and laborious decoding and ***** word recognition ***** skills. | ||
| P19-1561 Our ***** word recognition ***** models build upon the RNN semi-character architecture, introducing several new backoff strategies for handling rare and unseen words. | ||
| 2019.icon-1.21 The metric utilizes word frequency, orthography and morphology as the three factors affecting visual ***** word recognition ***** in Malayalam. | ||
| L14-1278 Our tests include medium vocabulary isolated ***** word recognition ***** and LVCSR. | ||
| expression | 11 | |
| D19-1170 In addition, an arithmetic ***** expression ***** reranking mechanism is proposed to rank ***** expression ***** candidates for further confirming the prediction. | ||
| W17-2341 We describe a method for representing time ***** expression *****s with single pseudo-tokens for CNNs. | ||
| C18-1156 Also, since the sarcastic nature and form of ***** expression ***** can vary from person to person, CASCADE utilizes user embeddings that encode stylometric and personality features of users. | ||
| L16-1543 However, domain experts proposed ***** expression *****s not extracted automatically. | ||
| 2021.semeval-1.44 There is currently a gap between the natural language ***** expression ***** of scholarly publications and their structured semantic content modeling to enable intelligent content search. | ||
| supervised and unsupervised | 11 | |
| W18-3108 We compare ***** supervised and unsupervised ***** methods to assign predefined categories at message level. | ||
| S17-2110 We propose two Arabic sentiment classification models implemented using ***** supervised and unsupervised ***** learning strategies. | ||
| 2020.emnlp-main.238 Our framework is a general framework that can incorporate any ***** supervised and unsupervised ***** BLI methods based on optimal transport. | ||
| 2020.emnlp-main.41 In experiments using five word alignment datasets from among Chinese, Japanese, German, Romanian, French, and English, we show that our proposed method significantly outperformed previous ***** supervised and unsupervised ***** word alignment methods without any bitexts for pretraining. | ||
| 2021.acl-long.532 Natural language processing has been used to create experimental tools to interpret privacy policies, but there has been a lack of large privacy policy corpora to facilitate the creation of large-scale semi-***** supervised and unsupervised ***** models to interpret and simplify privacy policies. | ||
| ordering | 11 | |
| 2010.amta-papers.25 However, this basic method for combining phrases is not sufficient for phrase re***** ordering *****. | ||
| 2013.iwslt-evaluation.24 Furthermore, we investigated different re***** ordering ***** models as well as an extended discriminative word lexicon. | ||
| 1993.iwpt-1.26 In this paper I discuss motivations and methods for predictive, Earley-style parsing of multidimensional languages when the relations involved do not necessarily yield an ***** ordering *****, e.g., when the relations are symmetric and/or nontransitive. | ||
| 2020.acl-main.22 Our work, inspired by pre-***** ordering ***** literature in machine translation, uses syntactic transformations to softly “reorder” the source sentence and guide our neural paraphrasing model. | ||
| L14-1464 In order to deal with separated particle verbs, we apply re-***** ordering ***** rules to the German part of the data. | ||
| teaching | 11 | |
| D17-2003 Case studies tend to be used in legal, business, and health education contexts, but less in the ***** teaching ***** and learning of linguistics. | ||
| 2021.teachingnlp-1.16 Introducing biomedical informatics (BMI) students to natural language processing (NLP) requires balancing technical depth with practical know-how to address application-focused needs. | ||
| 2021.teachingnlp-1.19 Students implement the core parts of the method, including text preprocessing, negative sampling, and gradient descent. | ||
| 2021.teachingnlp-1.9 Deep neural networks have revolutionized many fields, including Natural Language Processing. | ||
| L12-1337 In this paper, we focus on an application in which lexicon and ontology are used to generate ***** teaching ***** material. | ||
| local coherence | 11 | |
| 2021.nuse-1.9 We devise two theoretically grounded measures of reader question-answering entropy, the entropy of world coherence (EWC), and the entropy of transitional coherence (ETC), focusing on global and ***** local coherence *****, respectively. | ||
| 2020.coling-main.194 This paper follows the assumption and presents a method for scoring text clarity by utilizing ***** local coherence ***** between adjacent sentences. | ||
| D18-1464 We propose a ***** local coherence ***** model that captures the flow of what semantically connects adjacent sentences in a text. | ||
| N18-1024 We develop a neural model of ***** local coherence ***** that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the ***** local coherence ***** model with a state-of-the-art AES model. | ||
| P17-1121 We propose a *****local coherence***** model based on a convolutional neural network that operates over the entity grid representation of a text. | ||
| fully unsupervised | 11 | |
| K18-1051 Crucially, our method is ***** fully unsupervised *****, requiring only a bag-of-words representation of the objects as input. | ||
| 2021.mtsummit-research.24 While interesting and ***** fully unsupervised ***** settings are unrealistic; small amounts of bilingual data are usually available due to the existence of massively multilingual parallel corpora and/or linguists can create small amounts of parallel data. | ||
| 2013.iwslt-papers.15 We present the first known experiments incorporating unsupervised bilingual nonterminal category learning within end-to-end ***** fully unsupervised ***** transduction grammar induction using matched training and testing models. | ||
| 2021.nodalida-main.24 Our new method shows increased performance while remaining ***** fully unsupervised *****, with the added benefit of spelling normalisation. | ||
| D19-1449 Recent efforts in cross-lingual word embedding (CLWE) learning have predominantly focused on *****fully unsupervised***** approaches that project monolingual embeddings into a shared cross-lingual space without any cross-lingual signal. | ||
| parse tree | 11 | |
| 1998.amta-papers.25 All ***** parse tree *****s are converted to this format prior to semantic interpretation. | ||
| 2020.acl-main.591 In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth ***** parse tree *****s in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. | ||
| 2021.emnlp-main.317 The aspect and opinion words are expected to be closer along such tree structure compared to the standard dependency ***** parse tree *****. | ||
| 2020.lrec-1.158 We release corpus of high quality sentences and ***** parse tree *****s with these two types of labels on sentence level. | ||
| 2020.acl-main.300 Unsupervised constituency parsing aims to learn a constituency parser from a training corpus without *****parse tree***** annotations. | ||
| communities | 11 | |
| D19-1384 We demonstrate that complex linguistic behavior observed in natural language can be reproduced in this simple setting: i) the outcome of contact between ***** communities ***** is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. | ||
| P18-1228 Because word semantics can substantially change across ***** communities ***** and contexts, capturing domain-specific word semantics is an important challenge. | ||
| 2020.lrec-1.159 We introduce novel corpora annotated by two ***** communities *****, i.e., domain experts and crowd workers, and we also consider automatic article labels inferred by the newspapers' ideologies. | ||
| 2020.wildre-1.2 This exchange is not free from offensive, trolling or malicious contents targeting users or ***** communities *****. | ||
| 2021.acl-short.133 Among social media platforms, Reddit has emerged as the most promising one due to its anonymity and its focus on topic-based ***** communities ***** (subreddits) that can be indicative of someone's state of mind or interest regarding mental health disorders such as r/SuicideWatch, r/Anxiety, r/depression. | ||
| supervised relation | 11 | |
| 2021.naacl-main.2 We propose a multi-task, probabilistic approach to facilitate distantly ***** supervised relation ***** extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. | ||
| 2020.coling-main.566 In recent years, distantly-***** supervised relation ***** extraction has achieved a certain success by using deep neural networks. | ||
| E17-2087 While it is natural to use both positive and negative training examples in ***** supervised relation ***** extraction, the impact of positive examples on hypernym prediction was not studied so far. | ||
| 2020.findings-emnlp.113 We perform an extensive experimental study over multiple relation extraction benchmarks and demonstrate that RE-Flex outperforms competing un***** supervised relation ***** extraction methods based on pretrained language models by up to 27.8 F1 points compared to the next-best method. | ||
| 2021.eacl-main.128 The advent of neural-networks in NLP brought with it substantial improvements in *****supervised relation***** extraction. | ||
| original | 11 | |
| 2020.lrec-1.686 By using an updated implementation of OpenNMT, and incorporating the Newsela corpus alongside the ***** original ***** Wikipedia dataset (Hwang et al., 2016), as well as refining both datasets to select high quality training examples. | ||
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was ***** original *****ly written, the translation proficiency of the evaluators, and the provision of inter-sentential context. | ||
| 2021.ranlp-1.64 Our case studies have also demonstrated SACG's ability to generate fluent target-style sentences that preserved the ***** original ***** content. | ||
| 2021.naacl-main.162 We first take into consideration all the linguistic information embedded in the past layers and then take a further step to engage the future information which is ***** original *****ly inaccessible for predictions. | ||
| L12-1150 We also compare both labellings (the ***** original ***** and the new one) allowing us to detect anomalies in the ***** original ***** WND labels. | ||
| electronic health record | 11 | |
| 2020.lrec-1.547 Multiple efforts have been done to protect the integrity of patients while making ***** electronic health record *****s usable for research by removing personally identifiable information in patient records. | ||
| W19-1915 Recently natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in ***** electronic health record *****s (EHRs). | ||
| 2020.lrec-1.714 However, retrieving eligible patients for a trial from the ***** electronic health record ***** (EHR) database remains a challenging task for clinicians since it requires not only medical knowledge about eligibility criteria, but also an adequate understanding of structured query language (SQL). | ||
| W19-5003 This paper proposes a dataset and method for automatically generating paraphrases for clinical questions relating to patient-specific information in ***** electronic health record *****s (EHRs). | ||
| 2021.naacl-main.318 Given the clinical notes written in ***** electronic health record *****s (EHRs), it is challenging to predict the diagnostic codes which is formulated as a multi-label classification task. | ||
| language teaching | 11 | |
| 2020.nlptea-1.18 Regarding the linguistic feature richness, pop songs are probably suitable to be used as extracurricular materials in ***** language teaching *****. | ||
| W16-4916 Much research in education has been done on the study of different ***** language teaching ***** methods. | ||
| W18-4805 It is in this context that we describe an attempt to produce ***** language teaching ***** materials based on a generative approach. | ||
| L06-1290 In an age when demand for innovative and motivating ***** language teaching ***** methodologies is at a very high level, TREAT - the Trilingual REAding Tutor - combines the most advanced natural language processing (NLP) techniques with the latest second and third language acquisition (SLA/TLA) research in an intuitive and user-friendly environment that has been proven to help adult learners (native speakers of L1) acquire reading skills in an unknown L3 which is related to (cognate with) an L2 they know to some extent. | ||
| R19-1079 We also report on the feedback gathered from the users and an expert from ***** language teaching *****, and discuss the potential of the vocabulary trainer application from the user and language learner perspective. | ||
| hierarchy | 11 | |
| P18-1098 The International Classification of Diseases (ICD) provides a ***** hierarchy ***** of diagnostic codes for classifying diseases. | ||
| 2020.aacl-main.90 To better encode the multilingual and code-mixed questions, we introduce a ***** hierarchy ***** of shared layers. | ||
| 2021.naacl-main.452 We also present a top-down ***** hierarchy ***** expansion algorithm to add the extracted relations into existing hierarchies with reasonable interpretability. | ||
| N18-1002 The task of Fine-grained Entity Type Classification (FETC) consists of assigning types from a ***** hierarchy ***** to entity mentions in text. | ||
| D19-1042 While existing hierarchical text classification (HTC) methods attempt to capture label hierarchies for model training, they either make local decisions regarding each label or completely ignore the *****hierarchy***** information during inference. | ||
| factors | 11 | |
| C18-1281 Our experiments show how annotators diverge in language annotation tasks due to a range of ineliminable ***** factors *****. | ||
| 2010.amta-commercial.5 In this paper, we discuss some of our further use cases, and the varying requirements each use case has for quality, customization, cost, and other ***** factors *****. | ||
| 2008.iwslt-evaluation.11 In the English–Chinese translation Challenge Task, we focused on exploring various ***** factors ***** for the English–Chinese translation because the research on the translation of English–Chinese is scarce compared to the opposite direction. | ||
| U19-1011 With this study, we aim to understand ***** factors ***** which cause forgetting during sequential training. | ||
| 2020.lrec-1.187 These affective states are impacted by a combination of emotion inducers, current psychological state, and various conversational ***** factors *****. | ||
| detection of persuasion | 11 | |
| 2021.semeval-1.142 We explored various approaches in feature extraction and the ***** detection of persuasion ***** labels. | ||
| 2021.semeval-1.151 This paper describes our participation in the three subtasks featured by SemEval 2021 task 6 on the ***** detection of persuasion ***** techniques in texts and images. | ||
| 2021.semeval-1.144 In this paper, our study on the ***** detection of persuasion ***** techniques in texts and images in SemEval-2021 Task 6 is summarized. | ||
| 2021.semeval-1.143 The following system description presents our approach to the ***** detection of persuasion ***** techniques in texts and images. | ||
| 2021.semeval-1.139 We describe our approach for SemEval-2021 task 6 on ***** detection of persuasion ***** techniques in multimodal content (memes). | ||
| score | 11 | |
| 2021.acl-demo.41 To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) ***** score *****d highly under human evaluation. | ||
| 2021.wmt-1.89 Our submissions (Tencent AI Lab Machine Translation, TMT) in German/French/Spanish⇒English are ranked 1st respectively according to the official evaluation results in terms of BLEU ***** score *****s. | ||
| 2020.findings-emnlp.366 For detecting who-needs-what sentences, we compared our results against a set of 1,000 annotated tweets and achieved a 0.68 F1-***** score *****. | ||
| P19-2027 Experiments using a Switch Board Dialogue Act corpus show that compared to the baseline considering only a single utterance, our model achieves 10.8% higher F1-***** score ***** and 3.0% higher accuracy on DA prediction. | ||
| K19-2011 Our system was ranked fifth with the macro-averaged MRP F1 ***** score ***** of 0.7604, and outperformed the baseline unified transition-based MRP. | ||
| word class | 11 | |
| 2020.acl-main.337 This paper presents an investigation on the distribution of word vectors belonging to a certain ***** word class ***** in a pre-trained word vector space. | ||
| 2020.sltu-1.22 In fact, they are also by themselves one of the best ways to describe a sentiment, despite the fact that other ***** word class *****es such as nouns, verbs, adverbs or conjunctions can also be utilized for this purpose. | ||
| 2020.sltu-1.41 Next to the dictionary itself, one other resource arising from our work is a lexicographical model for Akan which represents the lexical resource itself, and the extended morphological and ***** word class ***** inventories that provide information to be aggregated. | ||
| L06-1369 Adjunct, and (b) derives a possible set of ***** word class *****es. | ||
| L14-1247 Experiments confirm that our DCW method achieves higher accuracy in detecting real-time dependent questions than existing ***** word class *****es and a simple supervised machine learning approach. | ||
| unsupervised text | 11 | |
| 2021.emnlp-main.730 In this paper, we explore Non-AutoRegressive (NAR) decoding for ***** unsupervised text ***** style transfer. | ||
| 2021.eacl-srw.23 Generating diverse texts is an important factor for ***** unsupervised text ***** generation. | ||
| 2021.emnlp-main.729 In this paper, we propose a collaborative learning framework for ***** unsupervised text ***** style transfer using a pair of bidirectional decoders, one decoding from left to right while the other decoding from right to left. | ||
| 2021.naacl-main.333 Our experiments on several benchmark datasets show that our method outperforms the existing competitive models on supervised and semi-supervised text classification, as well as ***** unsupervised text ***** representation learning. | ||
| 2020.coling-main.201 In this paper, we propose a novel neural approach to ***** unsupervised text ***** style transfer which we refer to as Cycle-consistent Adversarial autoEncoders (CAE) trained from non-parallel data. | ||
| editing | 11 | |
| 2012.amta-government.13 The RevP program saves time by removing the need for post-***** editing ***** of Chinese names, and improves consistency in the translation of these names. | ||
| L16-1004 While Edit Distance as such does not express cognitive effort or time spent ***** editing ***** machine translation suggestions, we found that it correlates strongly with the productivity tests we performed, for various language pairs and domains. | ||
| 2020.coling-main.524 In automatic post-***** editing ***** (APE) it makes sense to condition post-***** editing ***** (pe) decisions on both the source (src) and the machine translated text (mt) as input. | ||
| 2021.alta-1.13 In this paper, we present a novel semi-autoregressive document generation model capable of revising and ***** editing ***** the generated text. | ||
| L10-1039 We propose a language resource management system, called WordNet Management System (WNMS), as a distributed management system that allows the server to perform the cross language WordNet retrieval, including the fundamental web service applications for ***** editing *****, visualizing and language processing. | ||
| key point | 11 | |
| 2020.acl-main.371 We propose to represent such summaries as a small set of talking points, termed ***** key point *****s, each scored according to its salience. | ||
| 2021.argmining-1.19 One component employs contrastive learning via a siamese neural network for matching arguments to ***** key point *****s; the other is a graph-based extractive summarization model for generating ***** key point *****s. | ||
| 2021.acl-long.262 We adapt KPA to review data by introducing Collective Key Point Mining for better ***** key point ***** extraction; integrating sentiment analysis into KPA; identifying good ***** key point ***** candidates for review summaries; and leveraging the massive amount of available reviews and their metadata. | ||
| 2020.emnlp-main.3 Recent work has proposed to summarize arguments by mapping them to a small set of expert-generated ***** key point *****s, where the salience of each ***** key point ***** corresponds to the number of its matching arguments. | ||
| 2021.argmining-1.18 For ***** key point ***** matching the task is to decide if a short ***** key point ***** matches the content of an argument with the same topic and stance towards the topic. | ||
| studies | 11 | |
| D17-2003 Case ***** studies ***** tend to be used in legal, business, and health education contexts, but less in the teaching and learning of linguistics. | ||
| P19-2006 However, several ***** studies ***** strived to overcome divergences in the annotations between English AMRs and those of their target languages by refining the annotation specification. | ||
| 2021.emnlp-main.777 At the script level, most existing ***** studies ***** only consider a single event sequence corresponding to one common protagonist. | ||
| L14-1017 The resultant data has also been recently used in disfluency ***** studies ***** across domains. | ||
| 2020.wnut-1.39 This paper presents our teamwork on WNUT 2020 shared task-1: wet lab entity extract, that we conducted ***** studies ***** in several models, including a BiLSTM CRF model and a Bert case model which can be used to complete wet lab entity extraction. | ||
| measuring | 11 | |
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at ***** measuring ***** the properties related to idiomaticity, as institutionalization, non-compositionality and lexico-syntactic fixedness. | ||
| 2020.semeval-1.30 It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and orthogonal transformation;and ***** measuring ***** the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus. | ||
| 2020.repl4nlp-1.22 We introduce a novel metric, Polarity Sensitivity Scoring (PSS), which utilizes sentiment perturbations as a proxy for ***** measuring ***** compositionality. | ||
| W18-1601 For detection of stylistic variation, we use relative entropy, ***** measuring ***** the difference between probability distributions at different linguistic levels (here: lexis and grammar). | ||
| W19-2309 By ***** measuring ***** style transfer quality, meaning preservation, and the fluency of generated outputs, we demonstrate that our method is able both to produce high-quality output while maintaining the flexibility to suggest syntactically rich stylistic edits. | ||
| entity and relation | 11 | |
| S17-2172 We investigated appropriate embeddings to adapt a neural end-to-end ***** entity and relation ***** extraction system LSTM-ER to this task. | ||
| P19-1466 To this effect, our paper proposes a novel attention-based feature embedding that captures both ***** entity and relation ***** features in any given entity's neighborhood. | ||
| P18-1012 Despite this popularity and effectiveness of KG embeddings in various tasks (e.g., link prediction), geometric understanding of such embeddings (i.e., arrangement of ***** entity and relation ***** vectors in vector space) is unexplored – we fill this gap in the paper. | ||
| 2020.acl-main.527 Distant supervision based methods for *****entity and relation***** extraction have received increasing popularity due to the fact that these methods require light human annotation efforts. | ||
| P18-2013 State-of-the-art knowledge base completion (KBC) models predict a score for every known or unknown fact via a latent factorization over *****entity and relation***** embeddings. | ||
| abstract syntax | 11 | |
| P17-1105 The outputs are represented as ***** abstract syntax ***** trees (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. | ||
| W19-0403 The QuantML scheme consists of (1) an ***** abstract syntax ***** which defines `annotation structures' as triples and other set-theoretic constructs; (b) a compositional semantics of annotation structures; (3) an XML representation of annotation structures. | ||
| 2020.cl-2.6 This article gives an overview of the use of ***** abstract syntax ***** as interlingua through both established and emerging NLP applications involving GF. | ||
| D18-2002 TRANX uses a transition system based on the ***** abstract syntax ***** description language for the target MR, which gives it two major advantages: (1) it is highly accurate, using information from the syntax of the target MR to constrain the output space and model the information flow, and (2) it is highly generalizable, and can easily be applied to new types of MR by just writing a new ***** abstract syntax ***** description corresponding to the allowable structures in the MR. | ||
| W19-6131 We present a system for Natural Language Inference which uses a dynamic semantics converter from ***** abstract syntax ***** trees to Coq types. | ||
| endangered language | 11 | |
| 2021.sigtyp-1.12 For many low-resource and ***** endangered language *****s, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. | ||
| 2020.rail-1.1 The ǂKhomani San, Hugh Brody Collection features the voices and history of indigenous hunter gatherer descendants in three ***** endangered language *****s namely, N|uu, Kora and Khoekhoe as well as a regional dialect of Afrikaans. | ||
| 2020.emnlp-main.478 We create a benchmark dataset of transcriptions for scanned books in three critically ***** endangered language *****s and present a systematic analysis of how general-purpose OCR tools are not robust to the data-scarce setting of ***** endangered language *****s. | ||
| L12-1521 The RELISH project promotes language-oriented research by addressing a two-pronged problem: (1) the lack of harmonization between digital standards for lexical information in Europe and America, and (2) the lack of interoperability among existing lexicons of ***** endangered language *****s, in particular those created with the Shoebox/Toolbox lexicon building software. | ||
| 2021.americasnlp-1.14 This paper describes the development of the first Universal Dependencies (UD) treebank for St. Lawrence Island Yupik, an ***** endangered language ***** spoken in the Bering Strait region. | ||
| stress | 11 | |
| 2021.dash-1.15 We present the Everyday Living Artificial Intelligence (AI) Hub, a novel proof-of-concept framework for enhancing human health and wellbeing via a combination of tailored wear-able and Conversational Agent (CA) solutions for non-invasive monitoring of physiological signals, assessment of behaviors through unobtrusive wearable devices, and the provision of personalized interventions to reduce ***** stress ***** and anxiety. | ||
| L16-1018 We adapted an annotation tool for this taxonomy and have annotated portions of two different dialogue corpora, Switchboard and the Distress Analysis Interview Corpus. | ||
| L06-1261 This paper describes FreP, a new electronic tool that provides frequency counts of phonological units at the word-level and below from Portuguese written text: namely, major classes of segments, syllables and syllable types, phonological clitics, clitic types and size, prosodic words and their shape, word ***** stress ***** location, and syllable type by position within the word and/or status relative to word ***** stress *****. | ||
| 2021.naacl-main.230 Here, we present work exploring the use of a semantically related task, emotion detection, for equally competent but more explainable and human-like psychological ***** stress ***** detection as compared to a black-box model. | ||
| C18-1198 In this work, we propose an evaluation methodology consisting of automatically constructed “***** stress ***** tests” that allow us to examine whether systems have the ability to make real inferential decisions. | ||
| naturally occurring | 11 | |
| W19-7602 Most of the test sets used for the evaluation of MT systems reflect the frequency distribution of different phenomena found in ***** naturally occurring ***** data (”standard” or ”natural” test sets). | ||
| L10-1546 The object of this study was to record speech in a variety of situations that vary formality and model multiple ***** naturally occurring ***** interactions as well as a variety of channel conditions | ||
| W18-1404 Using ***** naturally occurring ***** instances of English push , and expansions of MN frames, we demonstrate that literal and metaphorical extensions exhibit patterns predicted and represented by the LCS model. | ||
| 2021.acl-long.186 DynaSent combines ***** naturally occurring ***** sentences with sentences created using the open-source Dynabench Platform, which facilities human-and-model-in-the-loop dataset creation. | ||
| 2020.eval4nlp-1.8 Current summarization evaluation datasets are single-domain and focused on a few domains for which ***** naturally occurring ***** summaries can be easily found, such as news and scientific articles. | ||
| max-pooling | 11 | |
| P18-1041 Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a *****max-pooling***** operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. | ||
| 2021.acl-short.122 Then, the clip-level features are aggregated into node features by using *****max-pool*****, and a graph is generated for each scale of clips. | ||
| 2020.sustainlp-1.21 We compare a classical CNN architecture for sequence classification involving several convolutional and *****max-pooling***** layers against a simple model based on weighted finite state automata (WFA). | ||
| W18-6208 The system is composed of a single pre-trained ELMo layer for encoding words, a Bidirectional Long-Short Memory Network BiLSTM for enriching word representations with context, a *****max-pooling***** operation for creating sentence representations from them, and a Dense Layer for projecting the sentence representations into label space. | ||
| S19-2157 In the effort to tackle the challenge of Hyperpartisan News Detection, i.e., the task of deciding whether a news article is biased towards one party, faction, cause, or person, we experimented with two systems: i) a standard supervised learning approach using superficial text and bag-of-words features from the article title and body, and ii) a deep learning system comprising a four-layer convolutional neural network and *****max-pooling***** layers after the embedding layer, feeding the consolidated features to a bi-directional recurrent neural network. | ||
| cross-modal retrieval | 11 | |
| 2021.acl-long.43 Despite the achievements of large-scale multimodal pre-training approaches, *****cross-modal retrieval*****, e.g., image-text retrieval, remains a challenging task. | ||
| 2020.lrec-1.743 We propose three challenges appropriate for this corpus that are related to processing units of signs in context: automatic alignment of text and video, semantic segmentation of sign language, and production of video-text embeddings for *****cross-modal retrieval*****. | ||
| 2020.findings-emnlp.176 We formulate the task as *****cross-modal retrieval***** and propose Conditional Visual-Semantic Embeddings to align images and fine-grained abnormal findings in a joint embedding space. | ||
| 2021.naacl-main.285 Recent pretrained vision-language models have achieved impressive performance on *****cross-modal retrieval***** tasks in English. | ||
| 2021.emnlp-main.772 By exploiting the cross-modal attention, cross-BERT methods have achieved state-of-the-art accuracy in *****cross-modal retrieval*****. | ||
| event schema induction | 11 | |
| W17-2710 We argue that *****event schema induction***** can benefit from greater structure in the process and in linguistic features that distinguish words' functions and themes. | ||
| 2021.emnlp-main.422 Previous work on *****event schema induction***** focuses either on atomic events or linear temporal event sequences, ignoring the interplay between events via arguments and argument relations. | ||
| P19-1276 Results show that the proposed unsupervised model gives better performance compared to the state-of-the-art method for *****event schema induction*****. | ||
| L16-1307 This article presents a corpus for development and testing of *****event schema induction***** systems in English. | ||
| E17-3026 *****Event Schema Induction***** is the task of learning a representation of events (e.g., bombing) and the roles involved in them (e.g, victim and perpetrator). | ||
| fine-grained sentiment | 11 | |
| 2016.gwc-1.54 For *****fine-grained sentiment***** analysis, we need to go beyond zero-one polarity and find a way to compare adjectives (synonyms) that share the same sense. | ||
| 2020.coling-main.158 A fundamental task of *****fine-grained sentiment***** analysis is aspect and opinion terms extraction. | ||
| 2021.rocling-1.23 Aspect Category Sentiment Analysis (ACSA) aims to identify *****fine-grained sentiment***** polarities of the aspect categories discussed in user reviews. | ||
| Q18-1002 We consider the task of *****fine-grained sentiment***** analysis from the perspective of multiple instance learning (MIL). | ||
| 2020.lrec-1.618 We here introduce NoReC_fine, a dataset for *****fine-grained sentiment***** analysis in Norwegian, annotated with respect to polar expressions, targets and holders of opinion. | ||
| case | 11 | |
| 2020.lrec-1.553 We present a new corpus comprising annotations of medical entities in *****case***** reports, originating from PubMed Central's open access library. | ||
| L04-1242 This system resolves zero, direct and indirect anaphors in Japanese texts by integrating two sorts of linguistic resources: a hand-annotated corpus with various relations and automatically constructed *****case***** frames. | ||
| 2021.conll-1.39 While the complex nature of event identity is previously studied (Hovy et al., 2013), the *****case***** of events across documents is unclear. | ||
| 2020.trac-1.21 The challenge becomes greater for languages rich in popular sayings, colloquial expressions and idioms which may contain vulgar, profane or rude words, but not always have the intention of offending, as is the *****case***** of Mexican Spanish. | ||
| W17-1302 In this paper, we present a new and fast state-of-the-art Arabic diacritizer that guesses the diacritics of words and then their *****case***** endings. | ||
| Detection of Propaganda | 11 | |
| 2020.semeval-1.196 This paper presents our systems for SemEval 2020 Shared Task 11: *****Detection of Propaganda***** Techniques in News Articles. | ||
| 2020.semeval-1.195 This paper describes the NTUAAILS submission for SemEval 2020 Task 11 *****Detection of Propaganda***** Techniques in News Articles. | ||
| 2020.semeval-1.234 This paper presents a solution for the Span Identification (SI) task in the *****Detection of Propaganda***** Techniques in News Articles competition at SemEval-2020. | ||
| 2020.semeval-1.191 We describe our system for SemEval-2020 Task 11 on *****Detection of Propaganda***** Techniques in News Articles. | ||
| 2020.semeval-1.231 This paper describes our submissions to SemEval 2020 Task 11: *****Detection of Propaganda***** Techniques in News Articles for each of the two subtasks of Span Identification and Technique Classification. | ||
| text-based | 11 | |
| 2021.iwslt-1.28 Traditional translation systems trained on written documents perform well for *****text-based***** translation but not as well for speech-based applications. | ||
| 2020.lrec-1.296 Prior work has determined domain similarity using *****text-based***** features of a corpus. | ||
| L10-1099 We present our ongoing work on language technology-based e-science in the humanities, social sciences and education, with a focus on *****text-based***** research in the historical sciences. | ||
| P17-1035 Cognitive NLP systems, i.e., NLP systems that make use of behavioral data, augment traditional *****text-based***** features with cognitive features extracted from eye-movement patterns, EEG signals, brain imaging, etc. | ||
| 2020.textgraphs-1.9 We propose an open-world knowledge graph completion model that can be combined with common closed-world approaches (such as ComplEx) and enhance them to exploit *****text-based***** representations for entities unseen in training. | ||
| Neural language | 11 | |
| 2020.emnlp-main.735 *****Neural language***** models are often trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. | ||
| D19-1421 *****Neural language***** models are usually trained using Maximum-Likelihood Estimation (MLE). | ||
| 2020.emnlp-main.331 *****Neural language***** models learn, to varying degrees of accuracy, the grammatical properties of natural languages. | ||
| 2021.gwc-1.26 *****Neural language***** models, including transformer-based models, that are pre-trained on very large corpora became a common way to represent text in various tasks, including recognition of textual semantic relations, e.g. | ||
| 2020.acl-main.66 *****Neural language***** models are usually trained to match the distributional properties of large-scale corpora by minimizing the log loss. | ||
| rare | 11 | |
| 2020.wmt-1.65 Despite advances in neural machine translation (NMT) quality, *****rare***** words continue to be problematic. | ||
| 2020.emnlp-main.100 This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for *****rare***** words. | ||
| L10-1559 During the last years, the campaign of mass digitization made available catalogues and valuable *****rare***** manuscripts and old printed books via the Internet. | ||
| P17-1188 Previous work has modeled the compositionality of words by creating character-level models of meaning, reducing problems of sparsity for *****rare***** words. | ||
| 2021.naacl-demos.9 Current document embeddings require large training corpora but fail to learn high-quality representations when confronted with a small number of domain-specific documents and *****rare***** terms. | ||
| function | 11 | |
| L14-1332 Abstract Meaning Representations (AMRs) are rooted, directional and labeled graphs that abstract away from morpho-syntactic idiosyncrasies such as word category (verbs and nouns), word order, and *****function***** words (determiners, some prepositions). | ||
| W17-4909 The differences in the frequencies of some parts of speech (POS), particularly *****function***** words, and lexical diversity in male and female speech have been pointed out in a number of papers. | ||
| L16-1583 When dealing with problems such as user ranking or recommendation systems, all these measures suffer from various problems, including the inability to deal with elements of the same rank, inconsistent and ambiguous lower bound scores, and an inappropriate cost *****function*****. | ||
| 2020.socialnlp-1.6 To understand the particular difficulties of this task, we design a transparent emotion style transfer pipeline based on three steps: (1) select the words that are promising to be substituted to change the emotion (with a brute-force approach and selection based on the attention mechanism of an emotion classifier), (2) find sets of words as candidates for substituting the words (based on lexical and distributional semantics), and (3) select the most promising combination of substitutions with an objective *****function***** which consists of components for content (based on BERT sentence embeddings), emotion (based on an emotion classifier), and fluency (based on a neural language model). | ||
| I17-3011 Classifiers are *****function***** words that are used to express quantities in Chinese and are especially difficult for language learners. | ||
| Automatic Speech Recognition (ASR | 11 | |
| 2013.iwslt-evaluation.18 This paper describes the *****Automatic Speech Recognition (ASR*****) and Machine Translation (MT) systems developed by IOIT for the evaluation campaign of IWSLT2013. | ||
| 2020.sltu-1.37 It is known that *****Automatic Speech Recognition (ASR*****) is very useful for human-computer interaction in all the human languages. | ||
| 2020.lrec-1.798 In this paper, we present CEASR, a Corpus for Evaluating the quality of *****Automatic Speech Recognition (ASR*****). | ||
| 2008.iwslt-papers.3 This paper is about Translation Dictation with ASR, that is, the use of *****Automatic Speech Recognition (ASR*****) by human translators, in order to dictate translations. | ||
| 2020.lrec-1.513 *****Automatic Speech Recognition (ASR*****) is one of the most important technologies to support spoken communication in modern life. | ||
| natural language inference (NLI) | 11 | |
| N18-1132 We present a novel deep learning architecture to address the *****natural language inference (NLI)***** task. | ||
| 2021.naacl-industry.29 In developing an online question-answering system for the medical domains, *****natural language inference (NLI)***** models play a central role in question matching and intention detection. | ||
| 2021.nodalida-main.28 Pre-trained neural language models give high performance on *****natural language inference (NLI)***** tasks. | ||
| W18-5441 We present a large scale collection of diverse *****natural language inference (NLI)***** datasets that help provide insight into how well a sentence representation encoded by a neural network captures distinct types of reasoning. | ||
| 2021.conll-1.19 Negation is one of the most fundamental concepts in human cognition and language, and several *****natural language inference (NLI)***** probes have been designed to investigate pretrained language models' ability to detect and reason with negation. | ||
| fake news | 11 | |
| 2020.lrec-1.309 The task of *****fake news***** detection is to distinguish legitimate news articles that describe real facts from those which convey deceiving and fictitious information. | ||
| N19-1347 On the one hand, nowadays, *****fake news***** articles are easily propagated through various online media platforms and have become a grand threat to the trustworthiness of information. | ||
| D17-1317 We present an analytic study on the language of news media in the context of political fact-checking and *****fake news***** detection. | ||
| 2020.rdsm-1.5 In this paper, we trained and compared different models for *****fake news***** detection in Russian. | ||
| 2020.restup-1.1 We aim at identifying possible *****fake news***** spreaders as a first step towards preventing fake news from being propagated among online users (fake news aim to polarize the public opinion and may contain hate speech). | ||
| Contextualized word | 11 | |
| 2020.blackboxnlp-1.13 *****Contextualized word***** representations encode rich information about syntax and semantics, alongside specificities of each context of use. | ||
| 2020.blackboxnlp-1.9 *****Contextualized word***** representations, such as ELMo and BERT, were shown to perform well on various semantic and syntactic tasks. | ||
| D19-1533 *****Contextualized word***** representations are able to give different representations for the same word in different contexts, and they have been shown to be effective in downstream natural language processing tasks, such as question answering, named entity recognition, and sentiment analysis. | ||
| D19-1627 *****Contextualized word***** embeddings have boosted many NLP tasks compared with traditional static word embeddings. | ||
| 2020.emnlp-main.285 *****Contextualized word***** embeddings have been employed effectively across several tasks in Natural Language Processing, as they have proved to carry useful semantic information. | ||
| Polish | 11 | |
| R17-1048 This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in *****Polish***** texts. | ||
| 2020.lrec-1.719 In the paper we describe two resources of *****Polish***** data focused on literal and metaphorical meanings of adjective-noun phrases. | ||
| 2020.lrec-1.207 However, the lack of pre-trained models and datasets annotated at the sentence level has been a problem for low-resource languages such as *****Polish*****, which led to less interest in applying these methods to language-specific tasks. | ||
| W18-0534 The dataset includes 1,868 student essays written by learners of European Portuguese, native speakers of the following L1s: Chinese, English, Spanish, German, Russian, French, Japanese, Italian, Dutch, Tetum, Arabic, *****Polish*****, Korean, Romanian, and Swedish. | ||
| 2021.gwc-1.24 In the paper, we deal with the problem of unsupervised text document clustering for the *****Polish***** language. | ||
| cross- | 11 | |
| 2021.naacl-main.210 Current sequence-to-sequence models are trained to minimize *****cross-*****entropy and use softmax to compute the locally normalized probabilities over target sequences. | ||
| 2021.emnlp-main.132 We study the power of *****cross-*****attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch. | ||
| R19-1027 A main objective of this work is to offer a uniform representation of different morphological data sets in order to be able to compare and interlink multilingual resources and to *****cross-*****check and interlink or merge the content of morphological resources of one and the same language. | ||
| 2020.emnlp-main.179 However, mPLM-based methods usually involve two problems: (1) simply fine-tuning may not adapt general-purpose multilingual representations to be task-aware on low-resource languages; (2) ignore how *****cross-*****lingual adaptation happens for downstream tasks. | ||
| 2021.acl-short.35 Early fusion models with *****cross-*****attention have shown better-than-human performance on some question answer benchmarks, while it is a poor fit for retrieval since it prevents pre-computation of the answer representations. | ||
| Quality | 11 | |
| W19-5406 We present the contribution of the Unbabel team to the WMT 2019 Shared Task on *****Quality***** Estimation. | ||
| 2021.wmt-1.100 *****Quality***** Estimation, as a crucial step of quality control for machine translation, has been explored for years. | ||
| W18-6460 In this paper, a novel approach to *****Quality***** Estimation is introduced, which extends the method in (Duma and Menzel, 2017) by also considering pseudo-reference translations as data sources to the tree and sequence kernels used before. | ||
| 2021.wmt-1.95 This paper presents our submissions to the WMT2021 Shared Task on *****Quality***** Estimation, Task 1 Sentence-Level Direct Assessment. | ||
| W18-6451 We report the results of the WMT18 shared task on *****Quality***** Estimation, i.e. | ||
| chat | 11 | |
| W17-5546 Neural conversational models require substantial amounts of dialogue data to estimate their parameters and are therefore usually learned on large corpora such as *****chat***** forums or movie subtitles. | ||
| 2020.coling-main.175 This paper presents a prototype of a *****chat***** room that detects offensive expressions in a video live streaming chat in real time. | ||
| L16-1129 In this paper we report our effort to construct the first ever Indonesian corpora for *****chat***** summarization. | ||
| E17-3022 We build a *****chat***** bot with iterative content exploration that leads a user through a personalized knowledge acquisition session. | ||
| L12-1558 Although in recent years numerous forms of Internet communication such as e-mail, blogs, *****chat***** rooms and social network environments have emerged, balanced corpora of Internet speech with trustworthy meta-information (e.g. | ||
| functional | 11 | |
| 2020.sigdial-1.16 A physical blocks world, despite its relative simplicity, requires (in fully interactive form) a rich set of *****functional***** capabilities, ranging from vision to natural language understanding. | ||
| 2020.pam-1.11 In the frame hypothesis (CITATION), human concepts are equated with frames, which extend feature lists by a *****functional***** structure consisting of attributes and values. | ||
| 2020.ldl-1.4 The former provide metadata such as number of speakers, location (in prose descriptions and/or GPS coordinates), language code, literacy, etc., while the latter contain information about a set of structural and *****functional***** attributes of languages. | ||
| W16-4901 Learning functional expressions is one of the difficulties for language learners, since *****functional***** expressions tend to have multiple meanings and complicated usages in various situations. | ||
| W19-1104 We outline a hyperintensional situation semantics in which hyperintensionality is modelled as a 'side effect', as this term has been understood in natural language semantics and in *****functional***** programming. | ||
| Indian | 11 | |
| 2016.gwc-1.46 Samsa or compounds are a regular feature of *****Indian***** Languages. | ||
| 2005.mtsummit-posters.21 A survey of the machine translation systems that have been developed in India for translation from English to Indian languages and among *****Indian***** languages reveals that the MT software is used in field testing or is available as a web translation service. | ||
| 2021.wat-1.30 This paper describes the ANVITA-1.0 MT system, architected for submission to the WAT2021 MultiIndicMT shared task by the mcairt team, where the team participated in 20 translation directions: English-Indic and Indic-English; the Indic set comprised of 10 *****Indian***** languages. | ||
| 2020.lrec-1.462 We present sentence-aligned parallel corpora across 10 *****Indian***** Languages - Hindi, Telugu, Tamil, Malayalam, Gujarati, Urdu, Bengali, Oriya, Marathi, Punjabi, and English - many of which are categorized as low resource. | ||
| 2020.trac-1.25 In this paper, we discuss the development of a multilingual annotated corpus of misogyny and aggression in *****Indian***** English, Hindi, and Indian Bangla as part of a project on studying and automatically identifying misogyny and communalism on social media (the ComMA Project). | ||
| weak | 11 | |
| 2021.naacl-main.242 In this paper, we explore text classification with extremely *****weak***** supervision, i.e., only relying on the surface text of class names. | ||
| 2021.emnlp-main.561 Compositional reasoning tasks such as multi-hop question answering require models to learn how to make latent decisions using only *****weak***** supervision from the final answer. | ||
| 2021.emnlp-main.694 End-to-end question answering using a differentiable knowledge graph is a promising technique that requires only *****weak***** supervision, produces interpretable results, and is fully differentiable. | ||
| 2020.lrec-1.236 We present TableBank, a new image-based table detection and recognition dataset built with novel *****weak***** supervision from Word and Latex documents on the internet. | ||
| 2020.lrec-1.398 The paper presents a dataset of 11,000 Polish-English translational equivalents in the form of pairs of plWordNet and Princeton WordNet lexical units linked by three types of equivalence links: strong equivalence, regular equivalence, and *****weak***** equivalence. | ||
| used | 11 | |
| L12-1300 While the standard methods for deploying web services using a dedicated (virtual) server may suffice in many circumstances, CLARIN centers are also faced with a growing number of services that are not frequently *****used***** and for which significant compute power needs to be reserved. | ||
| 2020.lrec-1.274 The framework is adopted from the existing SpatialNet representation in the general domain with the aim to generate more accurate representations of spatial language *****used***** by radiologists. | ||
| 2021.acl-long.307 By introducing three novel components: Pointer, Disambiguator, and Copier, our method PDC achieves the following merits inherently compared with previous efforts: (1) Pointer leverages the semantic information from bilingual dictionaries, for the first time, to better locate source words whose translation in dictionaries can potentially be *****used*****; (2) Disambiguator synthesizes contextual information from the source view and the target view, both of which contribute to distinguishing the proper translation of a specific source word from multiple candidates in dictionaries; (3) Copier systematically connects Pointer and Disambiguator based on a hierarchical copy mechanism seamlessly integrated with Transformer, thereby building an end-to-end architecture that could avoid error propagation problems in alternative pipeline methods. | ||
| 2003.mtsummit-papers.5 pen-based gestures, the following issues arise concerning the nature of the supported communication: a) to what extent does multilingual communication differ from 'ordinary' monolingual communication with respect to the dialogue structure and the communicative strategies *****used***** by participants; b) the patterns of integration between speech and gestures. | ||
| 2021.acl-long.283 User engagement is one of the most important metrics for evaluating open-domain dialog systems, and could also be *****used***** as real-time feedback to benefit dialog policy learning. | ||
| Internet | 11 | |
| W19-2105 *****Internet***** censorship imposes restrictions on what information can be publicized or viewed on the Internet. | ||
| L14-1203 Hosting Providers play an essential role in the development of *****Internet***** services such as e-Research Infrastructures. | ||
| K19-1096 We investigate the political roles of *****Internet***** trolls in social media. | ||
| W18-4201 Censorship of *****Internet***** content in China is understood to operate through a system of intermediary liability whereby service providers are liable for the content on their platforms. | ||
| 2021.emnlp-main.151 *****Internet***** search affects people's cognition of the world, so mitigating biases in search results and learning fair models is imperative for social good. | ||
| Question Answering (QA) | 11 | |
| W16-4406 In an era where highly accurate *****Question Answering (QA)***** systems are being built using complex Natural Language Processing (NLP) and Information Retrieval (IR) algorithms, presenting the acquired answer to the user akin to a human answer is also crucial. | ||
| 2020.emnlp-main.11 The aim of all *****Question Answering (QA)***** systems is to generalize to unseen questions. | ||
| W18-5435 Datasets that boosted state-of-the-art solutions for *****Question Answering (QA)***** systems prove that it is possible to ask questions in a natural language manner. | ||
| L10-1254 *****Question Answering (QA)***** technology aims at providing relevant answers to natural language questions. | ||
| L06-1212 Scenario Question Answering is a relatively new direction in *****Question Answering (QA)***** research that presents a number of challenges for evaluation. | ||
| Latin | 11 | |
| W16-4012 Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of *****Latin***** texts is still relatively sparse compared to English. | ||
| 2020.lt4hala-1.21 Despite the great importance of the *****Latin***** language in the past, there are relatively few resources available today to develop modern NLP tools for this language. | ||
| L06-1229 This paper presents research on Greeklish, that is, a transliteration of Greek using the *****Latin***** alphabet, which is used frequently in Greek e-mail communication. | ||
| L12-1037 Although lexicography of Latin has a long tradition dating back to ancient grammarians, and almost all *****Latin***** grammars devote to wordformation at least one part of the section(s) concerning morphology, none of the lexical resources and NLP tools of Latin available today feature a wordformation-based organization of the Latin lexicon. | ||
| W16-4902 In learning Asian languages, learners encounter the problem of character types that are different from those in their first language, for instance, between Chinese characters and the *****Latin***** alphabet. | ||
| wikiHow | 10 | |
| 2021.inlg-1.19 We pilot our task on the first multilingual script learning dataset supporting 18 languages collected from ***** wikiHow *****, a website containing half a million how-to articles. | ||
| 2021.emnlp-main.165 With a new dataset harvested from ***** wikiHow ***** consisting of 772,277 images representing human actions, we show that our task is challenging for state-of-the-art multimodal models. | ||
| 2020.emnlp-main.675 ***** wikiHow ***** is a resource of how-to guides that describe the steps necessary to accomplish a goal. | ||
| 2020.lrec-1.702 Instructional texts, such as articles in ***** wikiHow *****, describe the actions necessary to accomplish a certain goal. | ||
| 2020.emnlp-main.374 We introduce a dataset targeting these two relations based on ***** wikiHow *****, a website of instructional how-to articles | ||
| modulation | 10 | |
| P19-2022 Concretely, we incorporate informativeness in a previously proposed model of nonce learning, using it for context selection and learning rate ***** modulation *****. | ||
| 2020.nlpbt-1.1 Our motivation is to propose two architectures based on Transformers and ***** modulation ***** that combine the linguistic and acoustic inputs from a wide range of datasets to challenge, and sometimes surpass, the state-of-the-art in the field. | ||
| 2020.coling-main.1 We present an overview of different techniques used to perform the ***** modulation ***** of these modules. | ||
| 2021.acl-long.230 Having obtained the clustering assignment for each sample, we develop the ensemble LM (EnsLM) with the technique of weight ***** modulation *****. | ||
| 2020.coling-main.167 In this paper, we propose a novel deep rectification-***** modulation ***** network (RMN), transforming this task into a multi-step reasoning process by repeating rectification and ***** modulation ***** | ||
| repetition | 10 | |
| P18-2027 Evaluations on the LCSTS and the English Gigaword both demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of generating summary of higher quality and reducing ***** repetition *****. | ||
| W19-4613 This shared lexicon is obscured by a lot of cliticization, gemination, and character ***** repetition *****. | ||
| N19-1073 The following serial recall effects are generally investigated in studies with humans: word length and frequency, primacy and recency, semantic confusion, ***** repetition *****, and transposition effects. | ||
| W17-5510 We propose computationally inexpensive measures of verbal alignment based on expression ***** repetition ***** in dyadic textual dialogues. | ||
| L16-1016 The datasets include audios and several manual annotations, i.e., miscommunication, anger, satisfaction, ***** repetition *****, gender and task success | ||
| treebanking | 10 | |
| L06-1329 This pilot, consisting of morphological and syntactic annotation of approximately 26,000 words of Levantine Arabic conversational telephone speech, was developed under severe time constraints; hence the LDC team drew on their experience in ***** treebanking ***** Modern Standard Arabic (MSA) text. | ||
| W19-6102 Standard approaches to ***** treebanking ***** traditionally employ a waterfall model (Sommerville, 2010), where annotation guidelines guide the annotation process and insights from the annotation process in turn lead to subsequent changes in the annotation guidelines. | ||
| L12-1454 We present the format of syntacto-semantic annotations found in this resource and present initial parsing results for these data, as well as some reflections following a first round of ***** treebanking *****. | ||
| L04-1171 We focus on the ***** treebanking ***** task as a trigger for basic language resources compilation. | ||
| W18-4924 The near absence of question constructions is due to the dominance of the news domain in ***** treebanking ***** efforts | ||
| Numerous | 10 | |
| 2020.aacl-main.92 ***** Numerous ***** methods have been proposed and their performance compared in the RumourEval shared tasks in 2017 and 2019. | ||
| 2021.sigtyp-1.3 ***** Numerous ***** people are involved in updating UD's annotation guidelines and treebanks in various languages. | ||
| L16-1017 ***** Numerous ***** dialogue corpora are available for research purposes and many of them are annotated with dialogue act information that captures the intentions encoded in user utterances. | ||
| 2020.coling-main.530 ***** Numerous ***** successful attempts use large monolingual corpora to augment low-resource pairs. | ||
| 2004.amta-papers.14 ***** Numerous ***** specialist dictionaries are being and have been created | ||
| 100K | 10 | |
| 2020.coling-main.579 These techniques enable us to create an initial data set covering ***** 100K ***** or more relatively clean sentences in each of 500+ languages, paving the way towards a 1,000-language web text corpus. | ||
| W17-2005 We present BreakingNews, a novel dataset with approximately ***** 100K ***** news articles including images, text and captions, and enriched with heterogeneous meta-data (e.g. GPS coordinates and popularity metrics). | ||
| P19-1250 To this end, we create a new dataset including ***** 100K *****+ Japanese comments with constructiveness scores (C-scores). | ||
| L16-1436 Finally, we publicly release around ***** 100K ***** parallel discourse data with manual speaker and dialogue boundary annotation. | ||
| D18-1393 We analyze 13 years (***** 100K ***** articles) of the Russian newspaper Izvestia and identify a strategy of distraction: articles mention the U.S. more frequently in the month directly following an economic downturn in Russia | ||
| multimodal dataset | 10 | |
| 2020.lrec-1.755 We present Fakeddit, a novel ***** multimodal dataset ***** consisting of over 1 million samples from multiple categories of fake news. | ||
| 2020.lrec-1.292 Eye4Ref is a rich ***** multimodal dataset ***** of eye-movement recordings collected from referentially complex situated settings where the linguistic utterances and their visual referential world were available to the listener. | ||
| 2020.lrec-1.93 In this paper, we introduce a ***** multimodal dataset ***** in which subjects are instructing each other how to assemble IKEA furniture. | ||
| L14-1673 This paper presents the construction of a ***** multimodal dataset ***** for deception detection, including physiological, thermal, and visual responses of human subjects under three deceptive scenarios. | ||
| D19-1211 This paper presents a diverse ***** multimodal dataset *****, called UR-FUNNY, to open the door to understanding multimodal language used in expressing humor | ||
| cardinality | 10 | |
| W19-8603 Basically, these models enable the integration of different parameters into the decision process for using a specific referring expression like the ***** cardinality ***** of the object set, the configuration and complexity of the visual field, and the discriminatory power of available attributes that need to be combined with visual salience and personal preference. | ||
| 1999.mtsummit-1.59 The definition of an infinite set of templates enables the automatic creation of LTRs for multi-word, non-compositional word equivalences of any ***** cardinality *****. | ||
| 2020.globalex-1.18 The proposed strategies are based on the analysis of Apertium RDF graph, taking advantage of characteristics such as translation using multiple paths, synonyms and similarities between lexical entries from different lexicons and ***** cardinality ***** of possible translations through the graph. | ||
| 2021.emnlp-main.193 Evaluation of NLP debiasing methods has largely been limited to binary attributes in isolation, e.g., debiasing with respect to binary gender or race, however many corpora involve multiple such attributes, possibly with higher ***** cardinality *****. | ||
| W17-3514 Our grammar engine adds to previous work in this field with new rules for ***** cardinality ***** constraints, prepositions in roles, the passive, and phonological conditioning | ||
| Conversely | 10 | |
| 2020.winlp-1.34 ***** Conversely *****, when sadness is expressed with authority-vice, the tweet is more likely to be retweeted. | ||
| 2020.isa-1.4 ***** Conversely *****, people also have no problems imagining a concept of a described space. | ||
| L10-1117 ***** Conversely *****, we show that increasing the size of the input corpus and modifying the extraction procedure to better differentiate prepositional arguments from prepositional modifiers improves performance. | ||
| P17-2088 ***** Conversely *****, any set of complex embeddings can be converted to a set of equivalent holographic embeddings. | ||
| 1997.iwpt-1.7 ***** Conversely *****, the use of TAG reveals the need for additional types of underspecification that have not been considered so far in the D-theoretic framework | ||
| extractive QA | 10 | |
| D19-5817 We show that F1 may not be suitable for all ***** extractive QA ***** tasks depending on the answer types. | ||
| 2021.emnlp-main.622 Specifically, we propose TransferQA, a transferable generative QA model that seamlessly combines ***** extractive QA ***** and multi-choice QA via a text-to-text transformer framework, and tracks both categorical slots and non-categorical slots in DST. | ||
| W17-2309 We focus on factoid and list question, using an ***** extractive QA ***** model, that is, we restrict our system to output substrings of the provided text snippets. | ||
| D17-1111 We propose instead to cast ***** extractive QA ***** as an iterative search problem: select the answer's sentence, start word, and end word. | ||
| 2020.acl-main.653 We present MLQA, a multi-way aligned ***** extractive QA ***** evaluation benchmark intended to spur research in this area | ||
| reproducing | 10 | |
| W19-3011 It is argued that these improvements in scientific integrity are poised to naturally reduce persistent healthcare inequities in neglected subpopulations, such as verbally fluent girls and women with ASD, but that concerted attention to this issue is necessary to avoid ***** reproducing ***** biases built into training data. | ||
| L14-1231 Aiming to inform a module of a system designed to support scientific written production of Spanish native speakers learning Portuguese, we developed an approach to automatically generate a lexicon of wrong words, ***** reproducing ***** language transfer errors made by such foreign learners. | ||
| P18-4013 It also includes the implementations of most state-of-the-art neural sequence labeling models such as LSTM-CRF, facilitating ***** reproducing ***** and refinement on those methods. | ||
| D19-1117 To obtain a powerful attention helping with ***** reproducing ***** the most salient information and avoiding repetitions, we augment the vanilla attention model from both local and global aspects. | ||
| W19-4512 After ***** reproducing ***** the state-of-the-art Evidence Graph model from Afantenos et al | ||
| cf | 10 | |
| 2020.framenet-1.2 This paper reports on an effort to search for corresponding constructions in English and Japanese in a TED Talk parallel corpus, using frames-and-constructions analysis (Ohara, 2019; Ohara and Okubo, 2020; ***** cf *****. | ||
| 2021.nodalida-main.18 Given the”gold” nature of the resource, it is possible to use it for empirical studies as well as to develop linguistically-aware algorithms for morpheme segmentation and labeling (***** cf ***** statistical subword approach). | ||
| L06-1283 (***** cf *****. | ||
| 2014.lilt-11.7 In French, evaluative prefixes can be classified along two dimensions (***** cf *****. | ||
| W16-4111 This paper investigates the design of a key-stroke and subject dependent identification system of cognitive effort to track complexity in translation with keystroke logging (***** cf ***** | ||
| mixtures | 10 | |
| 2021.metanlp-1.7 This proved that these two methods do have the ability to generate well-trained parameters for adapting to speech ***** mixtures ***** of new speakers and accents. | ||
| 2020.acl-main.503 We combine this method with a SQuAD-trained QA model and evaluate on ***** mixtures ***** of SQuAD and five other QA datasets. | ||
| D19-1273 We use synthetic data consisting of article ***** mixtures ***** for scalable training and evaluate our model on a new human-curated dataset of scenarios about real-world news topics. | ||
| 2021.nlp4if-1.8 We investigate if the performance of supervised models for cross-corpora abuse detection can be improved by incorporating additional information from topic models, as the latter can infer the latent topic ***** mixtures ***** from unseen samples. | ||
| D17-1016 We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating ***** mixtures ***** of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology | ||
| constraining | 10 | |
| 2020.findings-emnlp.166 We find that by ***** constraining ***** the output to suppress illegal transitions we can train a tagger with a cross-entropy loss twice as fast as a CRF with differences in F1 that are statistically insignificant, effectively eliminating the need for a CRF. | ||
| 2006.amta-papers.2 We present a method of ***** constraining ***** the search space of the Joint Probability Model based on statistically and linguistically motivated word alignments. | ||
| 2011.iwslt-evaluation.5 This idea here being that ***** constraining ***** the decoding process in this manner would greatly reduce the search space of the decoder, and cut out many possibilities for error while at the same time allowing for a correct output to be generated. | ||
| N18-1003 This is achieved by using entity and template seeds jointly (as opposed to just one as in previous work), by expanding entities and templates in parallel and in a mutually ***** constraining ***** fashion in each iteration and by introducing higherquality similarity measures for templates. | ||
| D19-1181 Despite detection of suicidal ideation on social media has made great progress in recent years, people's implicitly and anti-real contrarily expressed posts still remain as an obstacle, ***** constraining ***** the detectors to acquire higher satisfactory performance | ||
| collected | 10 | |
| 2020.findings-emnlp.375 Such conclusions about system and human performance are, however, based on estimates aggregated from scores ***** collected ***** over large test sets of translations and unfortunately leave some remaining questions unanswered. | ||
| 2021.acl-short.139 We perform an intrinsic evaluation by manually evaluating a subset of the sentence pairs and an extrinsic evaluation by finetuning mBART (Liu et al., 2020) on the ***** collected ***** data. | ||
| 2021.bionlp-1.20 Further analysis on a ***** collected ***** probing dataset shows that our model has better ability to model medical knowledge. | ||
| 2020.signlang-1.9 We conclude with a report on the current state of a collection following this protocol, and a few observations on the ***** collected ***** contents. | ||
| 2020.lrec-1.746 The authors formulate the basic principles of annotation of sign words, based on the ***** collected ***** data, and reveal the content of the ***** collected ***** database | ||
| 100k | 10 | |
| 2021.acl-long.439 In total, DVD is built from 11k CATER synthetic videos and contains 10 instances of 10-round dialogues for each video, resulting in more than ***** 100k ***** dialogues and 1M question-answer pairs. | ||
| L14-1250 The tagset has been used to tag ***** 100k ***** words of the CLE Urdu Digest Corpus, giving a tagging accuracy of 96.8%. | ||
| 2021.conll-1.21 These simulations are repeated with increasing amounts of exposure, from ***** 100k ***** to 2 million words, to measure the impact of exposure on the convergence of grammars. | ||
| D19-6129 We report experiments on two low resource languages: Swahili and Tagalog, trained on less than ***** 100k ***** parallel sentences each. | ||
| 2020.findings-emnlp.147 We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners – even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on ***** 100k ***** parallel sentences | ||
| obfuscation | 10 | |
| 2020.findings-emnlp.56 MBA expressions have been widely applied in software ***** obfuscation *****, transforming programs from a simple form to a complex form. | ||
| P19-1104 We analyze, quantify, and illustrate the rationale of this approach, define paraphrasing operators, derive ***** obfuscation ***** thresholds, and develop an effective ***** obfuscation ***** framework. | ||
| 2020.starsem-1.19 In this paper, we propose the first deep learning architecture for constructing adversarial examples against similarity-based learners, and explore its application to author ***** obfuscation *****. | ||
| W19-5906 However, this joint modeling adds to model ***** obfuscation *****. | ||
| 2020.iwpt-1.7 Our primary tool is ***** obfuscation *****, relying on the properties of natural language | ||
| stochastic gradient | 10 | |
| 2021.naacl-main.413 Our method preserves differentiability, allowing scalable inference via ***** stochastic gradient ***** descent. | ||
| Q13-1017 Unlike algorithms such as Perceptron and ***** stochastic gradient ***** descent, our method keeps track of dual variables and updates the weight vector more aggressively. | ||
| P17-1185 Skip-Gram Negative Sampling (SGNS) word embedding model, well known by its implementation in “word2vec” software, is usually optimized by ***** stochastic gradient ***** descent. | ||
| P18-1169 From a machine learning perspective, the key challenge lies in a proper reweighting of the estimator so as to avoid known degeneracies in counterfactual learning, while still being applicable to ***** stochastic gradient ***** optimization. | ||
| D17-1045 We make distributed ***** stochastic gradient ***** descent faster by exchanging sparse updates instead of dense updates | ||
| MUC | 10 | |
| L08-1126 In ***** MUC *****, it was 7 categories - people, organization, location, time, date, money and percentage expressions. | ||
| D19-5727 The best performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with B3, BLANC, CEAFE, CEAFM, LEA, and ***** MUC ***** metrics, respectively. | ||
| L16-1168 We propose a scheme for annotating direct speech in literary texts, based on the Text Encoding Initiative (TEI) and the coreference annotation guidelines from the Message Understanding Conference (***** MUC *****). | ||
| C18-1004 Our experiments show that a standard hierarchical clustering using the scores produces state-of-art results with ***** MUC ***** and B 3 metrics on the English portion of CoNLL 2012 Shared Task | ||
| L04-1245 We survey the evaluation methodology adopted in Information Extraction (IE), as defined in the ***** MUC ***** conferences and in later independent efforts applying machine learning to IE. | ||
| CODA | 10 | |
| L14-1214 In this paper, we present a conventional orthography for Tunisian Arabic, following a previous effort on developing a conventional orthography for Dialectal Arabic (or ***** CODA *****) demonstrated for Egyptian Arabic. | ||
| 2020.lrec-1.508 In this paper, we present the MADAR ***** CODA ***** Corpus, a collection of 10,000 sentences from five Arabic city dialects (Beirut, Cairo, Doha, Rabat, and Tunis) represented in the Conventional Orthography for Dialectal Arabic (***** CODA *****) in parallel with their raw original form. | ||
| L12-1328 We explain the design principles of ***** CODA ***** and provide a detailed description of its guidelines as applied to Egyptian Arabic. | ||
| 2020.wanlp-1.16 The annotation performed are text classification, tokenization, PoS tagging and encoding of Tunisian Arabizi into ***** CODA ***** Arabic orthography. | ||
| L10-1079 We describe the construction of the ***** CODA ***** corpus, a parallel corpus of monologues and expository dialogues. | ||
| aforementioned | 10 | |
| 2020.cmlc-1.5 This paper outlines the details of our first draft of ***** aforementioned ***** architecture. | ||
| L06-1366 Our baseline approach to geo-referencing is based on application of ***** aforementioned ***** resources and a lightweight co-referencing technique which utilizes string-similarity metric of Jaro-Winkler. | ||
| D19-1310 To address ***** aforementioned ***** problems, not only do we model each table cell considering other records in the same row, we also enrich table's representation by modeling each table cell in context of other cells in the same column or with historical (time dimension) data respectively. | ||
| 2020.coling-main.179 To address ***** aforementioned ***** problems, we propose TableGPT for table-to-text generation. | ||
| 2021.acl-tutorials.2 This tutorial will provide audience with a systematic introduction of (i) knowledge representations of events, (ii) various methods for automated extraction, conceptualization and prediction of events and their relations, (iii) induction of event processes and properties, and (iv) a wide range of NLU and commonsense understanding tasks that benefit from ***** aforementioned ***** techniques | ||
| quantity | 10 | |
| W19-3607 We also can conclude data ***** quantity *****, performance of POS tagger and data quality highly affects the keystroke savings. | ||
| 2020.lrec-1.322 Using manually created lexical analysis and rich annotation (instead of high data ***** quantity *****) allows for an automated creation of AAC communication boards. | ||
| N19-1389 We critically survey the existing literature and report experiments on eight languages, comparing systems spanning all categories of proposed normalization techniques, analysing the effect of training data ***** quantity *****, and using different evaluation methods. | ||
| 2020.emnlp-main.746 However, the increased emphasis on data ***** quantity ***** has made it challenging to assess the quality of data. | ||
| 2020.lrec-1.666 In addition to the context information captured at each word position, we incorporate a new ***** quantity ***** of context information jump to facilitate the attention weight formulation | ||
| corrections | 10 | |
| 2020.lrec-1.835 As a complementary new resource for these tasks, we present the GitHub Typo Corpus, a large-scale, multilingual dataset of misspellings and grammatical errors along with their ***** corrections ***** harvested from GitHub, a large and popular platform for hosting and sharing git repositories. | ||
| D18-1195 Our algorithm learns a semantic parser from users' ***** corrections ***** such as “no, what I really meant was before his job, not after”, by also simultaneously learning to parse this natural language feedback in order to leverage it as a form of supervision. | ||
| N19-5003 These techniques include vector autoregressive models, multiple comparisons ***** corrections ***** for hypothesis testing, and causal inference. | ||
| W19-5919 This is important since ***** corrections ***** to log dialogues provide a means to improve performance after deployment. | ||
| W18-6452 The task consists in automatically correcting the output of a “black-box” machine translation system by learning from human ***** corrections ***** | ||
| interpolating | 10 | |
| W16-4808 By feeding the model Bible translations in a thousand languages, not only does the learned vector space capture language similarity, but by ***** interpolating ***** between the learned vectors it is possible to generate text in unattested intermediate forms between the training languages. | ||
| 2020.coling-main.305 It has shown strong effectiveness in image classification by ***** interpolating ***** images at the pixel level. | ||
| 2020.nli-1.3 Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by ***** interpolating ***** a global LSTM-based authoring model with a user-personalized n-gram model. | ||
| 2020.acl-main.176 We find that perplexity of neural LMs is strongly and differentially associated with lexical frequency, and that using a mixture model resulting from ***** interpolating ***** control and dementia LMs improves upon the current state-of-the-art for models trained on transcript text exclusively. | ||
| 2020.emnlp-main.95 In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by ***** interpolating ***** sequences close to each other. | ||
| neural MT | 10 | |
| W18-6469 Results show better performances reached by the attention-only model over the recurrent one, significant improvement over the baseline when post-editing phrase-based MT output but degradation when applied to ***** neural MT ***** output. | ||
| 2020.acl-main.146 Although unnecessary for training ***** neural MT ***** models, word alignment still plays an important role in interactive applications of neural machine translation, such as annotation transfer and lexicon injection. | ||
| W17-1207 In-domain manual analysis shows that ***** neural MT ***** tends to improve both adequacy and fluency, for example, by being able to generate more natural translations instead of literal ones, choosing to the adequate target word when the source word has several translations and improving gender agreement. | ||
| D19-1085 In this paper, we propose a unified and discourse-aware ZP translation approach for ***** neural MT ***** models | ||
| D19-6506 In this paper, we investigate how different aspects of discourse context affect the performance of recent ***** neural MT ***** systems. | ||
| parallelize | 10 | |
| 2000.iwpt-1.37 Because of the nature of the parsing problem, unification-based parsers are hard to ***** parallelize *****. | ||
| D18-1244 However, existing dependency-based models either neglect crucial information (e.g., negation) by pruning the dependency trees too aggressively, or are computationally inefficient because it is difficult to ***** parallelize ***** over different tree structures. | ||
| D18-1043 It is also efficient, easy to ***** parallelize ***** on CPU and interpretable. | ||
| P18-1108 It is also easier to ***** parallelize ***** and much faster. | ||
| Q14-1009 We extend the method and show how to efficiently ***** parallelize ***** the algorithm on modern parallel computing platforms while preserving approximation guarantees | ||
| nuanced | 10 | |
| P19-1413 While intuitive, these approaches fall short of representing ***** nuanced ***** relations, needed for downstream tasks. | ||
| 2021.hcinlp-1.3 Prior research has highlighted the difficulty in creating language models that recognize ***** nuanced ***** toxicity such as microaggressions. | ||
| S17-1001 Our study offers a needed explanation for why analogy tests succeed and fail where they do and provides ***** nuanced ***** insight into the relationship between word distributions and the theoretical linguistic domains of syntax and semantics. | ||
| 2021.emnlp-main.485 Due to the inherent complexity of this type of data, caused by its dynamic (context evolves rapidly), ***** nuanced ***** (misinformation types are often ambiguous), and diverse (skewed, fine-grained, and overlapping categories) nature, it is imperative for an effective model to capture both the local and global context of the target domain. | ||
| 2021.bea-1.23 Results suggest that ***** nuanced ***** features such as the number of ambiguous medical terms help explain response process complexity beyond superficial item characteristics such as word count | ||
| integrate | 10 | |
| L16-1071 The Dictionaries division at Oxford University Press (OUP) is aiming to model, ***** integrate *****, and publish lexical content for 100 languages focussing on digitally under-represented languages. | ||
| 2021.acl-long.291 Most previous studies ***** integrate ***** cognitive language processing signals (e.g., eye-tracking or EEG data) into neural models of natural language processing (NLP) just by directly concatenating word embeddings with cognitive features, ignoring the gap between the two modalities (i.e., textual vs. cognitive) and noise in cognitive features. | ||
| W19-2907 There is now evidence that listeners can and do ***** integrate ***** cues that occur far apart in time. | ||
| 2020.lrec-1.111 Methods from linguistics and from computer vision research ***** integrate ***** into a mixed methods system, with benefits on both sides. | ||
| D19-5322 Additionally, DBee aims to facilitate the construction of KGs and VSs, by providing a library of generators, which can be used to create, ***** integrate ***** and transform data into KGs and VSs | ||
| minimizes | 10 | |
| 2020.acl-main.304 We propose two novel KD methods based on structure-level information: (1) approximately ***** minimizes ***** the distance between the student's and the teachers' structure-level probability distributions, (2) aggregates the structure-level knowledge to local distributions and ***** minimizes ***** the distance between two local probability distributions. | ||
| 2021.acl-long.568 Furthermore, we empirically show that pre-training implicitly ***** minimizes ***** intrinsic dimension and, perhaps surprisingly, larger models tend to have lower intrinsic dimension after a fixed number of pre-training updates, at least in part explaining their extreme effectiveness. | ||
| D18-1383 Our approach explicitly ***** minimizes ***** the distance between the source and the target instances in an embedded feature space. | ||
| N19-1025 The Wasserstein objective ***** minimizes ***** the distance between marginal distribution and the prior directly and therefore does not force the posterior to match the prior. | ||
| 2021.naacl-main.442 Simultaneously, the model ***** minimizes ***** the similarity between the latter representation and the representation of a random sentence with the same context | ||
| ours | 10 | |
| W17-5009 Compared to previous data resources supporting humor recognition research, ***** ours ***** has several advantages, including (a) both positive and negative instances coming from a homogeneous data set, (b) containing a large number of speakers, and (c) being open. | ||
| 2014.amta-researchers.24 To the best of our knowledge, ***** ours ***** is the first effective translation system for the first two of these languages. | ||
| 2021.wmt-1.40 In the official evaluation of the test set, ***** ours ***** is ranked 2nd in terms of BLEU scores. | ||
| L12-1515 Compared to prior models, ***** ours ***** is a novel synthesis of the notions of goal, plan, intention, outcome, affect and time that is amenable to corpus annotation. | ||
| 2021.ranlp-1.39 To the best of our knowledge, ***** ours ***** is the first study to present fact-infusion as a novel form of question paraphrasing | ||
| commonly | 10 | |
| Q18-1015 Different from ***** commonly ***** studied named entity KBs such as Freebase, generics KBs involve quantification, have more complex underlying regularities, tend to be more incomplete, and violate the ***** commonly ***** used locally closed world assumption (LCWA). | ||
| W18-6319 I quantify this variation, finding differences as high as 1.8 between ***** commonly ***** used configurations. | ||
| C18-1231 We further qualitatively analyze these metrics and our findings show that apart from being less interpretable and non-deterministic, GLEU also produces counter-intuitive scores in ***** commonly ***** occurring test examples. | ||
| 2021.emnlp-main.532 We further design four types of strategies for creating negative samples, to resemble errors made ***** commonly ***** by two state-of-the-art models, BART and PEGASUS, found in our new human annotations of summary errors. | ||
| W17-4208 The revision logs contain data that can help reveal the requirements of quality journalism such as the types and number of edit operations and aspects ***** commonly ***** focused in revision | ||
| elicit | 10 | |
| 2019.icon-1.4 We are able to produce poems that correctly ***** elicit ***** the emotions of sadness and joy 87.5 and 85 percent, respectively, of the time. | ||
| 2020.emnlp-main.346 We also show that our prompts ***** elicit ***** more accurate factual knowledge from MLMs than the manually created prompts on the LAMA benchmark, and that MLMs can be used as relation extractors more effectively than supervised relation extraction models. | ||
| 2020.inlg-1.45 Our results show that different kinds of errors ***** elicit ***** significantly different evaluation scores, even though all erroneous descriptions differ in only one character from the reference descriptions. | ||
| 2020.coling-main.307 While internalized “implicit knowledge” in pretrained transformers has led to fruitful progress in many natural language understanding tasks, how to most effectively ***** elicit ***** such knowledge remains an open question. | ||
| 2021.nodalida-main.6 Our results show that, against our expectations, professional translations ***** elicit ***** higher perplexity scores from a target language model than students' translations | ||
| semantic vector | 10 | |
| W19-5356 We propose WMDO, a metric based on distance between distributions in the ***** semantic vector ***** space. | ||
| S17-2016 We generated ***** semantic vector ***** of every sentence by max pooling every dimension of their word vectors. | ||
| D19-1440 Using a detailed quantitative and qualitative analysis, we demonstrate that these data sources have complementary semantic aspects, supporting the creation of explicit ***** semantic vector ***** spaces. | ||
| P19-1086 We evaluate a large number of strong baselines on SherLIiC, ranging from ***** semantic vector ***** space models to state of the art neural models of natural language inference (NLI). | ||
| W16-5303 We propose a model for regular polysemy detection that is based on sense vectors and allows to work directly with senses in ***** semantic vector ***** space | ||
| solve | 10 | |
| D19-5401 Usually, dialogue systems ***** solve ***** this issue by using template-based language generation. | ||
| 2020.emnlp-main.105 Typically, machine learning systems ***** solve ***** new tasks by training on thousands of examples. | ||
| 2021.emnlp-main.469 Meta-learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately ***** solve ***** new tasks. | ||
| 2021.acl-long.27 Existing methods ***** solve ***** this problem by associating aspect terms with pivot words (we call this passive domain adaptation because the transfer of aspect terms relies on the links to pivots). | ||
| 2021.acl-long.186 DynaSent has a total of 121,634 sentences, each validated by five crowdworkers, and its development and test splits are designed to produce chance performance for even the best models we have been able to develop; when future models ***** solve ***** this task, we will use them to create DynaSent version 2, continuing the dynamic evolution of this benchmark | ||
| introduced | 10 | |
| 2014.amta-researchers.9 The weights of ***** introduced ***** features are tuned to optimize the sentence- and document-level metrics simultaneously on the basis of Pareto optimality. | ||
| N19-1372 Finally, we analyze the effectiveness of the ***** introduced ***** models in detail. | ||
| L14-1127 Based on the ***** introduced ***** methodology, we compute a matrix of Romance languages intelligibility. | ||
| 2021.emnlp-main.788 We validate our approach and illustrate a case study to show the usefulness of the ***** introduced ***** measure. | ||
| P19-1478 By fine-tuning the BERT language model both on the ***** introduced ***** and on the WSCR dataset, we achieve overall accuracies of 72.5% and 74.7% on WSC273 and WNLI, improving the previous state-of-the-art solutions by 8.8% and 9.6%, respectively | ||
| proposing | 10 | |
| 2020.codi-1.13 With the goal of analyzing and pruning the parameter-heavy self-attention mechanism, there are multiple approaches ***** proposing ***** more parameter-light self-attention alternatives. | ||
| C18-1283 We conclude with ***** proposing ***** avenues for future NLP research on automated fact checking. | ||
| 2020.coling-main.63 We propose a hierarchical encoder-tagger model (HET) to generate summaries by identifying important utterances (with respect to problem ***** proposing ***** and solving) in the conversations. | ||
| W18-3410 In this work we extend this line of research, ***** proposing ***** effective, low-rank and low-rank plus diagonal matrix parametrizations for Passthrough Networks which exploit this decoupling property, reducing the data complexity and memory requirements of the network while preserving its memory capacity. | ||
| 2012.iwslt-papers.2 From the many different possible paralinguistic features to handle, in this paper we chose duration and power as a first step, ***** proposing ***** a method that can translate these features from input speech to the output speech in continuous space | ||
| unifying | 10 | |
| L16-1157 We use an interoperable scheme ***** unifying ***** discourse phenomena in both frameworks into more abstract categories and considering only those phenomena that have a direct match in German and Czech. | ||
| 2021.emnlp-demo.10 Our application provides a seamless, ***** unifying ***** interface with which to visualise, manipulate and analyse semantically parsed graph data represented in a JSON-based serialisation format. | ||
| 2020.acl-main.660 It has been exactly a decade since the first establishment of SPMRL, a research initiative ***** unifying ***** multiple research efforts to address the peculiar challenges of Statistical Parsing for Morphologically-Rich Languages (MRLs). | ||
| 2021.emnlp-main.292 To emulate this setting, we construct a new benchmark, called BeerQA, by combining existing one- and two-step datasets with a new collection of 530 questions that require three Wikipedia pages to answer, ***** unifying ***** Wikipedia corpora versions in the process. | ||
| 2020.coling-main.310 At the same time, we further attempt to prevent data leakage when ***** unifying ***** multiple datasets which, arguably, is more useful in an industry setting | ||
| dialogical | 10 | |
| W17-5102 Premises are annotated with the three types of persuasive modes: ethos, logos, pathos, while claims are labeled as interpretation, evaluation, agreement, or disagreement, the latter two designed to account for the ***** dialogical ***** nature of our corpus. | ||
| 2021.reinact-1.6 In this paper we will argue that the nature of dogwhistle communication is essentially ***** dialogical *****, and that to account for dogwhistle meaning we must consider ***** dialogical ***** events in which dialogue partners can draw different conclusions based on communicative events. | ||
| W19-5937 The main aim of this paper is to provide a characterization of the response space for questions using a taxonomy grounded in a ***** dialogical ***** formal semantics. | ||
| W19-5938 In a crowd-sourcing experiment, we investigated three different annotation tasks, each in a collaborative ***** dialogical ***** (two annotators) and monological (one annotator) setting. | ||
| L16-1617 By determining the argumentative and ***** dialogical ***** structures contained within a debate, we are able to determine the issues which are divisive and those which attract agreement | ||
| combining | 10 | |
| 2020.aespen-1.10 Thus, interdisciplinary approaches are necessary, ***** combining ***** experts of both social and computer science. | ||
| N18-6006 From a research perspective, the design of spoken dialogue systems provides a number of significant challenges, as these systems depend on: a) solving several difficult NLP and decision-making tasks; and b) ***** combining ***** these into a functional dialogue system pipeline. | ||
| 2021.conll-1.37 We introduce an optimization method for learning angles in limited ranges of polar coordinates, which ***** combining ***** a loss function controlling gradient and distribution uniformization. | ||
| 2006.amta-papers.24 Several experiments were conducted ***** combining ***** linguistic and statistical methods, and manual evaluation was conducted for a set of 460 Chinese sentences. | ||
| 2021.semeval-1.120 Although experimental evidence indicates higher effectiveness of the first approach than the second one, ***** combining ***** them leads to our best results of 70.77 F1-score on the test dataset | ||
| thus | 10 | |
| L12-1133 The benefits of ***** thus ***** improving a language resource such as wordnet become self-evident. | ||
| 2021.eacl-main.305 However, keeping track of the constantly changing legislation is difficult, ***** thus ***** organizations are increasingly adopting Regulatory Technology (RegTech) to facilitate the process. | ||
| L14-1120 Bilingual dictionaries define word equivalents from one language to another, ***** thus ***** acting as an important bridge between languages. | ||
| 2021.acl-long.516 However, it is with the advent of the Internet and the social media that propaganda has started to spread on a much larger scale than before, ***** thus ***** becoming a major societal and political issue. | ||
| 2021.rocling-1.1 Modern approaches to Constituency Parsing are mono-lingual supervised approaches which require large amount of labelled data to be trained on, ***** thus ***** limiting their utility to only a handful of high-resource languages | ||
| resulting | 10 | |
| P17-1188 Additionally, qualitative analyses demonstrate that our proposed model learns to focus on the parts of characters that carry topical content which ***** resulting ***** in embeddings that are coherent in visual space. | ||
| P17-2025 Finally, we show that explicit model combination can improve performance even further, ***** resulting ***** in new state-of-the-art numbers on the PTB of 94.25 F1 when training only on gold data and 94.66 F1 when using external data. | ||
| P18-1205 We collect data and train models to (i)condition on their given profile information; and (ii) information about the person they are talking to, ***** resulting ***** in improved dialogues, as measured by next utterance prediction. | ||
| L10-1501 As a consequence, many concepts, theories and scientific methods get in contact with each other, ***** resulting ***** in many different strategies and variants of acquiring, structuring, and sharing data sets. | ||
| 2021.inlg-1.13 Our experiments demonstrate improved handling of such cyclic cases in ***** resulting ***** graphs | ||
| informational | 10 | |
| W18-0505 Traditional readability metrics have the additional drawback of not generalizing to ***** informational ***** texts such as science. | ||
| 2021.louhi-1.3 In this work, we build four machine learning models to measure the extent of the following social supports expressed in each post in a COVID-19 online forum: (a) emotional support given (b) emotional support sought (c) ***** informational ***** support given, and (d) ***** informational ***** support sought. | ||
| D19-1664 We first produce a new dataset, BASIL, of 300 news articles annotated with 1,727 bias spans and find evidence that ***** informational ***** bias appears in news articles more frequently than lexical bias. | ||
| L08-1450 Data models and encoding formats for syntactically annotated text corpora need to deal with syntactic ambiguity; underspecified representations are particularly well suited for the representation of ambiguous data because they allow for high ***** informational ***** efficiency. | ||
| D17-1164 Questions play a prominent role in social interactions, performing rhetorical functions that go beyond that of simple *****informational***** exchange. | ||
| Uppsala | 10 | |
| K17-3022 We present the ***** Uppsala ***** submission to the CoNLL 2017 shared task on parsing from raw text to universal dependencies. | ||
| W17-4805 We describe the ***** Uppsala ***** system for the 2017 DiscoMT shared task on cross-lingual pronoun prediction. | ||
| K18-2011 We present the ***** Uppsala ***** system for the CoNLL 2018 Shared Task on universal dependency parsing. | ||
| 2020.wmt-1.58 This paper describes the joint submission of the University of Edinburgh and ***** Uppsala ***** University to the WMT'20 chat translation task for both language directions (English-German). | ||
| L16-1374 The Persian UD is the converted version of the ***** Uppsala ***** Persian Dependency Treebank (UPDT) to the universal dependencies framework and consists of nearly 6,000 sentences and 152,871 word tokens with an average sentence length of 25 words | ||
| deceptive | 10 | |
| 2020.acl-main.432 We call the latter use of attention mechanisms into question by demonstrating a simple method for training models to produce ***** deceptive ***** attention masks. | ||
| C16-1014 Finally, the document representation is used directly as features to identify ***** deceptive ***** opinion spam. | ||
| R17-1102 Finally, reviewer level evaluation gives an interesting insight into different ***** deceptive ***** reviewers' writing styles. | ||
| 2020.stoc-1.6 We carried out a research experiment on the Deceptive Opinion Spam corpus, a balanced corpus composed of 1,600 hotel reviews of 20 Chicago hotels split into four datasets: positive truthful, negative truthful, positive ***** deceptive ***** and negative ***** deceptive ***** reviews. | ||
| 2020.inlg-1.27 Extending earlier studies, we demonstrate that, when conditioned on a topic, ***** deceptive ***** content is shorter, less readable, more biased, and more subjective than credible content, and transferring the style from ***** deceptive ***** to credible content is more challenging than the opposite direction | ||
| Bulgarian | 10 | |
| L04-1171 The advantages of our approach to a less-spoken language (like ***** Bulgarian *****) are as follows: it triggers the creation of the basic set of language resources which lack for certain languages and it rises the question about the ways of language resources creation. | ||
| 2021.nlp4if-1.12 Task 1 focused on fighting the COVID-19 infodemic in social media, and it was offered in Arabic, ***** Bulgarian *****, and English. | ||
| 2021.ranlp-srw.30 In this paper, we demonstrate different methods for adapting NLP tools for English and other languages to a low resource language like ***** Bulgarian *****. | ||
| W19-3711 Our only submitted run is based on a voting schema using multiple models, one for each of the four languages of the task (***** Bulgarian *****, Czech, Polish, and Russian) and another for English. | ||
| W17-7804 The paper presents part of an ongoing project of the Laboratory for Language Technologies of New ***** Bulgarian ***** University – “An e-Platform for Language Teaching (PLT)” – the development of corpus-based teaching content for Business English courses | ||
| GermEval | 10 | |
| L14-1251 The data is released under the permissive CC-BY license, and will be fully available for download in September 2014 after it has been used for the ***** GermEval ***** 2014 shared task on NER. | ||
| D18-1139 We conduct experiments with different neural architectures and word representations on the recent ***** GermEval ***** 2017 dataset. | ||
| P18-2020 BiLSTMs profit substantially from transfer learning, which enables them to be trained on multiple corpora, resulting in a new state-of-the-art model for German NER on two contemporary German corpora (CoNLL 2003 and ***** GermEval ***** 2014) and two historic corpora. | ||
| 2021.germeval-1.11 To solve the fact claiming task, we fine-tuned these transformers with external data and the data provided by the ***** GermEval ***** task organizers. | ||
| 2021.germeval-1.1 Building on the two previous ***** GermEval ***** shared tasks on the identification of offensive language in 2018 and 2019, we extend this year's task definition to meet the demand of moderators and community managers to also highlight comments that foster respectful communication, encourage in-depth discussions, and check facts that lines of arguments rely on | ||
| Java | 10 | |
| L14-1637 These include (i) the type of phenomena annotated (either morphosyntactic, syntactic, semantic, etc.); (ii) how these phenomena are annotated (e.g., the particular guidelines and/or schema used to encode the annotations); and (iii) the languages (***** Java *****, C++, etc.) and technologies (as standalone programs, as APIs, as web services, etc.) used to develop them. | ||
| 1997.mtsummit-papers.21 A requirements analysis for a generic Natural Language Processing and Machine Translation tool is undertaken to consider how ***** Java ***** could be used, and subsequently two example systems developed in ***** Java ***** (which can be accessed on the Internet) are introduced. | ||
| L12-1256 The resulting LSR, UBY (Gurevych et al., 2012), holds interoperable versions of all nine resources which can be queried by an easy to use public ***** Java ***** API. | ||
| C16-2060 PKUSUMSUM is a *****Java***** platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks. | ||
| 2020.wosp-1.3 We introduce SmartCiteCon (SCC), a *****Java***** API for extracting both explicit and implicit citation context from academic literature in English. | ||
| Hateful | 10 | |
| 2020.alw-1.3 Here, four deep learners based on the Bidirectional Encoder Representations from Transformers (BERT), with either general or domain-specific language models, were tested against two datasets containing tweets labelled as either `***** Hateful *****', `Normal' or `Offensive'. | ||
| 2021.acl-long.132 ***** Hateful ***** entries make up 54% of the dataset, which is substantially higher than comparable datasets. | ||
| 2020.alw-1.13 *****Hateful***** rhetoric is plaguing online discourse, fostering extreme societal movements and possibly giving rise to real-world violence. | ||
| 2021.woah-1.4 *****Hateful***** memes pose a unique challenge for current machine learning systems because their message is derived from both text- and visual-modalities. | ||
| 2021.woah-1.24 The Shared Task on *****Hateful***** Memes is a challenge that aims at the detection of hateful content in memes by inviting the implementation of systems that understand memes, potentially by combining image and textual information. | ||
| IWPT 2021 Shared | 10 | |
| 2021.iwpt-1.19 This paper presents the system used in our submission to the ***** IWPT 2021 Shared ***** Task. | ||
| 2021.iwpt-1.21 This paper presents our multilingual dependency parsing system as used in the ***** IWPT 2021 Shared ***** Task on Parsing into Enhanced Universal Dependencies. | ||
| 2021.iwpt-1.18 This paper describes a system proposed for the ***** IWPT 2021 Shared ***** Task on Parsing into Enhanced Universal Dependencies (EUD). | ||
| 2021.iwpt-1.24 We describe the NUIG solution for ***** IWPT 2021 Shared ***** Task of Enhanced Dependency (ED) parsing in multiple languages | ||
| 2021.iwpt-1.20 This paper describes the system used in our submission to the *****IWPT 2021 Shared***** Task. | ||
| cybersecurity | 10 | |
| 2020.sustainlp-1.15 We have evaluated this technique on nine datasets across diverse domains, including newswire, user forums, air flight booking, ***** cybersecurity ***** news, etc. | ||
| R19-1128 In this paper we discuss the named entity recognition task for Russian texts related to ***** cybersecurity *****. | ||
| S18-1113 This paper describes the SemEval 2018 shared task on semantic extraction from ***** cybersecurity ***** reports, which is introduced for the first time as a shared task on SemEval. | ||
| 2021.acl-long.296 To defend against this attack that can cause significant harm, in this paper, we borrow the “honeypot” concept from the ***** cybersecurity ***** community and propose DARCY, a honeypot-based defense framework against UniTrigger. | ||
| P17-1143 Despite the severity of the problem, there has been few NLP efforts focused on tackling ***** cybersecurity ***** | ||
| emotive | 10 | |
| 2018.gwc-1.18 The paper presents an approach to building a very large ***** emotive ***** lexicon for Polish based on plWordNet. | ||
| 2020.peoples-1.15 We show significant and consistent improvements in automatic classification across all languages and topics, as well as consistent (and expected) emotion distributions across all languages and topics, proving for the manually corrected lexicons to be a useful addition to the severely lacking area of emotion lexicons, the crucial resource for ***** emotive ***** analysis of text. | ||
| L08-1158 We evaluate the accuracy of placing newly found ***** emotive ***** words in one or more of the defined semantic dimensions. | ||
| 2019.gwc-1.43 In this paper we present a novel method for ***** emotive ***** propagation in a wordnet based on a large ***** emotive ***** seed. | ||
| 2015.lilt-12.3 Building upon techniques designed to analyze style and sentiment in texts, we examine elements of poetic craft such as imagery, sound devices, ***** emotive ***** language, and diction | ||
| efficient | 10 | |
| 2020.coling-main.597 We show that DILOG is 100x more data ***** efficient ***** than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. | ||
| 2021.insights-1.19 Semi-Supervised Variational Autoencoders (SSVAEs) are widely used models for data ***** efficient ***** learning. | ||
| 2021.naacl-main.405 Alternatives include energy-based models (which give up ***** efficient ***** sampling) and latent-variable autoregressive models (which give up ***** efficient ***** scoring of a given string). | ||
| 2021.naacl-demos.15 To overcome these problems, in this demo, we present Alexa Conversations, a new approach for building goal-oriented dialogue systems that is scalable, extensible as well as data ***** efficient *****. | ||
| N18-2057 Our approach is simple, ***** efficient ***** and has the benefit of being robust to semantic drift, a dominant problem in most semi-supervised learning systems | ||
| transformational | 10 | |
| 1998.amta-papers.30 We present a newly designed ***** transformational ***** system for the MT system LMT, consisting of a ***** transformational ***** formalism, LMT-TL, and an algorithm for applying transformations written in this formalism. | ||
| L14-1202 Furthermore, it is based on the notion of morphological tree structure and uses ***** transformational ***** rules which are attached to the leaf nodes. | ||
| 2021.cl-1.2 Abstract Steedman (2020) proposes as a formal universal of natural language grammar that grammatical permutations of the kind that have given rise to ***** transformational ***** rules are limited to a class known to mathematicians and computer scientists as the “separable” permutations. | ||
| L06-1240 MORPHEs morphology description language is based on two constructs: 1) a morphological form hierarchy, whose nodes relate and differentiate surface forms in terms of the common and distinguishing inflectional features of lexical items; and 2) ***** transformational ***** rules, attached to leaf nodes of the hierarchy, which generate the surface form of an item from the base form stored in the lexicon. | ||
| 1993.iwpt-1.3 In this paper we describe a new technique for parsing free text: a *****transformational***** grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees. | ||
| Lexical Complexity | 10 | |
| 2021.semeval-1.13 In this paper, we present our contribution in SemEval-2021 Task 1: *****Lexical Complexity***** Prediction, where we integrate linguistic, statistical, and semantic properties of the target word and its context as features within a Machine Learning (ML) framework for predicting lexical complexity. | ||
| 2021.semeval-1.74 This paper describes our submission to the SemEval-2021 shared task on *****Lexical Complexity***** Prediction. | ||
| 2021.semeval-1.66 This paper describes our contribution to SemEval 2021 Task 1 (Shardlow et al., 2021): *****Lexical Complexity***** Prediction. | ||
| 2021.semeval-1.1 This paper presents the results and main findings of SemEval-2021 Task 1 - *****Lexical Complexity***** Prediction. | ||
| 2021.semeval-1.89 This paper presents our system for the single- and multi-word lexical complexity prediction tasks of SemEval Task 1: *****Lexical Complexity***** Prediction. | ||
| Clinical TempEval | 10 | |
| W18-5607 The complexity of temporal representation in language is evident as results of the 2016 ***** Clinical TempEval ***** challenge indicate: the current state-of-the-art systems perform well in solving mention-identification tasks of event and time expressions but poorly in temporal relation extraction, showing a gap of around 0.25 point below human performance. | ||
| S17-2176 ***** Clinical TempEval ***** 2017 addressed the problem of temporal reasoning in the clinical domain by providing annotated clinical notes, pathology and radiology reports in line with ***** Clinical TempEval ***** challenges 2015/16, across two different evaluation phases focusing on cross domain adaptation. | ||
| S17-2093 Most tasks observed about a 20 point drop over ***** Clinical TempEval ***** 2016, where systems were trained and evaluated on the same domain (colon cancer) | ||
| S17-2180 *****Clinical TempEval***** 2017 (SemEval 2017 Task 12) addresses the task of cross-domain temporal extraction from clinical text. | ||
| S17-2181 In this paper, we describe the system of the KULeuven-LIIR submission for *****Clinical TempEval***** 2017. | ||
| objective | 10 | |
| 2021.acl-short.115 To be able to use BERTScore as a training ***** objective *****, we propose three approaches for generating soft predictions, allowing the network to remain completely differentiable end-to-end. | ||
| 2021.emnlp-main.466 When two related training examples share internal substructure, we add an additional training ***** objective ***** to encourage consistency between their latent decisions. | ||
| D19-5402 However, the existing models in this framework mostly rely on sentence-level rewards or suboptimal labels, causing a mismatch between a training ***** objective ***** and evaluation metric. | ||
| P19-1296 The proposed agreement module can be integrated into NMT as an additional training ***** objective ***** function and can also be used to enhance the representation of the source sentences | ||
| 2010.amta-papers.28 Although the scoring features of state-of-the-art Phrase-Based Statistical Machine Translation (PB-SMT) models are weighted so as to optimise an *****objective***** function measuring translation quality, the estimation of the features themselves does not have any relation to such quality metrics. | ||
| item | 10 | |
| 2021.emnlp-main.67 While prior work has addressed `cold start' estimation of ***** item ***** difficulties without piloting, we devise a multi-task generalized linear model with BERT features to jump-start these estimates, rapidly improving their quality with as few as 500 test-takers and a small sample of ***** item ***** exposures (6 each) from a large ***** item ***** bank (4,000 ***** item *****s). | ||
| N19-1018 This change makes it possible to construct any discontinuous constituency tree in exactly 4n–2 transitions for a sentence of length n. At each parsing step, the parser considers every ***** item ***** in the set to be combined with a focus ***** item ***** and to construct a new constituent in a bottom-up fashion. | ||
| C18-1286 We apply factorization machines, a widely used method in ***** item ***** recommendation, to model user preferences toward topics from the social media data. | ||
| C16-1306 We show that a system based on multiple text complexity features can predict ***** item ***** difficulty for several different ***** item ***** types and for some ***** item *****s achieves higher accuracy than human estimates of ***** item ***** difficulty. | ||
| 2020.bea-1.20 The results indicate that, for our sample, transfer learning can improve the prediction of ***** item ***** difficulty when response time is used as an auxiliary task but not the other way around | ||
| consistent | 10 | |
| 2021.acl-long.87 We show that our method has ***** consistent ***** improvement across datasets, fine-tuning tasks, and language model architectures. | ||
| 2021.acl-short.21 This question is challenging to answer in general, as there is no clear line between meaning and form, but rather meaning constrains form in ***** consistent ***** ways. | ||
| D18-1338 We propose a simple modification to the attention mechanism that eases the optimization of deeper models, and results in ***** consistent ***** gains of 0.7-1.1 BLEU on the benchmark WMT'14 English-German and WMT'15 Czech-English tasks for both architectures. | ||
| 2021.socialnlp-1.13 Drawing on these ***** consistent ***** differences, we build a classifier that can reliably identify the people more likely to persist, based on their language. | ||
| I17-5001 In this tutorial, we will give a review of each line of work, by contrasting them with traditional statistical methods, and organizing them in ***** consistent ***** orders | ||
| Tamil | 10 | |
| 2021.ltedi-1.29 The data for this shared task is provided in English, ***** Tamil *****, and Malayalam which was collected from YouTube comments. | ||
| 2021.ltedi-1.25 We propose three distinct models to identify hope speech in English, ***** Tamil ***** and Malayalam language to serve this purpose. | ||
| 2020.peoples-1.5 Thus, we have constructed a Hope Speech dataset for Equality, Diversity and Inclusion (HopeEDI) containing user-generated comments from the social media platform YouTube with 28,451, 20,198 and 10,705 comments in English, ***** Tamil ***** and Malayalam, respectively, manually labelled as containing hope speech or not. | ||
| 2021.ltedi-1.12 The datasets were provided by the LT-EDI organisers in English, ***** Tamil *****, and Malayalam language with texts sourced from YouTube comments. | ||
| 2021.ltedi-1.14 The YouTube comments are available in English, ***** Tamil ***** and Malayalam languages and are part of the task “EACL-2021:Hope Speech Detection for Equality, Diversity and Inclusion” | ||
| contextualized representations | 10 | |
| 2020.emnlp-main.125 We address the phrase alignment problem by combining an unordered tree mapping algorithm and phrase representation modelling that explicitly embeds the similarity distribution in the sentences onto powerful ***** contextualized representations *****. | ||
| 2021.emnlp-main.440 As such, a novel filtering mechanism is presented to facilitate the learning of word category representations from ***** contextualized representations ***** on input texts based on adversarial learning. | ||
| D19-1053 Our findings suggest that metrics combining ***** contextualized representations ***** with a distance measure perform the best. | ||
| 2020.coling-main.371 Our proposed model leverages effective word embeddings trained on one hundred different languages to generate ***** contextualized representations *****. | ||
| 2021.acl-short.96 In this paper, we combine ***** contextualized representations ***** with neural topic models. | ||
| unsupervised methods | 10 | |
| W18-3108 We compare supervised and ***** unsupervised methods ***** to assign predefined categories at message level. | ||
| N18-2075 Previous work on text segmentation focused on ***** unsupervised methods ***** such as clustering or graph search, due to the paucity in labeled data. | ||
| 2021.emnlp-main.791 Compared to the ***** unsupervised methods *****, the supervised ones make less assumptions about optimization objectives and usually achieve better results. | ||
| W18-0909 Most of the ***** unsupervised methods ***** directed at detection of metaphors use some hand-coded knowledge. | ||
| 2020.acl-main.274 Results on standard benchmark demonstrate the effectiveness of our proposed method, which substantially outperforms previous ***** unsupervised methods *****. | ||
| corpus based | 10 | |
| 2020.winlp-1.1 Using PPMI with threshold value of 100 and 200, we got ***** corpus based ***** Amharic Sentiment lexicons of size 1811 and 3794 respectively by expanding 519 seeds. | ||
| L14-1474 This dictionary is built depending on WordNet and ***** corpus based ***** approaches, in a specially designed linguistic environment called UNLariam that is developed by the UNLD foundation. | ||
| W17-1202 In this work, we propose an information-theoretic approach to geographic language variation using a ***** corpus based ***** on Twitter. | ||
| W17-3208 Previous work has noted that sorting the ***** corpus based ***** on the sentence length before making mini-batches reduces the amount of padding and increases the processing speed. | ||
| 2020.alw-1.17 Second, we present a publicly available ***** corpus based ***** on our taxonomy, with 39.8k human annotated comments extracted from Reddit. | ||
| improving | 10 | |
| D18-1270 This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, ***** improving ***** upon individually trained models for each language. | ||
| 2021.acl-long.523 We show that Polyjuice produces diverse sets of realistic counterfactuals, which in turn are useful in various distinct applications: ***** improving ***** training and evaluation on three different tasks (with around 70% less annotation effort than manual generation), augmenting state-of-the-art explanation techniques, and supporting systematic counterfactual error analysis by revealing behaviors easily missed by human experts. | ||
| W19-6129 Finally we discuss the dataset in light of the results and point to future research and plans for further ***** improving ***** both the dataset and methods of predicting prosodic prominence from text. | ||
| Q15-1030 We suggest a method for determining the correct labels of the clustering outcomes, and then use the labels for voting, ***** improving ***** the accuracy even further. | ||
| 2005.mtsummit-papers.31 We show that the parts of a sentence that are automatically identified as nonmachine-translatable provide useful information for paraphrasing or revising the sentence in the source language, thus ***** improving ***** the quality of the final translation. | ||
| cases | 10 | |
| 2021.naacl-demos.12 It is designed to be a general-purpose tool with a wide variety of use ***** cases *****. | ||
| 2020.emnlp-main.502 We devise a complementary framework, under which a pattern-based and a distributional model collaborate seamlessly in ***** cases ***** which they each prefer. | ||
| N18-2084 We show that such embeddings can be surprisingly effective in some ***** cases ***** – providing gains of up to 20 BLEU points in the most favorable setting. | ||
| C16-1320 As an additional objective, we discuss two novel use ***** cases ***** including automatically extracting links to public datasets from the proceedings, which would further accelerate the advancement in digital libraries. | ||
| 2010.amta-commercial.5 In this paper, we discuss some of our further use ***** cases *****, and the varying requirements each use case has for quality, customization, cost, and other factors. | ||
| semantic categories | 10 | |
| 2021.eacl-main.120 We first construct a dataset of apparitions of lexical collocations in context, categorized into 17 representative ***** semantic categories *****. | ||
| L14-1009 This paper presents a logical formalization of a set of 20 ***** semantic categories ***** related to opinion, emotion and sentiment. | ||
| 2021.emnlp-main.467 Nevertheless, they share a common weakness: sentences in a contradiction pair are not necessarily from different ***** semantic categories *****. | ||
| 2021.naacl-main.427 Unsupervised clustering aims at discovering the ***** semantic categories ***** of data according to some distance measured in the representation space. | ||
| L06-1341 Finally, we evaluate the generalizability of STEP to other ***** semantic categories ***** on the example of the category of words denoting increase/decrease in magnitude, intensity or quality of some state or process. | ||
| reviewers | 10 | |
| 2020.findings-emnlp.112 We argue that a part of the problem is that the ***** reviewers ***** and area chairs face a poorly defined task forcing apples-to-oranges comparisons. | ||
| 2021.acl-long.30 Second, as ***** reviewers ***** often disagree on the pros and cons of a given product, summarizers sometimes yield inconsistent, self-contradicting summaries. | ||
| 2020.lrec-1.29 A common approach to this problem is to ask multiple ***** reviewers ***** to evaluate the same artifacts. | ||
| N19-1219 In this work, we study the content and structure of peer reviews under the argument mining framework, through automatically detecting (1) the argumentative propositions put forward by ***** reviewers *****, and (2) their types (e.g., evaluating the work or making suggestions for improvement). | ||
| P18-2079 As more and more academic papers are being submitted to conferences and journals, evaluating all these papers by professionals is time-consuming and can cause inequality due to the personal factors of the ***** reviewers *****. | ||
| sense embeddings | 10 | |
| W19-0421 In this work, we study the effect of multi-***** sense embeddings ***** on the task of reverse dictionaries. | ||
| K17-1012 We address this issue by proposing a new model which learns word and ***** sense embeddings ***** jointly. | ||
| D18-1025 This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual ***** sense embeddings ***** that align well in the vector space, where the cross-lingual signal in the English-Chinese parallel corpus is exploited to capture the collocation and distributed characteristics in the language pair. | ||
| W17-2613 We present a multi-view Bayesian non-parametric algorithm which improves multi-sense word embeddings by (a) using multilingual (i.e., more than two languages) corpora to significantly improve ***** sense embeddings ***** beyond what one achieves with bilingual information, and (b) uses a principled approach to learn a variable number of senses per word, in a data-driven manner. | ||
| L16-1421 Word ***** sense embeddings ***** represent a word sense as a low-dimensional numeric vector. | ||
| samples | 10 | |
| 2020.emnlp-main.95 Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one sentence, and Inter-LADA ***** samples ***** different sentences to interpolate. | ||
| L16-1741 Previously, a seniors' speech corpus named S-JNAS was developed, but the average age of the participants was 67.6 years, but the target age for nursing home care is around 75 years old, much higher than that of the S-JNAS ***** samples *****. | ||
| 2020.nl4xai-1.4 Weighted ***** samples ***** are used to overcome class imbalanced data. | ||
| W19-3015 Speech ***** samples ***** were obtained from healthy controls and patients with a diagnosis of schizophrenia or schizoaffective disorder and different severity of positive formal thought disorder. | ||
| 2020.acl-main.345 We use a state-of-the-art end-to-end ASR system, comprising convolutional and recurrent layers, that is trained on a large amount of US-accented English speech and evaluate the model on speech ***** samples ***** from seven different English accents. | ||
| spatial information | 10 | |
| L16-1121 Special features of this corpus are the laryngograph recordings (representing glottograms required to detect a speaker's instantaneous fundamental frequency and pitch), corresponding clean-speech recordings, and ***** spatial information ***** and video data provided by four Kinects and a camera. | ||
| 2021.acl-srw.9 For spatial features, we propose a hierarchical attention network to model the ***** spatial information ***** from object-level to video-level. | ||
| L16-1604 The resulting annotations indicate whether entities are or are not located somewhere with a degree of certainty, and temporally anchor this ***** spatial information *****. | ||
| W18-5405 The method is illustrated on a medical dataset where the correct representation of ***** spatial information ***** and shorthands are of particular importance. | ||
| L12-1649 A significant amount of ***** spatial information ***** in textual documents is hidden within the relationship between events. | ||
| syntactic constraints | 10 | |
| L12-1613 We present an approach to the description of Polish Multi-word Expressions (MWEs) which is based on expressions in the WCCL language of morpho-***** syntactic constraints ***** instead of grammar rules or transducers. | ||
| N19-1320 Our evaluator is an adversarially trained style discriminator with semantic and ***** syntactic constraints ***** that score the generated sentence for style, meaning preservation, and fluency. | ||
| 2020.conll-1.39 In this work, we probe for highly abstract ***** syntactic constraints ***** that have been claimed to govern the behavior of filler-gap dependencies across different surface constructions. | ||
| L08-1096 The surface realization, the surface level, is realized according to each language ***** syntactic constraints *****. | ||
| 1993.iwpt-1.10 The modified LR table reflects both the morphological and ***** syntactic constraints *****. | ||
| semantic hierarchy | 10 | |
| W19-8662 For this purpose, we turn input sentences into a two-layered ***** semantic hierarchy ***** in the form of core facts and accompanying contexts, while identifying the rhetorical relations that hold between them. | ||
| C16-1289 In order to accurately learn the ***** semantic hierarchy ***** of a bilingual phrase, we develop a recursive neural network to constrain the learned bilingual phrase structures to be consistent with word alignments. | ||
| 1998.amta-papers.18 In this paper, we address issues related to problems in building a ***** semantic hierarchy ***** from machine-readable dictionaries: genus disambiguation, discovery of covert categories, and bilingual taxonomy. | ||
| 2020.lrec-1.359 The wordnet is built using the `expansion' approach (Vossen, 1998), leveraging on the Princeton Wordnet's core synsets and ***** semantic hierarchy *****, as well as scientific names. | ||
| L16-1685 The lexical ***** semantic hierarchy ***** pioneered by Princeton Wordnet has traditionally restricted its coverage to referential and contentful classes of words: such as nouns, verbs, adjectives and adverbs. | ||
| messages | 10 | |
| L16-1200 The problem of understanding the stream of ***** messages ***** exchanged on social media such as Facebook and Twitter is becoming a major challenge for automated systems. | ||
| P19-1239 It is insufficient to detect sarcasm from multi-model ***** messages ***** based only on texts. | ||
| 2020.emnlp-main.512 Conversation disentanglement aims to separate intermingled ***** messages ***** into detached conversations. | ||
| 2020.emnlp-main.22 Linguistic steganography studies how to hide secret ***** messages ***** in natural language cover texts. | ||
| L12-1670 The primary data collection is done in the form of large number of ***** messages ***** as part of Personal communication among natives of Hindi language and Indian speakers of English. | ||
| facebook | 10 | |
| D19-1250 The code to reproduce our analysis is available at https://github.com/***** facebook *****research/LAMA. | ||
| D19-1632 Data and code to reproduce our experiments are available at https://github.com/***** facebook *****research/flores. | ||
| 2020.emnlp-main.519 Our code and models are available at https://github.com/***** facebook *****research/BLINK. | ||
| 2020.trac-1.14 In the last few years, hate speech and aggressive comments have covered almost all the social media platforms like ***** facebook *****, twitter etc. | ||
| W18-4408 Our system on English ***** facebook ***** and social media obtained F1 score of 0.5151 and 0.5099 respectively where Hindi ***** facebook ***** and social media obtained F1 score of 0.5599 and 0.3790 respectively. | ||
| knowledge management | 10 | |
| 2020.lrec-1.264 The research question of this paper is: How to address the resource bottleneck problem of creating specialist ***** knowledge management ***** systems? | ||
| L04-1173 One of the major obstacles for ***** knowledge management ***** remains MultiWord Terminology (MWT). | ||
| L08-1188 In this work we present a novel web browser extension which combines several features coming from the worlds of terminology and information extraction, semantic annotation and ***** knowledge management *****, to support users in the process of both keeping track of interesting information they find on the web, and organizing its associated content following knowledge representation standards offered by the Semantic Web | ||
| L06-1080 To meet a variety of needs in information modeling, software development and integration as well as ***** knowledge management ***** and reuse, various groups within industry, academia, and government have been developing and deploying sharable and reusable models known as ontologies. | ||
| L08-1271 An ontological ***** knowledge management ***** system requires dynamic and encapsulating operation in order to share knowledge among communities. | ||
| platforms | 10 | |
| W19-3502 Interactions among users on social network ***** platforms ***** are usually positive, constructive and insightful. | ||
| W17-8105 The E-platform integrates: 1/ an environment for creating, organizing and maintaining electronic text archives, for extracting text corpora and aligning corpora; 2/ a linguistic database; 3/ a concordancer; 4/ a set of modules for the generation and editing of practice exercises for each text or corpus; 5/ functionalities for export from the platform and import to other educational ***** platforms *****. | ||
| 2021.dravidianlangtech-1.25 Offensive language detection in the various social media ***** platforms ***** was identified previously. | ||
| L16-1258 Therefore, large amounts of self-assessed data on MBTI are readily available on social-media ***** platforms ***** such as Twitter. | ||
| 2021.acl-short.133 Among social media ***** platforms *****, Reddit has emerged as the most promising one due to its anonymity and its focus on topic-based communities (subreddits) that can be indicative of someone's state of mind or interest regarding mental health disorders such as r/SuicideWatch, r/Anxiety, r/depression. | ||
| headline | 10 | |
| S17-2148 This paper describes our system for fine-grained sentiment scoring of news ***** headline *****s submitted to SemEval 2017 task 5–subtask 2. | ||
| W18-4305 The evaluation result shows that 26% ***** headline *****s do not include health claims, and all extractors face difficulty separating them from the rest. | ||
| W19-8910 HEvAS provides two types of metrics– one which measures the informativeness of a ***** headline *****, and another that measures its readability. | ||
| 2020.aespen-1.7 The multi-task convolutional neural network is shown to be capable of recognizing events and event coreferences given the ***** headline *****s' texts and publication dates. | ||
| 2020.semeval-1.104 Various factors need to be accounted for in order to assess the funniness of an edited ***** headline *****. | ||
| online shopping | 10 | |
| 2021.ranlp-1.109 Online reviews are an essential aspect of ***** online shopping ***** for both customers and retailers. | ||
| 2021.ecnlp-1.18 The accuracy of an ***** online shopping ***** system via voice commands is particularly important and may have a great impact on customer trust. | ||
| 2021.naacl-industry.37 Item categorization is an important application of text classification in e-commerce due to its impact on the ***** online shopping ***** experience of users. | ||
| 2020.ecnlp-1.11 In recent years, there has been an increase in ***** online shopping ***** resulting in an increased number of online reviews. | ||
| 2021.eacl-main.197 For example, when using voice to search an ***** online shopping ***** site, a user often needs to refine their search by some aspect or facet. | ||
| situations | 10 | |
| 2020.acl-main.54 The goal-oriented dialogue system needs to be optimized for tracking the dialogue flow and carrying out an effective conversation under various ***** situations ***** to meet the user goal. | ||
| 2020.emnlp-main.592 Unfortunately, when scaling NER to open ***** situations *****, these advantages may no longer exist. | ||
| 2021.econlp-1.9 In decision making in the economic field, an especially important requirement is to rapidly understand news to absorb ever-changing economic ***** situations *****. | ||
| L12-1036 We enhanced this system in order to properly react on non-understandings in real-life ***** situations ***** where intuitive communication is required. | ||
| W18-6101 Sociolinguistics is often concerned with how variants of a linguistic item (e.g., nothing vs. nothin') are used by different groups or in different ***** situations *****. | ||
| speech act | 10 | |
| 2020.peoples-1.7 Furthermore, we propose contextual augmentation of pretrained language models for emotion recognition in conversations, which is to consider not only previous utterances, but also conversation-related information such as speakers, ***** speech act *****s and topics. | ||
| L12-1544 This collection focuses on satisfying user information needs for queries associated with specific types of ***** speech act *****s. | ||
| L08-1545 These lexical resources are in the form of sub-categorization frames, verb knowledge bases and rule templates for establishing semantic relations and ***** speech act ***** like attributes. | ||
| L10-1546 Over the years Mixer collections have grown to include socio-linguistic interviews, a wide variety of telephone conditions and multiple languages, recording conditions, channels and ***** speech act *****s. Mixer 6 was the most recent collection. | ||
| W17-4601 Silence is sized and placed within the conversation flow and it is coordinated by the speakers along with the other ***** speech act *****s. | ||
| attention weights | 10 | |
| 2021.acl-long.94 Current empirical studies provide shreds of evidence that ***** attention weights ***** are not explanations by proving that they are not unique. | ||
| 2021.cmcl-1.26 We fine-tune BERT for relation extraction with auxiliary attention supervision in which BERT's ***** attention weights ***** are supervised by cognitive data. | ||
| 2021.acl-demo.16 Dodrio tightly integrates an overview that summarizes the roles of different attention heads, and detailed views that help users compare ***** attention weights ***** with the syntactic structure and semantic information in the input text. | ||
| N19-1148 Further, the ***** attention weights ***** in the learned model confirm that the model finds expected linguistic patterns for each category. | ||
| 2020.acl-main.385 We propose two methods for approximating the attention to input tokens given ***** attention weights *****, attention rollout and attention flow, as post hoc methods when we use ***** attention weights ***** as the relative relevance of the input tokens. | ||
| focus | 10 | |
| L14-1708 Workflow languages ***** focus ***** on expressive power of the languages to describe variety of workflow patterns to meet users' needs. | ||
| D19-5809 To properly generate a question coherent to the grounding text and the current conversation history, the proposed framework first locates the ***** focus ***** of a question in the text passage, and then identifies the question pattern that leads the sequential generation of the words in a question. | ||
| 2021.mmsr-1.5 However, a lot of work mainly ***** focus *****ed on models trained for uni-modal tasks, e.g. | ||
| L12-1611 We ***** focus ***** specifically on speech and gesture interaction which can enhance the quality of lifestyle of people living in assistive environments, be they seniors or people with physical or cognitive disabilities. | ||
| 2021.alta-1.19 In developing systems to identify *****focus***** entities in scientific literature, we face the problem of discriminating key entities of interest from other potentially relevant entities of the same type mentioned in the articles. | ||
| extracting semantic | 10 | |
| 2021.blackboxnlp-1.19 We achieve this by ***** extracting semantic *****ally related words from pre-trained word representations as input features to the TM. | ||
| 2021.naacl-main.37 To this end, we propose a procedure of ***** extracting semantic ***** triples from tables that encodes their structures by exploiting the semantic dependencies among table headers and the table title. | ||
| D19-1486 However, it has two unaddressed limitations: (1) it cannot deal with polysemy when ***** extracting semantic ***** capsules; (2) it hardly recognizes the utterances of unseen intents in the generalized zero-shot intent classification setting. | ||
| S18-1130 Additionally, we utilized the model trained with TEES for ***** extracting semantic ***** relations from biomedical abstracts, for which we present a preliminary evaluation. | ||
| L14-1690 These assumptions motivated the work depicted in this paper, aiming at the establishment and use of lexical-syntactic patterns for ***** extracting semantic ***** relations for Portuguese from corpora, part of a larger ongoing project for the semi-automatic extension of WordNet.PT. | ||
| text quality | 10 | |
| L08-1392 We discuss the problems encountered in the implementation of each approach in the context of the literature, and propose that a test based on the Turing test for machine intelligence offers a way forward in the evaluation of the subjective notion of ***** text quality *****. | ||
| W17-5031 We present a very simple model for ***** text quality ***** assessment based on a deep convolutional neural network, where the only supervision required is one corpus of user-generated text of varying quality, and one contrasting text corpus of consistently high quality. | ||
| P18-1219 To do this, we first we model ***** text quality ***** as a function of three properties - organization, coherence and cohesion. | ||
| D19-1053 In this paper we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of ***** text quality *****. | ||
| D17-1227 In neural text generation such as neural machine translation, summarization, and image captioning, beam search is widely used to improve the output ***** text quality *****. | ||
| text spans | 10 | |
| C18-1048 Implicit discourse relation recognition is a challenging task as the relation prediction without explicit connectives in discourse parsing needs understanding of ***** text spans ***** and cannot be easily derived from surface features from the input sentence pairs. | ||
| D19-1243 In stark contrast to most existing reading comprehension datasets where the questions focus on factual and literal understanding of the context paragraph, our dataset focuses on reading between the lines over a diverse collection of people's everyday narratives, asking such questions as “what might be the possible reason of ...?”, or “what would have happened if ...” that require reasoning beyond the exact ***** text spans ***** in the context. | ||
| W17-0807 Intrinsic evaluation of the created scheme confirmed its potential contribution to the consistent classification of identified erroneous ***** text spans *****, achieving visibly higher Cohen's kappa values, up to 0.831, than previous work. | ||
| 2020.wnut-1.77 Our system is trained on the COVID-19 Twitter Event Corpus and is able to identify relevant ***** text spans ***** that answer pre-defined questions (i.e., slot types) for five COVID-19 related events (i.e., TESTED POSITIVE, TESTED NEGATIVE, CAN-NOT-TEST, DEATH and CURE & PREVENTION). | ||
| L14-1145 We present in detail the process of text selection, annotation process and the contents of the corpus, which includes both abstract free-word summaries, as well as extraction-based summaries created by selecting ***** text spans ***** from the original document. | ||
| spoken language processing | 10 | |
| N18-5020 The system architecture consists of several components including ***** spoken language processing *****, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. | ||
| W89-0224 Application of this type of connectionist model to the area of ***** spoken language processing ***** is discussed. | ||
| L12-1589 Large-scale spontaneous speech corpora are crucial resource for various domains of ***** spoken language processing *****. | ||
| L06-1340 In addition to the word spoken, the prosodic content of the speech has been proved quite valuable in a variety of ***** spoken language processing ***** tasks such as sentence segmentation and tagging, disfluency detection, dialog act segmentation and tagging, and speaker recognition. | ||
| L16-1297 Automatic Speech recognition (ASR) is one of the most widely used components in ***** spoken language processing ***** applications. | ||
| text analytics | 10 | |
| 2020.insights-1.14 Non-negative Matrix Factorization (NMF) has been used for ***** text analytics ***** with promising results. | ||
| L14-1126 assisted by machine translation and ***** text analytics ***** services, to explain how linked data can support such active curation. | ||
| 2020.lrec-1.202 Sentiment classification is an important aspect of general ***** text analytics *****. | ||
| 2021.naacl-srw.3 Despite many advances in ***** text analytics *****, negation resolution remains an acute and continuously researched question in Natural Language Processing. | ||
| 2020.acl-main.72 We propose a methodology to construct a term dictionary for ***** text analytics ***** through an interactive process between a human and a machine, which helps the creation of flexible dictionaries with precise granularity required in typical text analysis. | ||
| joint inference | 10 | |
| D19-3044 The ***** joint inference ***** over morphology and syntax substantially limits error propagation, and leads to high accuracy. | ||
| C16-1308 However, the rich features that are important for this task are typically very hard to explicitly encode as MLN formulas since they significantly increase the size of the MLN, thereby making ***** joint inference ***** and learning infeasible. | ||
| W17-5107 We design a ***** joint inference ***** method for the task by modeling argument relation classification and stance classification jointly. | ||
| C16-1285 We present a novel way for designing complex ***** joint inference ***** and learning models using Saul (Kordjamshidi et al., 2015), a recently-introduced declarative learning-based programming language (DeLBP). | ||
| Q18-1004 The interpretation of spatial references is highly contextual, requiring ***** joint inference ***** over both language and the environment. | ||
| black box | 10 | |
| 2020.emnlp-main.255 To demystify the “***** black box *****” property of deep neural networks for natural language processing (NLP), several methods have been proposed to interpret their predictions by measuring the change in prediction probability after erasing each token of an input. | ||
| 2020.emnlp-main.548 Existing works ignore the complicated reasoning process and solve it with a one-step “***** black box *****” model. | ||
| W19-5938 Despite recent attempts in the field of explainable AI to go beyond ***** black box ***** prediction models, typically already the training data for supervised machine learning is collected in a manner that treats the annotator as a “***** black box *****”, the internal workings of which remains unobserved. | ||
| 2020.emnlp-main.498 We present BAE, a ***** black box ***** attack for generating adversarial examples using contextual perturbations from a BERT masked language model. | ||
| 2021.eacl-main.320 The proposed approach and paradigm are evaluated on the Librispeech dataset and a commercial (***** black box *****) ASR system, Google Cloud's Speech-to-Text API. | ||
| topic modelling | 10 | |
| W18-5902 The occurrence of stance-taking towards vaccination was measured in documents extracted by ***** topic modelling ***** from two different corpora, one discussion forum corpus and one tweet corpus. | ||
| 2021.adaptnlp-1.5 Supervised classification is limited by unchangeable class labels that may not be relevant to new events, and unsupervised ***** topic modelling ***** by insufficient prior knowledge. | ||
| L16-1042 In this paper we evaluate a number of measures of corpus similarity, including a method based on ***** topic modelling ***** which has not been previously evaluated for this task. | ||
| D19-6611 The proposed models combine ***** topic modelling ***** and dictionary-based approach. | ||
| 2020.latechclfl-1.11 But with that data, ***** topic modelling ***** and sentiment analysis can then be applied to show trends, for instance, that despite the horrors of war, Australians in WW1 primarily wrote about their everyday routines and experiences. | ||
| bases | 10 | |
| C16-1258 When processing arguments in online user interactive discourse, it is often necessary to determine their ***** bases ***** of support. | ||
| 2021.dash-1.1 Domain-specific conceptual ***** bases ***** use key concepts to capture domain scope and relevant information. | ||
| 2021.emnlp-main.292 We avoid crucial assumptions of previous work that do not transfer well to real-world settings, including exploiting knowledge of the fixed number of retrieval steps required to answer each question or using structured metadata like knowledge ***** bases ***** or web links that have limited availability. | ||
| W18-4904 We investigate whether different MWEs have distinct neural ***** bases *****, e.g. | ||
| D18-1359 Existing approaches, however, primarily focus on simple link structure between a finite set of entities, ignoring the variety of data types that are often used in knowledge *****bases*****, such as text, images, and numerical values. | ||
| text matching | 10 | |
| 2021.emnlp-main.312 Impressive milestones have been achieved in ***** text matching ***** by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentence representations. | ||
| 2020.coling-main.568 Recently, pre-trained language models such as BERT have shown state-of-the-art accuracies in ***** text matching *****. | ||
| 2020.findings-emnlp.191 Using MedICaT, we introduce the task of subfigure to subcaption alignment in compound figures and demonstrate the utility of inline references in image-***** text matching *****. | ||
| C16-1222 Term co-occurrence in a sentence or paragraph is a powerful and often overlooked feature for ***** text matching ***** in document retrieval. | ||
| 2020.emnlp-main.194 We study the zero-shot transfer capabilities of *****text matching***** models on a massive scale, by self-supervised training on 140 source domains from community question answering forums in English. | ||
| influence | 10 | |
| W19-4447 In view of the ***** influence ***** of the first language on learners, we further propose an effective approach to improve the quality of the suggested sentences. | ||
| R19-1064 In the second place we investigate if language ***** influence *****s the similarity computation. | ||
| 2020.emnlp-main.396 Despite being a staple of NLP, and sharing a common structure, there is little insight on how these tasks' properties ***** influence ***** their difficulty, and thus little guidance on what model families work well on span ID tasks, and why. | ||
| 2020.ecomnlp-1.5 Managerial responses to such reviews provide businesses with the opportunity to ***** influence ***** the public discourse and to attain improved ratings over time. | ||
| N19-1263 It draws inspiration from traditional template-based text generation techniques, where the source provides the content (i.e., what to say), and the template ***** influence *****s how to say it. | ||
| expressive speech | 10 | |
| L12-1513 A further aim of this study is to investigate the usability of audiobooks as a language resource for ***** expressive speech ***** synthesis of utterances of conversational speech. | ||
| L06-1357 The database has been designed for speech synthesis, speech conversion and ***** expressive speech *****. | ||
| L08-1084 A Hungarian multimodal spontaneous ***** expressive speech ***** corpus was recorded following the methodology of a similar French corpus. | ||
| L14-1566 A three-step method was used for recording both - the high-activation and low-activation ***** expressive speech ***** databases. | ||
| 2020.coling-main.440 As smart speakers and conversational robots become ubiquitous, the demand for *****expressive speech***** synthesis has increased. | ||
| graph parsing | 10 | |
| 2020.acl-main.377 A key problem in processing graph-based meaning representations is ***** graph parsing *****, i.e. | ||
| D17-1003 Experiments demonstrate that second-order features are helpful for Maximum Sub***** graph parsing *****. | ||
| K19-2004 Our system uses a graph-based approach to model a variety of semantic ***** graph parsing ***** tasks. | ||
| W17-6317 We evaluate our algorithm on PCFG, PTAG, and ***** graph parsing *****. | ||
| 2020.iwpt-1.3 We present a neural end-to-end architecture for negation resolution based on a formulation of the task as a *****graph parsing***** problem. | ||
| spaces | 10 | |
| 2020.semeval-1.30 It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later ***** spaces *****, using Canonical Correlation Analysis and orthogonal transformation; and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus. | ||
| 2020.vardial-1.6 However, these approaches require cross-lingual information such as seed dictionaries to train the model and find a linear transformation between the word embedding ***** spaces *****. | ||
| 2021.naacl-main.42 In this paper, we propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from auxiliary languages to a target one and brings their representation ***** spaces ***** closer for effective transfer. | ||
| 2021.acl-long.139 Separately embedding the individual knowledge sources into vector ***** spaces ***** has demonstrated tremendous successes in encoding the respective knowledge, but how to jointly embed and reason with both knowledge sources to fully leverage the complementary information is still largely an open problem. | ||
| P19-1312 Despite its remarkable results, unsupervised mapping is also well-known to be limited by the original dissimilarity between the word embedding ***** spaces ***** to be mapped. | ||
| human annotation | 10 | |
| 2020.wnut-1.15 We evaluate the approach by comparing the results to TF-IDF using the discounted cumulative gain metric with ***** human annotation *****s, finding our method outperforms TF-IDF on information retrieval. | ||
| 2021.emnlp-main.216 Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled corpora when ***** human annotation ***** is scarce. | ||
| 2020.acl-main.423 Accordingly, we attain a lexical-semantic level language model, without the use of ***** human annotation *****. | ||
| N19-1028 Finally, ***** human annotation ***** is used to remove any false positive in these matched triples. | ||
| 2020.acl-main.124 We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor ***** human annotation *****s (e.g. | ||
| notion | 10 | |
| 1997.iwpt-1.7 This paper introduces a novel approach to parsing TAG, namely one that explores how D-theoretic ***** notion *****s may be applied to TAG parsing. | ||
| W18-3028 We show that the ***** notion *****s of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when predicting across languages. | ||
| 2021.eval4nlp-1.1 To address this shortcoming, we introduce the ***** notion ***** of differential evaluation which effectively defines a pragmatic partition of instances into gradually more difficult bins by leveraging the predictions made by a set of systems. | ||
| L10-1088 Following earlier work of (Hobbs, 1998) and (Davidson, 1967) ***** notion ***** of reification, we extend the logical account of Concession originally proposed in (Robaldo et al., 2008) to provide refined formal descriptions for the first three mentioned sources of expectations in Concessive relations. | ||
| L12-1247 We introduce the ***** notion ***** of semantic neighborhoods, which are exploited for the computation of semantic similarity. | ||
| word representation learning | 10 | |
| K19-1021 Recent work has validated the importance of subword information for ***** word representation learning *****. | ||
| N19-1097 The use of subword-level information (e.g., characters, character n-grams, morphemes) has become ubiquitous in modern ***** word representation learning *****. | ||
| P17-1187 In this paper, we present that, word sememe information can improve ***** word representation learning ***** (WRL), which maps words into a low-dimensional semantic space and serves as a fundamental step for many NLP tasks. | ||
| P18-2090 Negative sampling is an important component in word2vec for distributed ***** word representation learning *****. | ||
| I17-1024 To enhance the expression ability of distributional ***** word representation learning ***** model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. | ||
| annotation process | 10 | |
| R17-1022 Our objective in this paper is to show the pre-***** annotation process *****, as well as to evaluate the usability of subjective and polarity information in this process. | ||
| L14-1596 This article presents the methods, results, and precision of the syntactic ***** annotation process ***** of the Rhapsodie Treebank of spoken French. | ||
| L10-1476 After a short introduction and a description of related work, we illustrate the ***** annotation process *****, including a description of the annotation methodology and the developed tool for the ***** annotation process *****. | ||
| 2020.eval4nlp-1.8 We describe the ***** annotation process ***** in detail and compare it with other similar evaluation systems. | ||
| L14-1273 We then present the selection of texts and distribution between genres, as well as the ***** annotation process ***** and an evaluation of the inter-annotator agreement. | ||
| knowledge transfer | 10 | |
| 2021.blackboxnlp-1.10 We find that our method can successfully be used to measure visual ***** knowledge transfer ***** capabilities in models and that our novel model architecture shows promising results for leveraging multimodal knowledge in a unimodal setting. | ||
| 2020.acl-main.291 In this paper, we proposed a Semantic-Emotion Knowledge Transferring (SEKT) model for cross-target stance detection, which uses the external knowledge (semantic and emotion lexicons) as a bridge to enable ***** knowledge transfer ***** across different targets. | ||
| P19-1236 To address this issue, we consider using cross-domain LM as a bridge cross-domains for NER domain adaptation, performing cross-domain and cross-task ***** knowledge transfer ***** by designing a novel parameter generation network. | ||
| D19-6122 As shown in previous work, critical to this distillation procedure is the construction of an unlabeled transfer dataset, which enables effective ***** knowledge transfer *****. | ||
| D19-1153 In addition, characteristic differences between the source and target languages raise a natural question of whether source data selection can improve the ***** knowledge transfer *****. | ||
| corpus of contemporary | 10 | |
| 2020.latechclfl-1.7 In this paper we describe an approach for the computer-aided identification of Shakespearean intertextuality in a ***** corpus of contemporary ***** fiction. | ||
| L04-1264 The CGN corpus (Corpus Gesproken Nederlands/Corpus Spoken Dutch) is a large speech ***** corpus of contemporary ***** Dutch as spoken in Belgium (3.3 million words) and in the Netherlands (5.6 million words). | ||
| L10-1104 This annotation effort is carried out in the framework of a larger project which aims at the collection of a 500-million word ***** corpus of contemporary ***** Dutch, covering the variants used in the Netherlands and Flanders, the Dutch speaking part of Belgium. | ||
| 2020.iwltp-1.13 It also incorporates the search engines for the large COROLA reference ***** corpus of contemporary ***** Romanian and the Romanian wordnet. | ||
| L14-1311 We present the project of creating CoRoLa, a reference ***** corpus of contemporary ***** Romanian (from 1945 onwards). | ||
| build | 10 | |
| L12-1114 The corpus, available in web, is already being used to ***** build ***** a semantic tagger for Portuguese language. | ||
| 2008.amta-papers.19 We also ***** build ***** a cascaded translation model that dynamically shifts translation units from phrase level to word and morpheme phrase levels. | ||
| W19-5004 In this work, we ***** build ***** a unifying framework for RE, applying this on three highly used datasets (from the general, biomedical and clinical domains) with the ability to be extendable to new datasets. | ||
| 2021.blackboxnlp-1.43 Rather than ***** build ***** a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ranking performance for words and senses in different frequency bands. | ||
| U19-1001 Our future work will ***** build ***** off the baseline and challenges presented here. | ||
| political science | 10 | |
| P19-4004 In other words, these two communities have been largely agnostic of one another, with NLP researchers mostly unaware of interesting applications in ***** political science ***** and political scientists not applying cutting-edge NLP methodology to their problems. | ||
| W18-6221 In this paper, we investigate another venue for social media monitoring, namely issue ownership and agenda setting, which are concepts from ***** political science ***** that have been used to explain voter choice and electoral outcomes. | ||
| P19-3018 This paper describes the MARDY corpus annotation environment developed for a collaboration between ***** political science ***** and computational linguistics. | ||
| D18-1393 Here, we draw on two concepts from ***** political science ***** literature to explore subtler strategies for government media manipulation: agenda-setting (selecting what topics to cover) and framing (deciding how topics are covered). | ||
| 2020.emnlp-main.109 Attribution of natural disasters / collective misfortune is a widely-studied *****political science***** problem. | ||
| machine translated | 10 | |
| 2020.coling-main.524 In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the ***** machine translated ***** text (mt) as input. | ||
| 2004.amta-papers.27 This paper describes our experience in deploying this system and the (positive) customer response to the availability of ***** machine translated ***** articles, as well as other uses of MSR-MT either planned or underway at Microsoft. | ||
| L14-1727 Finally, we show that the performance of the sentiment classifiers built on ***** machine translated ***** data can be improved using original data from the target language and that even a small amount of such texts can lead to significant growth in the classification performance. | ||
| D19-6501 The analysis shows stronger potential translationese effects in ***** machine translated ***** outputs than in human translations. | ||
| D19-1676 We investigate the potential for this transfer in an applied industrial setting and compare to multilingual classification using ***** machine translated ***** text. | ||
| accurate | 10 | |
| K19-1001 In particular, this setup enables us to ***** accurate *****ly distil which part of a prediction stems from semantic heuristics, which part truly emanates from syntactic cues and which part arises from the model biases themselves instead. | ||
| N19-2018 A capable, automatic Question Answering (QA) system can provide more complete and ***** accurate ***** answers using a comprehensive knowledge base (KB). | ||
| 2020.emnlp-main.542 Predicting the proper emojis associated with text provides a way to summarize the text ***** accurate *****ly, and it has been proven to be a good auxiliary task to many Natural Language Understanding (NLU) tasks. | ||
| C16-1172 While most sentences are more ***** accurate ***** and fluent than translations by statistical machine translation (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. | ||
| W18-4501 We show that a model which treats the concept terms as analogous and learns weights to compensate for diachronic changes (weighted linear combination) is able to more ***** accurate *****ly predict the missing term than a learned transformation and two baselines for most of the evaluated concepts. | ||
| subject | 10 | |
| L14-1009 Our formalization is based on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for ***** subject *****ive information extraction. | ||
| R17-1022 Our objective in this paper is to show the pre-annotation process, as well as to evaluate the usability of ***** subject *****ive and polarity information in this process. | ||
| 2020.lrec-1.61 We assessed the users' ***** subject *****ive cognitive load and their satisfaction in different questionnaires during the interaction with both PA variants. | ||
| P19-2010 The purpose of this research proposal is to build a model that can discriminate PD and HC ***** subject *****s even when the language used for train and test is different. | ||
| W17-4911 The results are encouraging in that human ***** subject *****s tend to perceive the generated utterances as being more similar to the character they are modeled on, than to another random character. | ||
| multilingual semantic | 10 | |
| S17-2033 In this paper, we introduce an approach to combining word embeddings and machine translation for ***** multilingual semantic ***** word similarity, the task2 of SemEval-2017. | ||
| 2021.starsem-1.17 To evaluate our multilingual models on human-written sentences as opposed to machine translated ones, we introduce a new ***** multilingual semantic ***** parsing dataset in English, Italian and Japanese based on the Facebook Task Oriented Parsing (TOP) dataset. | ||
| L16-1416 In this paper, we report on the construction of large-scale ***** multilingual semantic ***** lexicons for twelve languages, which employ the unified Lancaster semantic taxonomy and provide a multilingual lexical knowledge base for the automatic UCREL semantic annotation system (USAS). | ||
| P18-2106 Previous approaches to ***** multilingual semantic ***** dependency parsing treat languages independently, without exploiting the similarities between semantic structures across languages. | ||
| D19-1278 We test our model on the Parallel Meaning Bank—a ***** multilingual semantic ***** graphbank. | ||
| aggression | 10 | |
| N19-1144 However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbulling, or cyber-***** aggression *****. | ||
| 2020.alw-1.10 We also develop computational models to incorporate emotions into textual cues to improve ***** aggression ***** identification. | ||
| W18-4421 Results confirmed the difficulty of the task (particularly for detecting covert ***** aggression *****s), showing the limitations of traditionally used features. | ||
| 2020.trac-1.5 One of the aspects of social media is the ability for their information producers to hide, fully or partially, their identity during a discussion; leading to cyber-***** aggression ***** and interpersonal ***** aggression *****. | ||
| 2020.trac-1.10 This paper presents a system developed during our participation (team name: scmhl5) in the TRAC-2 Shared Task on *****aggression***** identification. | ||
| dialogue modeling | 10 | |
| 2021.hcinlp-1.11 We discuss the respective methods, show interesting gaps, and conclude by suggesting neural, visually grounded ***** dialogue modeling ***** as a promising potential for NLIs for visual models. | ||
| 2021.sigdial-1.18 Dialogue topic segmentation is critical in several ***** dialogue modeling ***** problems. | ||
| 2021.acl-long.342 To this end, we exploit Abstract Meaning Representation (AMR) to help ***** dialogue modeling *****. | ||
| 2020.nlp4convai-1.7 In this paper, we present DLGNet, a transformer-based model for ***** dialogue modeling *****. | ||
| 2020.emnlp-main.652 For evaluating the versatility of the dataset, we introduce multiple ***** dialogue modeling ***** tasks and present baseline approaches. | ||
| predicate argument | 10 | |
| N18-2065 Here, it is important that the parser processes the sentences consistently; failing to recognize the similar syntactic structure results in inconsistent ***** predicate argument ***** structures among them, in which case the succeeding theorem proving is doomed to failure. | ||
| W19-3309 Meta-semantic representation consists of three parts, entities, ***** predicate argument ***** structures, and discourse attributes, that derive rich knowledge graphs. | ||
| P18-1054 Our experimental results demonstrate the proposed method can improve the performance of the inter-sentential zero anaphora resolution drastically, which is a notoriously difficult task in ***** predicate argument ***** structure analysis. | ||
| C16-1269 In this paper, we propose utilising eye gaze information for estimating parameters of a Japanese ***** predicate argument ***** structure (PAS) analysis model. | ||
| 2020.lrec-1.11 In languages like Arabic, Chinese, Italian, Japanese, Korean, Portuguese, Spanish, and many others, ***** predicate argument *****s in certain syntactic positions are not realized instead of being realized as overt pronouns, and are thus called zero- or null-pronouns. | ||
| distributed word | 10 | |
| N18-1018 Our proposed model generates the words by querying ***** distributed word ***** representations (i.e. | ||
| R17-2004 The architecture employs recurrent neural layers and more specifically LSTM cells, in order to capture information about word order and to easily incorporate ***** distributed word ***** representations (embeddings) as features, without having to use a fixed window of text. | ||
| W19-6107 We present an evaluation of Czech low-dimensional ***** distributed word ***** representations, also known as word embeddings. | ||
| Q17-1018 In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense ***** distributed word ***** representations. | ||
| S18-1193 Our approach is to build ***** distributed word ***** embedding of reason, warrant and claim respectively, meanwhile, we use a series of frameworks such as CNN model, LSTM model, GRU with attention model and biLSTM with attention model for processing word vector. | ||
| probability distribution | 10 | |
| W18-1601 For detection of stylistic variation, we use relative entropy, measuring the difference between ***** probability distribution *****s at different linguistic levels (here: lexis and grammar). | ||
| 2020.repl4nlp-1.9 Skip-Gram is a simple, but effective, model to learn a word embedding mapping by estimating a conditional ***** probability distribution ***** for each word of the dictionary. | ||
| I17-1007 In this paper, we propose a probabilistic parsing model that defines a proper conditional ***** probability distribution ***** over non-projective dependency trees for a given sentence, using neural representations as inputs. | ||
| D17-1229 Instead of greedily choosing a label at each time step, and using it for the next prediction, we retain the ***** probability distribution ***** over the current label, and pass this distribution to the next prediction. | ||
| D19-1421 The corresponding objective function for MLE is derived from the Kullback-Leibler (KL) divergence between the empirical ***** probability distribution ***** representing the data and the parametric ***** probability distribution ***** output by the model. | ||
| vision | 10 | |
| D18-1270 This enables our approach to: (a) augment the limited super***** vision ***** in the target language with additional super***** vision ***** from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. | ||
| Q14-1014 We introduce a method for automatically segmenting a corpus into chunks such that many uncertain labels are grouped into the same chunk, while human super***** vision ***** can be omitted altogether for other segments. | ||
| W18-6312 We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the pro***** vision ***** of inter-sentential context. | ||
| 2021.emnlp-main.326 Multimodal abstractive summarization (MAS) models that summarize videos (*****vision***** modality) and their corresponding transcripts (text modality) are able to extract the essential information from massive multimodal data on the Internet. | ||
| W18-5434 PatternAttribution is a recent method, introduced in the *****vision***** domain, that explains classifications of deep neural networks. | ||
| program | 10 | |
| P18-2004 We report results for the two features; black-box and glass-box using unseen 24 Arabic broadcast ***** program *****s. | ||
| 2020.coling-main.418 We then introduce a total optimization method using integer linear ***** program *****ming to prevent span overlapping and obtain non-monotonic alignments. | ||
| L10-1071 Working within the EU funded COMPANIONS ***** program *****, we investigate the use of appropriateness as a measure of conversation quality, the hypothesis being that good companions need to be good conversational partners. | ||
| 2012.amta-government.13 The RevP ***** program ***** saves time by removing the need for post-editing of Chinese names, and improves consistency in the translation of these names. | ||
| C16-2013 However, the usage of established NLP frameworks is often hampered for several reasons: in most cases, they require basic to sophisticated ***** program *****ming skills, interfere with interoperability due to using non-standard I/O-formats and often lack tools for visualizing computational results. | ||
| comparison | 10 | |
| 2020.lrec-1.855 The corpus database is distributed to permit fast indexing, and provides a simple web front-end with corpus linguistics methods for sub-corpus ***** comparison ***** and retrieval. | ||
| 2020.emnlp-main.748 We hope that these architectures and experiments may serve as strong points of ***** comparison ***** for future work. | ||
| P19-3028 consists in projecting them in two-dimensional planes without any interpretable semantics associated to the axes of the projection, which makes detailed analyses and ***** comparison ***** among multiple sets of embeddings challenging. | ||
| 2020.findings-emnlp.100 By collecting comparative adjectives from existing dictionaries and utilizing a semantic framework to catch comparative quantifiers, the semantics of clues concerning ***** comparison ***** structures are better understood, ensuring conversion to correct logic representation. | ||
| L10-1434 Our second interest lies in the actual ***** comparison ***** of the models: How does a very simple distributional model compare to much more complex approaches, and which representation of selectional preferences is more appropriate, using (i) second-order properties, (ii) an implicit generalisation of nouns (by clusters), or (iii) an explicit generalisation of nouns by WordNet classes within clusters? | ||
| extractive text | 10 | |
| 2020.lrec-1.31 For the first time, and as a fast, scalable, and cost-effective alternative, we propose micro-task crowdsourcing to evaluate both the intrinsic and extrinsic quality of query-based ***** extractive text ***** summaries. | ||
| 2020.ngt-1.9 Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating ***** extractive text *****. | ||
| 2021.emnlp-main.11 Based on Multi-GCN, we propose a Multiplex Graph Summarization (Multi-GraS) model for ***** extractive text ***** summarization. | ||
| D19-1300 In this work, we re-examine the problem of ***** extractive text ***** summarization for long documents. | ||
| W19-1906 We report on the use of this pipeline in a disease-specific ***** extractive text ***** summarization task on clinical notes, focusing primarily on progress notes by physicians and nurse practitioners. | ||
| deep semantic | 10 | |
| L12-1299 What would be a good method to provide a large collection of semantically annotated texts with formal, ***** deep semantic *****s rather than shallow? | ||
| C18-1237 Capturing redundancy is challenging as it may involve investigating at a ***** deep semantic ***** level. | ||
| 2014.lilt-9.3 This is probably because it requires a combination of robust, ***** deep semantic ***** analysis and logical inference—and why develop something with this complexity if you perhaps can get away with something simpler? | ||
| 2020.emnlp-main.511 In particular, we develop a new Relational Pointer Decoder (referred as RPD) by incorporating the relative ordering information into the pointer network with a Deep Relational Module (referred as DRM), which utilizes BERT to exploit the ***** deep semantic ***** connection and relative ordering between sentences. This enables us to strengthen both local and global dependencies among sentences. | ||
| P17-1054 We name it as deep keyphrase generation since it attempts to capture the ***** deep semantic ***** meaning of the content with a deep learning method. | ||
| bayesian optimization | 10 | |
| D17-1038 Inspired by work on curriculum learning, we propose to learn data selection measures using *****Bayesian Optimization***** and evaluate them across models, domains and tasks. | ||
| 2021.eacl-demos.31 In this paper, we present OCTIS, a framework for training, analyzing, and comparing Topic Models, whose optimal hyper-parameters are estimated using a *****Bayesian Optimization***** approach. | ||
| 2021.ranlp-1.157 In this paper, we present an empirical analysis and comparison of Neural Topic Models by finding the optimal hyperparameters of each model for four different performance measures adopting a single-objective *****Bayesian optimization*****. | ||
| 2021.emnlp-main.711 We regard a combination of various operations as an augmentation policy and utilize an efficient *****Bayesian Optimization***** algorithm to automatically search for the best policy, which substantially improves the generalization capability of models. | ||
| N19-1355 To address these issues, we present AutoSeM, a two-stage MTL pipeline, where the first stage automatically selects the most useful auxiliary tasks via a Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage learns the training mixing ratio of these selected auxiliary tasks via a Gaussian Process based *****Bayesian optimization***** framework. | ||
| aspect-level sentiment | 10 | |
| N18-1051 Target-dependent classification tasks, such as *****aspect-level sentiment***** analysis, perform fine-grained classifications towards specific targets. | ||
| P19-2035 Attention-based deep learning systems have been demonstrated to be the state-of-the-art approach for *****aspect-level sentiment***** analysis; however, end-to-end deep neural networks lack flexibility, as one cannot easily adjust the network to fix an obvious problem, especially when more training data is not available: e.g. | ||
| 2020.coling-main.83 We release large-scale datasets of users' comments in two languages, English and Korean, for *****aspect-level sentiment***** analysis in automotive domain. | ||
| 2020.acl-main.340 Aspect-based sentiment analysis (ABSA) involves three subtasks, i.e., aspect term extraction, opinion term extraction, and *****aspect-level sentiment***** classification. | ||
| P18-2092 Attention-based long short-term memory (LSTM) networks have proven to be useful in *****aspect-level sentiment***** classification. | ||
| detection of | 10 | |
| 2020.wnut-1.68 In this system paper, we present a transformer-based approach to the *****detection of***** informativeness in English tweets on the topic of the current COVID-19 pandemic. | ||
| 2021.alta-1.6 The *****detection of***** hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance. | ||
| 2020.findings-emnlp.93 Accurate *****detection of***** emotions in user-generated text was shown to have several applications for e-commerce, public well-being, and disaster management. | ||
| R19-1103 The *****detection of***** quotations (i.e., reported speech, thought, and writing) has established itself as an NLP analysis task. | ||
| 2020.figlang-1.36 We present an ensemble approach for the *****detection of***** sarcasm in Reddit and Twitter responses in the context of The Second Workshop on Figurative Language Processing held in conjunction with ACL 2020. | ||
| Chinese word | 10 | |
| N19-1278 We investigate subword information for *****Chinese word***** segmentation, by integrating subword embeddings trained using byte-pair encoding into a Lattice LSTM (LaLSTM) network over a character sequence. | ||
| 2021.emnlp-demo.6 We introduce N-LTP, an open-source neural language technology platform supporting six fundamental Chinese NLP tasks: lexical analysis (*****Chinese word***** segmentation, part-of-speech tagging, and named entity recognition), syntactic parsing (dependency parsing), and semantic parsing (semantic dependency parsing and semantic role labeling). | ||
| D17-1025 In this paper, we propose new methods to learn *****Chinese word***** representations. | ||
| 2020.socialnlp-1.7 *****Chinese word***** segmentation is necessary to provide word-level information for Chinese named entity recognition (NER) systems. | ||
| D18-1529 A wide variety of neural-network architectures have been proposed for the task of *****Chinese word***** segmentation. | ||
| Hyperpartisan News | 10 | |
| S19-2157 In the effort to tackle the challenge of *****Hyperpartisan News***** Detection, i.e., the task of deciding whether a news article is biased towards one party, faction, cause, or person, we experimented with two systems: i) a standard supervised learning approach using superficial text and bag-of-words features from the article title and body, and ii) a deep learning system comprising a four-layer convolutional neural network and max-pooling layers after the embedding layer, feeding the consolidated features to a bi-directional recurrent neural network. | ||
| S19-2176 We describe the system submitted by the Jack Ryder team to SemEval-2019 Task 4 on *****Hyperpartisan News***** Detection. | ||
| S19-2178 This paper describes the approach of team Kit Kittredge to SemEval-2019 Task 4: *****Hyperpartisan News***** Detection. | ||
| S19-2170 This paper describes our system for detecting hyperpartisan news articles, which was submitted for the shared task in SemEval 2019 on *****Hyperpartisan News***** Detection. | ||
| S19-2188 We present our deep learning models submitted to the SemEval-2019 Task 4 competition focused at *****Hyperpartisan News***** Detection. | ||
| minority | 10 | |
| L08-1581 Producing machine translation (MT) for the many *****minority***** languages in the world is a serious challenge. | ||
| 2020.wac-1.4 Web corpora creation for *****minority***** languages that do not have their own top-level Internet domain is no trivial matter. | ||
| 2020.sltu-1.8 Occitan is a *****minority***** language spoken in Southern France, some Alpine Valleys of Italy, and the Val d'Aran in Spain, which only very recently started developing language and speech technologies. | ||
| L14-1122 This paper describes the efforts for the construction of Language Resources and NLP tools for Mirandese, a *****minority***** language spoken in North-eastern Portugal, now available on a community-led portal, Casa de la Lhngua. | ||
| P19-1163 We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against *****minority***** populations. | ||
| identification of | 10 | |
| 2020.lrec-1.787 Dialect IDentification (DID) is a challenging task, and it becomes more complicated when it is about the *****identification of***** dialects that belong to the same country. | ||
| P17-1081 Multimodal sentiment analysis is a developing area of research, which involves the *****identification of***** sentiments in videos. | ||
| 2020.lrec-1.2 Anaphora resolution (coreference) systems designed for the CONLL 2012 dataset typically cannot handle key aspects of the full anaphora resolution task such as the *****identification of***** singletons and of certain types of non-referring expressions (e.g., expletives), as these aspects are not annotated in that corpus. | ||
| S18-1106 This paper describes the participation of the #NonDicevoSulSerio team at SemEval2018-Task3, which focused on Irony Detection in English Tweets and was articulated in two tasks addressing the *****identification of***** irony at different levels of granularity. | ||
| L06-1208 One of the main challenges in biomedical text mining is the *****identification of***** terminology, which is a key factor for accessing and integrating the information stored in literature. | ||
| Statistical Machine Translation (SMT) | 10 | |
| 2006.amta-papers.22 *****Statistical Machine Translation (SMT)***** accuracy degrades when there is only a limited amount of training, or when the training is not from the same domain or genre of text as the target application. | ||
| P19-1019 While machine translation has traditionally relied on large amounts of parallel corpora, a recent research line has managed to train both Neural Machine Translation (NMT) and *****Statistical Machine Translation (SMT)***** systems using monolingual corpora only. | ||
| W17-2511 A *****Statistical Machine Translation (SMT)***** system is always trained using large parallel corpus to produce effective translation. | ||
| L16-1466 We construct a case-based English-to-Chinese semantic constituent parallel Treebank for a *****Statistical Machine Translation (SMT)***** task by labelling each node of the Deep Syntactic Tree (DST) with our refined semantic cases. | ||
| 2014.amta-researchers.5 In this paper, we address the problem of extracting and integrating bilingual terminology into a *****Statistical Machine Translation (SMT)***** system for a Computer Aided Translation (CAT) tool scenario. | ||
| Multiword expressions (MWEs | 10 | |
| W17-1709 *****Multiword expressions (MWEs*****) pose a problem for lexicalist theories like Lexical Functional Grammar (LFG), since they are prima facie counterexamples to a strong form of the lexical integrity principle, which entails that a lexical item can only be realised as a single, syntactically atomic word. | ||
| L14-1433 *****Multiword expressions (MWEs*****) are quite frequent in languages such as English, but their diversity, the scarcity of individual MWE types, and contextual ambiguity have presented obstacles to corpus-based studies and NLP systems addressing them as a class. | ||
| 2020.readi-1.3 *****Multiword expressions (MWEs*****) were shown to be useful in a number of NLP tasks. | ||
| J17-4005 *****Multiword expressions (MWEs*****) are a class of linguistic forms spanning conventional word boundaries that are both idiosyncratic and pervasive across different languages. | ||
| W19-5101 *****Multiword expressions (MWEs*****) feature prominently in the mental lexicon of native speakers (Jackendoff, 1997) in all languages and domains, from informal to technical contexts (Biber et al., 1999), with about four MWEs being produced per minute of discourse (Glucksberg, 1989). | ||
| Cross-lingual | 10 | |
| 2021.newsum-1.5 *****Cross-lingual***** summarization is a challenging task for which there are no cross-lingual scientific resources currently available. | ||
| 2020.sigmorphon-1.22 *****Cross-lingual***** transfer between typologically related languages has been proven successful for the task of morphological inflection. | ||
| 2021.acl-long.244 *****Cross-lingual***** transfer has improved greatly through multi-lingual language model pretraining, reducing the need for parallel data and increasing absolute performance. | ||
| P19-1311 *****Cross-lingual***** transfer is an effective way to build syntactic analysis tools in low-resource languages. | ||
| W19-4327 *****Cross-lingual***** embeddings aim to represent words in multiple languages in a shared vector space by capturing semantic similarities across languages. | ||
| aspect-based sentiment | 10 | |
| W19-3641 This paper presents our experimental work on exploring the potential of neural network models developed for *****aspect-based sentiment***** analysis for entity-level adverse drug reaction (ADR) classification. | ||
| W19-0413 In this paper, we propose a language-agnostic deep neural network architecture for *****aspect-based sentiment***** analysis. | ||
| P17-1036 Aspect extraction is an important and challenging task in *****aspect-based sentiment***** analysis. | ||
| D19-1464 Due to their inherent capability in semantic alignment of aspects and their context words, attention mechanism and Convolutional Neural Networks (CNNs) are widely applied for *****aspect-based sentiment***** classification. | ||
| L16-1465 The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as *****aspect-based sentiment***** analysis. | ||
| pre-trained language models (PLMs | 10 | |
| 2021.naacl-main.463 Existing *****pre-trained language models (PLMs*****) are often computationally expensive in inference, making them impractical in various resource-limited real-world applications. | ||
| 2021.emnlp-main.110 While *****pre-trained language models (PLMs*****) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. | ||
| 2021.acl-srw.27 Although the development of *****pre-trained language models (PLMs*****) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary (vocab) for these Chinese PLMs remains the one provided by Google Chinese BERT (CITATION), which is based on Chinese characters (chars). | ||
| 2021.acl-long.491 Event extraction (EE) has considerably benefited from *****pre-trained language models (PLMs*****) by fine-tuning. | ||
| 2021.sustainlp-1.10 Large *****pre-trained language models (PLMs*****) have led to great success on various commonsense question answering (QA) tasks in an end-to-end fashion. | ||
| low-resource language | 10 | |
| 2020.sltu-1.24 The aim of this paper is to present a framework developed for crowdsourcing sentiment annotation for the *****low-resource language***** Luxembourgish. | ||
| D19-1446 Neural machine translation, which achieves near human-level performance in some languages, strongly relies on large amounts of parallel sentences, which hinders its applicability to *****low-resource language***** pairs. | ||
| W18-3405 In this paper, we investigate the effectiveness of training a multimodal neural machine translation (MNMT) system with image features for a *****low-resource language***** pair, Hindi and English, using synthetic data. | ||
| 2021.mtsummit-research.6 Massively multilingual machine translation (MT) has shown impressive capabilities, including zero- and few-shot translation between *****low-resource language***** pairs. | ||
| 2021.emnlp-main.129 Back-translation (BT) of target monolingual corpora is a widely used data augmentation strategy for neural machine translation (NMT), especially for *****low-resource language***** pairs. | ||
| SemEval-2017 Task | 10 | |
| S17-2081 This paper describes our approach for *****SemEval-2017 Task***** 8. | ||
| S17-2167 This paper describes our participation in *****SemEval-2017 Task***** 10. | ||
| S17-2172 This paper describes our TTI-COIN system that participated in *****SemEval-2017 Task***** 10. | ||
| S17-2045 This paper presents the system in *****SemEval-2017 Task***** 3, Community Question Answering (CQA). | ||
| S17-2041 This paper describes Sew-Embed, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of the *****SemEval-2017 Task***** 2. | ||
| eye-tracking | 10 | |
| D19-1160 We explore whether it is possible to leverage *****eye-tracking***** data in an RNN dependency parser (for English) when such information is only available during training, i.e. | ||
| N19-1001 Previous research shows that *****eye-tracking***** data contains information about the lexical and syntactic properties of text, which can be used to improve natural language processing models. | ||
| D17-1107 We present a machine learning analysis of *****eye-tracking***** data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia. | ||
| W17-1710 This study investigates the processing of idiomatic variants through an *****eye-tracking***** experiment. | ||
| 2021.cmcl-1.11 This paper describes Team Ohio State's approach to the CMCL 2021 Shared Task, the goal of which is to predict five *****eye-tracking***** features from naturalistic self-paced reading corpora. | ||
| professional | 10 | |
| 2015.lilt-12.3 Here we use computational methods to compare the stylistic features of 359 English poems written by 19th-century *****professional***** poets, Imagist poets, contemporary professional poets, and contemporary amateur poets. | ||
| 2020.wmt-1.41 Even though sentence-centric metrics are used widely in machine translation evaluation, document-level performance is at least equally important for *****professional***** usage. | ||
| 2012.amta-papers.22 This paper addresses the problem of reliably measuring productivity gains by *****professional***** translators working with a machine translation enhanced computer assisted translation tool. | ||
| L16-1318 With the increasing amount of audiovisual and digital data deriving from televisual and radiophonic sources, *****professional***** archives such as INA, France's national audiovisual institute, acknowledge a growing need for efficient indexing tools. | ||
| N18-1195 Using a case study, we show that variation in oral reading rate across passages for *****professional***** narrators is consistent across readers, and much of it can be explained using features of the texts being read. | ||
| transition-based dependency | 10 | |
| 2020.findings-emnlp.294 We propose the Graph2Graph Transformer architecture for conditioning on and predicting arbitrary graphs, and apply it to the challenging task of *****transition-based dependency***** parsing. | ||
| N18-2066 Because the most common transition systems are projective, training a *****transition-based dependency***** parser often implies either ignoring or rewriting the non-projective training examples, which has an adverse impact on accuracy. | ||
| Q14-1010 We develop parsing oracles for two *****transition-based dependency***** parsers, including the arc-standard parser, solving a problem that was left open in (Goldberg and Nivre, 2013). | ||
| D17-1002 We first present a minimal feature set for *****transition-based dependency***** parsing, continuing a recent trend started by Kiperwasser and Goldberg (2016a) and Cross and Huang (2016a) of using bi-directional LSTM features. | ||
| E17-2051 This paper formalizes a sound extension of dynamic oracles to global training, in the frame of *****transition-based dependency***** parsers. | ||
| collaborative | 10 | |
| L16-1342 Crowdsourcing is an arising *****collaborative***** approach applicable, among many other applications, to the area of language and speech processing. | ||
| L14-1412 Crowdsourcing is an emerging *****collaborative***** approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources. | ||
| D19-1218 We study a *****collaborative***** scenario where a user not only instructs a system to complete tasks, but also acts alongside it. | ||
| 2020.coling-main.164 Commitments and requests are a hallmark of *****collaborative***** communication, especially in team settings. | ||
| L08-1348 This paper focuses on different aspects of *****collaborative***** work used to create the electronic version of a dictionary in paper format, edited and printed by the Romanian Academy during the last century. | ||
| cultural | 10 | |
| 2021.latechclfl-1.2 Although olfactory references play a crucial role in our *****cultural***** memory, only a few works in NLP have tried to capture them from a computational perspective. | ||
| W18-4310 The HEI System is a system for events annotation and temporal reasoning in Natural Language Texts and media, mainly oriented to texts of historical and *****cultural***** contents available on the Web. | ||
| 2021.semeval-1.36 Humor and Offense are highly subjective due to multiple word senses, *****cultural***** knowledge, and pragmatic competence. | ||
| W18-4507 Measuring similarity is a basic task in information retrieval, and now often a building-block for more complex arguments about *****cultural***** change. | ||
| 2020.lrec-1.102 Obituaries contain information about people's values across times and cultures, which makes them a useful resource for exploring *****cultural***** history. | ||
| human-computer | 10 | |
| 2021.wat-1.10 With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of *****human-computer***** interaction. | ||
| D18-1507 Computational detection and understanding of empathy is an important factor in advancing *****human-computer***** interaction. | ||
| 2020.sltu-1.37 It is known that Automatic Speech Recognition (ASR) is very useful for *****human-computer***** interaction in all the human languages. | ||
| 2020.lrec-1.288 Recognizing spatial relations and reasoning about them is essential in multiple applications including navigation, direction giving and *****human-computer***** interaction in general. | ||
| C18-1253 Incrementality is ubiquitous in human-human interaction and beneficial for *****human-computer***** interaction. | ||
| natural-language | 10 | |
| W17-3511 For situated agents to effectively engage in *****natural-language***** interactions with humans, they must be able to refer to entities such as people, locations, and objects. | ||
| W17-2603 We propose a recurrent neural model that generates *****natural-language***** questions from documents, conditioned on answers. | ||
| L16-1409 Morphological analysis is a fundamental task in *****natural-language***** processing, which is used in other NLP applications such as part-of-speech tagging, syntactic parsing, information retrieval, machine translation, etc. | ||
| 2020.emnlp-main.469 Complex question-answering (CQA) involves answering complex *****natural-language***** questions on a knowledge base (KB). | ||
| K17-1034 We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more *****natural-language***** questions with each relation slot. | ||
| Machine reading comprehension (MRC | 10 | |
| 2020.coling-main.237 *****Machine reading comprehension (MRC*****) is one of the most critical yet challenging tasks in natural language understanding (NLU), where both syntax and semantics information of text are essential components for text understanding. | ||
| P18-1178 *****Machine reading comprehension (MRC*****) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine. | ||
| 2021.ranlp-1.51 *****Machine reading comprehension (MRC*****) is one of the most challenging tasks in the natural language processing domain. | ||
| 2021.eacl-main.137 *****Machine reading comprehension (MRC*****) has received considerable attention as a benchmark for natural language understanding. | ||
| 2021.ccl-1.95 *****Machine reading comprehension (MRC*****) is a typical natural language processing (NLP) task and has developed rapidly in the last few years. | ||
| Text style | 10 | |
| D19-1325 *****Text style***** transfer without parallel data has achieved some practical success. | ||
| 2020.inlg-1.17 *****Text style***** transfer aims to change an input sentence to an output sentence by changing its text style while preserving the content. | ||
| 2021.naacl-main.171 *****Text style***** transfer aims to controllably generate text with targeted stylistic changes while keeping the core meaning of the source sentence constant. | ||
| D19-1322 *****Text style***** transfer is the task of transferring the style of text having certain stylistic attributes, while preserving non-stylistic or content information. | ||
| N19-1320 *****Text style***** transfer rephrases a text from a source style (e.g., informal) to a target style (e.g., formal) while keeping its original meaning. | ||
| each | 10 | |
| L14-1726 translators or lay-users of machine translations, can get quality predictions (or internal features of the framework) for translations without having to install the toolkit, obtain resources or build prediction models; (ii) it significantly improves over the previous runtime performance by keeping resources (such as language models) in memory; (iii) it provides an option for users to submit the source text only and automatically obtain translations from Bing Translator; (iv) it provides a ranking of multiple translations submitted by users for *****each***** source text according to their estimated quality. | ||
| P19-1277 Previous studies on this topic adopt prototypical networks, which calculate the embedding vector of a query instance and the prototype vector of the support set for *****each***** relation candidate independently. | ||
| E17-1112 Specifically, we create word embeddings of English and Japanese and map the Japanese embeddings into the English space so that we can calculate the similarity of each Japanese word and *****each***** English word. | ||
| P19-1481 QG systems are typically built assuming access to a large number of training instances where *****each***** instance is a question and its corresponding answer. | ||
| 2021.acl-long.529 Nevertheless, the majority of existing research into this task has focused on textual data, and the few recent inquiries into structured data have been for the closed-domain setting, where appropriate evidence for *****each***** claim is assumed to have already been retrieved. | ||
| World Wide | 10 | |
| L08-1438 Although the *****World Wide***** Web has lately become an important source to consult for the meaning of words, a number of technical terms related to high technology are not found on the Web. | ||
| 2019.gwc-1.1 The schema.org initiative was designed to introduce machine-readable metadata into the *****World Wide***** Web. | ||
| 1997.mtsummit-papers.21 The Java programming language started as the language Oak when the *****World Wide***** Web was still being developed at CERN. | ||
| L08-1255 This paper addresses a novel approach that integrates two different types of information resources: the *****World Wide***** Web and libraries. | ||
| 2020.acl-tutorials.6 The *****World Wide***** Web contains vast quantities of textual information in several forms: unstructured text, template-based semi-structured webpages (which present data in key-value pairs and lists), and tables. | ||
| semi-supervised | 10 | |
| 2021.emnlp-main.430 Unsupervised consistency training is a way of *****semi-supervised***** learning that encourages consistency in model predictions between the original and augmented data. | ||
| W17-2312 We propose in this paper a *****semi-supervised***** method for labeling terms of texts with concepts of a domain ontology. | ||
| 2021.woah-1.13 We present a data set consisting of German news articles labeled for political bias on a five-point scale in a *****semi-supervised***** way. | ||
| 2020.lrec-1.858 We propose ThaiLMCut, a *****semi-supervised***** approach for Thai word segmentation which utilizes a bi-directional character language model (LM) as a way to leverage useful linguistic knowledge from unlabeled data. | ||
| N18-2057 We propose a novel approach to *****semi-supervised***** learning for information extraction that uses ladder networks (Rasmus et al., 2015). | ||
| it | 10 | |
| 2020.sigmorphon-1.18 It is usually assumed, however, that a linguist working with inflectional examples could in principle develop a gold standard-level morphological analyzer and generator that would surpass a trained neural network model in accuracy of predictions, but that *****it***** may require significant amounts of human labor. | ||
| 2001.mtsummit-papers.19 Machine translation is usually regarded as a possible solution for this, but so far *****it***** cannot provide acceptable translations of unedited texts. | ||
| L16-1055 Because the temporal information is useful for various applications, *****it***** became important to develop a system of extracting the temporal information from the documents. | ||
| W17-4412 However, a non-negligible part of Uyghur text appearing in social media is unsystematically written with the Latin alphabet, and *****it***** continues to increase in size. | ||
| W19-0605 We show that topological information, extracted from the relationships between sentences, can be used in inference, namely *****it***** can be applied to the very difficult legal entailment given in the COLIEE 2018 data set. | ||
| Universal Dependencies (UD | 10 | |
| W17-6308 In applying word-based dependency parsing such as *****Universal Dependencies (UD*****) to Japanese, the uncertainty of word segmentation emerges for defining a word unit of the dependencies. | ||
| E17-1022 We present UDP, the first training-free parser for *****Universal Dependencies (UD*****). | ||
| 2021.iwpt-1.13 The introduction of pre-trained transformer-based contextualized word embeddings has led to considerable improvements in the accuracy of graph-based parsers for frameworks such as *****Universal Dependencies (UD*****). | ||
| W18-6003 Although treebanks annotated according to the guidelines of *****Universal Dependencies (UD*****) now exist for many languages, the goal of annotating the same phenomena in a cross-linguistically consistent fashion is not always met. | ||
| E17-5001 *****Universal Dependencies (UD*****) is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages. | ||
| specific | 10 | |
| 2020.wmt-1.113 It proposes two key strategies for quality estimation: (1) task-specific pretraining scheme, and (2) task-*****specific***** data augmentation. | ||
| W17-6306 We show how an L1-L2 parallel treebank, i.e., parse trees of non-native sentences aligned to the parse trees of their target hypotheses, can facilitate retrieval of sentences with *****specific***** learner errors. | ||
| 2021.nllp-1.9 Language models have proven to be very useful when adapted to *****specific***** domains. | ||
| 2020.findings-emnlp.270 In *****specific***** domains, such as procedural scientific text, human labeled data for shallow semantic parsing is especially limited and expensive to create. | ||
| D19-1279 By leveraging a multilingual BERT self-attention model pretrained on 104 languages, we found that fine-tuning it on all datasets concatenated together with simple softmax classifiers for each UD task can meet or exceed state-of-the-art UPOS, UFeats, Lemmas, (and especially) UAS, and LAS scores, without requiring any recurrent or language-*****specific***** components. | ||
| Exact | 9 | |
| D19-5827 Experimental results show the effectiveness of our methods, with an average ***** Exact ***** Match score of 56.59 and an average F1 score of 68.98, which significantly improves the BERT-Large baseline by 8.39 and 7.22, respectively. | ||
| N18-1071 ***** Exact ***** inference is prohibitively expensive for our globally normalized model. | ||
| N19-1335 ***** Exact ***** structured inference with neural network scoring functions is computationally challenging but several methods have been proposed for approximating inference. | ||
| 2021.ranlp-1.167 Our best model achieved very promising results of 83.82, 87.84, and 85.75 for Precision, Recall, and F1, respectively, for extracting Condition, Action, and Consequence clauses using ***** Exact ***** Match metric. | ||
| 2021.wnut-1.1 Our experiments show that simplification leads to up to 2.04% and 1.74% increase in ***** Exact ***** Match and F1, respectively | ||
| Implemented | 9 | |
| L10-1580 ***** Implemented ***** in the Protégé-OWL editor, the alignment of the two databases illustrates how wordnets can be turned into ontolexicons. | ||
| 2003.mtsummit-papers.41 ***** Implemented ***** as a workshop at a major conference in 2002, the experiment defined an evaluation task, description of the metrics, as well as test data consisting of human and machine translations of two texts. | ||
| K19-1093 ***** Implemented ***** with automatically labeled Twitter data, the proposed model has shown positive results employing different input formulations for representing the concerned information. | ||
| L16-1434 ***** Implemented ***** system characters were tested by asking users of the dialogue system to identify the source speakers in the corpus. | ||
| N19-4021 ***** Implemented ***** using HTML, Javascript, and CSS, the dictionary is set in an uncluttered interface and permits users to search in Yupik or in English for Yupik root words and Yupik derivational suffixes | ||
| degraded | 9 | |
| L10-1357 This increase correlates with the previously observed ***** degraded ***** performances. | ||
| 2020.coling-main.396 NMT systems suffer ***** degraded ***** performance when trained with mixed data having different features, such as noisy data and clean data. | ||
| P17-1154 Though a variety of neural network models have been proposed recently, previous models either depend on expensive phrase-level annotation, most of which has remarkably ***** degraded ***** performance when trained with only sentence-level annotation; or do not fully employ linguistic resources (e.g., sentiment lexicons, negation words, intensity words). | ||
| 2021.nlp4if-1.8 However, the state-of-the-art supervised models display ***** degraded ***** performance when they are evaluated on abusive comments that differ from the training corpus. | ||
| 2020.lrec-1.445 Neural machine translation (NMT) systems suffer ***** degraded ***** performance when trained with noisy data | ||
| editorials | 9 | |
| 2021.naacl-main.344 We argue that identifying and abstracting such natural language perspectives from ***** editorials ***** is a crucial step toward studying the implicit argumentation structure in news ***** editorials *****. | ||
| L10-1452 The collection includes broadcast news (BN) and broadcast conversation (BC) including talk shows, roundtable discussions, call-in shows, ***** editorials ***** and other conversational programs that focus on news and current events. | ||
| 2020.acl-main.287 To this end, we first compare content- and style-oriented classifiers on ***** editorials ***** from the liberal NYTimes with ideology-specific effect annotations. | ||
| 2020.peoples-1.4 We further analyze the importance of various text features with respect to the ***** editorials *****' impact, the readers' profile, and the ***** editorials *****' geographical scope. | ||
| D17-1141 Given nearly 29,000 argumentative ***** editorials ***** from the New York Times, we develop two machine learning models, one for determining an editorial's topic, and one for identifying evidence types in the editorial | ||
| simpler | 9 | |
| E17-1021 We show that our approach to cross-lingual dependency parsing is not only ***** simpler *****, but also achieves an absolute improvement of 2.25% averaged across 10 languages compared to the previous state of the art. | ||
| 2020.coling-main.134 Third, it has ***** simpler ***** structure but much higher parameter efficiency. | ||
| 2021.eacl-main.141 This setting makes training ***** simpler ***** than previous approaches by relying only on standard log-likelihood loss and mainstream models. | ||
| 2020.acl-main.420 A commonly held belief is that using ***** simpler ***** models as probes is better; the logic is that ***** simpler ***** models will identify linguistic structure, but not learn the task itself. | ||
| L14-1117 With growing interest in the creation and search of linguistic annotations that form general graphs (in contrast to formally ***** simpler *****, rooted trees), there also is an increased need for infrastructures that support the exploration of such representations, for example logical-form meaning representations or semantic dependency graphs | ||
| codes | 9 | |
| 2019.iwslt-1.4 All of our ***** codes ***** are publicly available in ESPnet. | ||
| 2021.semeval-1.61 We make our ***** codes ***** available at https://github.com/HardikArora17/SemEval-2021-INNOVATORS. | ||
| 2020.emnlp-main.60 Both our ***** codes ***** and the extracted denotation graphs on the Flickr30K and the COCO datasets are publicly available on https://sha-lab.github.io/DG. | ||
| 2020.acl-main.552 To encourage more instantiations in the future, we have released our ***** codes *****, processed dataset, as well as generated summaries in <https://github.com/maszhongming/MatchSum>. | ||
| W19-3814 We participated in the Gender Bias for Natural Language Processing 2019 shared task, and our ***** codes ***** are available online | ||
| contents | 9 | |
| W16-3911 Accurate event detection in social media is very challenging because user generated ***** contents ***** are extremely noisy and sparse in content. | ||
| 2020.findings-emnlp.94 Neural network (NN) based data2text models achieve state-of-the-art (SOTA) performance in most metrics, but they sometimes drop or modify the information in the input, and it is hard to control the generation ***** contents *****. | ||
| 2019.icon-1.27 In this paper, we propose two effective models based on deep learning for solving fake news detection problem in online news ***** contents ***** of multiple domains. | ||
| L08-1562 Furthermore, a method for the automatic recognition and resolution of temporal expressions in Spanish ***** contents ***** is provided, obtaining promising results when it is tested by means of an evaluation corpus. | ||
| 2019.icon-1.23 In Bengali, the news ***** contents ***** of the synthetic sentences are presented in such a rich way that it usually becomes difficult to identify the synthetic part of it | ||
| distinguish | 9 | |
| K17-1004 We show that a simple linear classifier informed by stylistic features is able to successfully ***** distinguish ***** among the three cases, without even looking at the story context. | ||
| D19-1391 Our goal is to instill an inductive bias in the parser to help it ***** distinguish ***** between spurious and correct programs. | ||
| S19-2057 We show that our neural ensemble systems can successfully ***** distinguish ***** three emotions (SAD, HAPPY, and ANGRY) and separate them from the rest (OTHERS) in a highly-imbalanced scenario. | ||
| L12-1240 In this paper, we present a work-in progress annotation model that allows a user to a) track hands/face b) extract features c) ***** distinguish ***** strokes from non-strokes. | ||
| 2019.gwc-1.12 This way, a distributional model can only ***** distinguish ***** between positive and negative examples through evidence for a target property | ||
| Existing datasets | 9 | |
| 2020.cmcl-1.5 ***** Existing datasets ***** for multiple languages also include linguistic variables such as the length and the frequency of lemmas in different corpora. | ||
| W19-8617 ***** Existing datasets ***** for keyphrase generation are only readily available for the scholarly domain and include non-expert annotations. | ||
| 2020.acl-main.541 ***** Existing datasets ***** for regular expression (regex) generation from natural language are limited in complexity; compared to regex tasks that users post on StackOverflow, the regexes in these datasets are simple, and the language used to describe them is not diverse. | ||
| P18-2124 ***** Existing datasets ***** either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. | ||
| 2021.emnlp-main.381 ***** Existing datasets ***** focus on high-level description of how research is carried out | ||
| mixture | 9 | |
| W18-2711 This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and ***** mixture ***** of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol NULL. | ||
| 2020.acl-main.587 Drawing connections between multi-head attention and ***** mixture ***** of experts, we propose the ***** mixture ***** of attentive experts model (MAE). | ||
| 2012.amta-commercial.6 This paper evaluates the performance of two adaptive techniques based on log-linear and ***** mixture ***** models on data from the legal domain in real-world settings. | ||
| 2021.emnlp-main.710 In particular, we combine domain adaptation techniques such as ***** mixture ***** of experts and domain-adversarial training with label embeddings, and we demonstrate sizable performance gains over strong baselines, both (i) in-domain, i.e., for seen targets, and (ii) out-of-domain, i.e., for unseen targets. | ||
| 2021.acl-long.273 Since the distribution of outlier utterances is arbitrary and unknown in the training stage, existing methods commonly rely on strong assumptions on data distribution such as ***** mixture ***** of Gaussians to make inference, resulting in either complex multi-step training procedures or hand-crafted rules such as confidence threshold selection for outlier detection | ||
| IMDB | 9 | |
| E17-1096 Significant improvements are observed in classifying the ***** IMDB ***** movie review dataset. | ||
| D18-1284 The models encode dialogue snippets from ***** IMDB ***** into representations that can capture the various categories of film characters. | ||
| W19-3802 We explore the ***** IMDB ***** movie review dataset and 9 different corpora from Project Gutenberg. | ||
| P19-1398 We demonstrate the effectiveness of CVDD quantitatively as well as qualitatively on the well-known Reuters, 20 Newsgroups, and ***** IMDB ***** Movie Reviews datasets. | ||
| W19-4824 The attack methods are evaluated on the Convolutional Neural Network (CNN) sentiment classifier trained on the ***** IMDB ***** movie review dataset | ||
| submodular | 9 | |
| 2021.naacl-industry.39 We compare four widely used SSL techniques, Pseudo-label (PL), Knowledge Distillation (KD), Virtual Adversarial Training (VAT) and Cross-View Training (CVT) in conjunction with two data selection methods including committee-based selection and ***** submodular ***** optimization based selection. | ||
| N18-1157 This approach is known to have three advantages: its applicability to many useful ***** submodular ***** objective functions, the efficiency of the greedy algorithm, and the provable performance guarantee. | ||
| D17-1114 Finally, all the multi-modal aspects are considered to generate the textural summary by maximizing the salience, non-redundancy, readability and coverage through budgeted optimization of ***** submodular ***** functions. | ||
| E17-2074 Our approach builds on the graph-of-words representation of text and leverages the k-core decomposition algorithm and properties of ***** submodular ***** functions. | ||
| Q18-1015 Second, our novel taxonomy guided, ***** submodular *****, active learning method for collecting annotations about rare entities (e.g., oriole, a bird) is 6x more effective at inferring further new facts about them than multiple active learning baselines | ||
| correlating | 9 | |
| 2020.cmcl-1.5 An intrinsic evaluation is proposed, ***** correlating ***** estimated Italian lemmas' AoA with English lemmas' AoA. | ||
| 2021.naacl-main.353 Error analysis reveals a variability in the ability of neural model to capture different phonological changes, ***** correlating ***** with the complexity of the changes. | ||
| 2021.acl-long.130 We propose a framework for computationally measuring uptake, by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically-motivated comparison of different unsupervised measures and (4) ***** correlating ***** these measures with educational outcomes. | ||
| 2008.amta-govandcom.22 In this paper we argue for a framework of practices to describe the PE process by ***** correlating ***** data obtained in laboratory experiments and augmented by additional data from different resources such as interviews and mathematical prediction models with the tasks fulfilled, and to model the identified process in a multi-facetted fashion as a basis for the implementation of a human PE-aware interactive software system. | ||
| L14-1523 Such classifiers are built automatically by parallel corpus analysis: Creating subcorpora for each translation of a 1:n package, and identifying ***** correlating ***** concepts in these subcorpora as features of the classifier | ||
| motivations | 9 | |
| 2020.lrec-1.161 This relatively new form of fact-checking receives a fair amount of attention from academics, with current research focusing mostly on journalists' ***** motivations ***** for publishing post-hoc fact-checks, the effects of fact-checking on the perceived accuracy of false claims, and the creation of computational tools for automatic fact-checking. | ||
| I17-1078 Applying this model on a large quantity of tweets collected before, after, and on election day reveals ***** motivations ***** and patterns of inflammatory language. | ||
| 2021.naacl-main.64 These contain natural language ***** motivations ***** paired with in-game goals and human demonstrations; completing a quest might require dialogue or actions (or both). | ||
| 2020.dmr-1.9 It reveals the semantic ***** motivations ***** that lead to constructions being metaphorically extended. | ||
| 2020.acl-main.658 We review ***** motivations *****, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them | ||
| bilinear | 9 | |
| P17-2051 Our evaluation suggests that the proposed models can better incorporate side information than previously proposed combinations of ***** bilinear ***** models with convolutional neural networks, showing large improvements when scoring the plausibility of unobserved facts with associated textual mentions. | ||
| 2021.semeval-1.145 joint text and vision features by combining them with compact ***** bilinear ***** pooling, to automatically identify rhetorical and psychological disinformation techniques. | ||
| 2020.mwe-1.19 The dependency parse tree prediction is modelled by a linear layer and a ***** bilinear ***** one plus a tree CRF architecture on top of the shared BERT. | ||
| P19-1026 Despite of their successful performances, existing ***** bilinear ***** forms overlook the modeling of relation compositions, resulting in lacks of interpretability for reasoning on KG. | ||
| C16-1289 After that, we fully incorporate information of different linguistic units into a ***** bilinear ***** semantic similarity model | ||
| misalignment | 9 | |
| 2021.naacl-main.193 Second, to alleviate the temporal ***** misalignment ***** issue, our method incorporates an entropy minimization-based constrained attention loss, to encourage the model to automatically focus on the correct caption from a pool of candidate ASR captions. | ||
| 2021.humeval-1.8 We study this ***** misalignment ***** problem by surveying 10 randomly sampled papers published in ACL 2020 that report results with human evaluation. | ||
| 2020.coling-main.167 In each rectification-modulation layer, unlike existing methods directly conducting the cross-modal interaction, we first devise a rectification module to correct implicit attention ***** misalignment ***** which focuses on the wrong position during the cross-interaction process. | ||
| 2012.amta-papers.7 Although fixing errors, when applicable, is a preferable strategy to removal, its benefits only become apparent for fairly high ***** misalignment ***** rates. | ||
| 2000.amta-papers.4 Points that may cause ***** misalignment ***** are filtered using confidence bands of linear regression analysis instead of heuristics, which are not theoretically reliable | ||
| inferencing | 9 | |
| 1993.eamt-1.7 In this paper, I will show that TFF can also be used as a means to model finite automata (FA) and to perform certain types of logical ***** inferencing *****. | ||
| 2016.lilt-14.1 With the focus moving to questions of natural language understanding and ***** inferencing ***** as well as to sentiment and opinion analysis, it becomes necessary to distinguish between actual and envisioned eventualities and to draw conclusions about the attitude of the writer or speaker towards the eventualities referred to. | ||
| 2020.semeval-1.74 Commonsense reasoning is a challenging task in the domain of natural language understanding and systems augmented with it can improve performance in various other tasks such as reading comprehension, and ***** inferencing *****. | ||
| 2020.emnlp-main.606 HABERTOR inherits BERT's architecture, but is different in four aspects: (i) it generates its own vocabularies and is pre-trained from the scratch using the largest scale hatespeech dataset; (ii) it consists of Quaternion-based factorized components, resulting in a much smaller number of parameters, faster training and ***** inferencing *****, as well as less memory usage; (iii) it uses our proposed multi-source ensemble heads with a pooling layer for separate input sources, to further enhance its effectiveness; and (iv) it uses a regularized adversarial training with our proposed fine-grained and adaptive noise magnitude to enhance its robustness. | ||
| P16-5007 Specifically, we will go over various techniques in knowledge acquisition, representation, and ***** inferencing ***** has been proposed for text understanding, and we will describe massive structured and semi-structured data that have been made available in the recent decade that directly or indirectly encode human knowledge, turning the knowledge representation problems into a computational grand challenge with feasible solutions insight | ||
| multilingual dataset | 9 | |
| N19-1275 The experiments on a standard ***** multilingual dataset ***** for verbal MWEs show that our model outperforms the baselines not only in the case of discontinuous MWEs but also in overall F-score. | ||
| 2020.lrec-1.352 We detail our effort to scrape, clean, align, and utilize this ripe ***** multilingual dataset *****. | ||
| 2021.acl-short.86 In this work, we introduce : the largest publicly available ***** multilingual dataset ***** for factual verification of naturally existing real-world claims. | ||
| 2021.ltedi-1.10 For this task, we have used a Shared Task ***** multilingual dataset ***** on Hope Speech Detection for Equality, Diversity, and Inclusion (HopeEDI) for three languages English, code-switched Tamil and Malayalam. | ||
| D19-1165 On a massively ***** multilingual dataset ***** of 103 languages, our adaptation approach bridges the gap between individual bilingual models and one massively multilingual model for most language pairs, paving the way towards universal machine translation | ||
| “what | 9 | |
| 2020.emnlp-main.373 Inspired by inquiry-based discovery learning (Bruner, 1961), our approach inquires language models with a number of information seeking questions such as ***** “what ***** is the definition of...” to discover additional background knowledge. | ||
| 2020.lrec-1.742 For a long time, philosophers, linguists and scientists have been keen on finding an answer to the mind-bending question ***** “what ***** does abstract language look like?”, which has also sprung from the phenomenon of mental imagery and how this emerges in the mind. | ||
| 2020.emnlp-main.88 However, current machine reading comprehension benchmarks have practically no questions that test temporal phenomena, so systems trained on these benchmarks have no capacity to answer questions such as ***** “what ***** happened before/after [some event]?” | ||
| W18-6221 Essentially, these methods answers questions such as, ***** “what ***** is being talked about, regarding X”, and ***** “what ***** do people feel, regarding X”. | ||
| 2020.acl-demos.13 In addition, the GUI interface enables researchers with limited computer background to compose tools into NLP pipelines and then apply the pipelines on their own datasets in a ***** “what ***** you see is what you get” (WYSIWYG) way | ||
| 1B | 9 | |
| 2021.eacl-main.217 This includes Reviews2Movielens, mapping the ~***** 1B ***** word corpus of Amazon movie reviews (He and McAuley, 2016) to MovieLens tags (Harper and Konstan, 2016), as well as Reddit Movie Suggestions with natural language queries and corresponding community recommendations. | ||
| 2021.emnlp-main.396 Using over ***** 1B ***** comments collected from the largest communities on Reddit.com representing ~40% of Reddit activity, we demonstrate the efficacy of this approach to uncover complex ideological differences across multiple axes of polarization. | ||
| 2021.acl-long.90 We then draw learning curves that track the growth of these different measures of model ability with respect to pretraining data volume using the MiniBERTas, a group of RoBERTa models pretrained on 1M, 10M, 100M and ***** 1B ***** words. | ||
| 2020.emnlp-main.16 We pretrain RoBERTa from scratch on quantities of data ranging from 1M to ***** 1B ***** words and compare their performance on MSGS to the publicly available RoBERTa_BASE. | ||
| 2020.sdp-1.29 For Task ***** 1B *****, we follow previous submissions in applying methods that deal well with low resources and imbalanced classes | ||
| CommonGen | 9 | |
| 2021.gem-1.13 We participate in the modeling shared task where we submit outputs on four datasets for data-to-text generation, namely, DART, WebNLG (en), E2E and ***** CommonGen *****. | ||
| 2021.acl-long.9 Our experiments on the Common Sense Generation task (***** CommonGen *****) (Lin et al., 2020), End2end Restaurant Dialog task (E2ENLG) (Dusek et al., 2020) and Novel Object Captioning task (nocaps) | ||
| 2020.findings-emnlp.165 The ***** CommonGen ***** task is challenging because it inherently requires 1) relational reasoning with background commonsense knowledge and 2) compositional generalization ability to work on unseen concept combinations. | ||
| 2020.coling-main.182 We conduct experiment on ***** CommonGen ***** benchmark, experimental results show that our method significantly improves the performance on all the metrics. | ||
| 2021.gem-1.15 In a current experiment we were testing ***** CommonGen ***** dataset for structure-to-text task from GEM living benchmark with the constraint based POINTER model. | ||
| subregular | 9 | |
| W19-4223 This paper demonstrates that there are regular functions that are not weakly deterministic, and, because all attested processes are weakly deterministic, supports the ***** subregular ***** hypothesis. | ||
| W19-3901 We find that LSTMs function like counter machines and relate convolutional networks to the ***** subregular ***** hierarchy. | ||
| 2021.sigmorphon-1.19 We describe the learner and show how to parameterize it to induce unrestricted regular languages, as well as how to restrict it to certain ***** subregular ***** classes such as Strictly k-Local and Strictly k-Piecewise languages. | ||
| W19-4225 This paper defines a ***** subregular ***** class of functions called the tier-based synchronized strictly local (TSSL) functions. | ||
| W19-4216 This paper situates culminative unbounded stress systems within the ***** subregular ***** hierarchy for functions. | ||
| Kaggle | 9 | |
| 2020.wosp-1.12 The tasks were hosted on ***** Kaggle *****, and the participated systems were evaluated using the macro f-score. | ||
| W19-3801 263 teams competed via a ***** Kaggle ***** competition, with the winning system achieving logloss of 0.13667 and near gender parity. | ||
| 2021.semeval-1.9 We collected 10,000 texts from Twitter and the ***** Kaggle ***** Short Jokes dataset, and had each annotated for humor and offense by 20 annotators aged 18-70. | ||
| 2021.emnlp-main.487 Our model has up to 11% performance improvement over state-of-the-art results on the benchmark SemEval-2013 datasets, and surpasses custom approaches designed for a ***** Kaggle ***** challenge, demonstrating its generality | ||
| 2021.teachingnlp-1.8 The resource creates all the needed resources to create a classroom competition that engages and inspires your students on the free, self-service ***** Kaggle ***** platform. | ||
| experimentation | 9 | |
| 2020.stoc-1.4 The next hype of data ***** experimentation ***** is going to be heavily dependent on privacy preserving techniques mainly as it's going to be a legal responsibility rather than a mere social responsibility. | ||
| N19-1381 Some researchers resort to human judgment ***** experimentation ***** for assessing response quality, which is expensive, time consuming, and not scalable. | ||
| 2021.alta-1.19 These are the pathogens that are the focus of direct ***** experimentation ***** in the research, rather than those that are referred to for context or as playing secondary roles. | ||
| P19-3029 Flambé is a machine learning ***** experimentation ***** framework built to accelerate the entire research life cycle. | ||
| W17-1912 The TTCS^ℰ has been evaluated in a preliminary ***** experimentation ***** on a conceptual similarity task | ||
| etymologically | 9 | |
| L10-1215 Relationships are derived from WordNet and Wiktionary to allow users to discover semantically related words, ***** etymologically ***** related words, alternative spellings, as well as misspellings. | ||
| 2020.conll-1.21 Compositional distributional models of meaning have been argued to deal well with finer shades of meaning variation known as polysemy, but are not so well equipped to handle word senses that are ***** etymologically ***** unrelated, or homonymy. | ||
| W17-1210 As neither annotated corpora nor parallel corpora are electronically available for Rusyn, we propose to combine existing resources from the ***** etymologically ***** close Slavic languages Russian, Ukrainian, Slovak, and Polish and adapt them to Rusyn. | ||
| E17-1113 Most current approaches in phylogenetic linguistics require as input multilingual word lists partitioned into sets of ***** etymologically ***** related words (cognates). | ||
| L14-1619 In this paper, we describe our generic approach for transferring part-of-speech annotations from a resourced language towards an ***** etymologically ***** closely related non-resourced language, without using any bilingual (i.e., parallel) data | ||
| seen | 9 | |
| 2021.splurobonlp-1.5 Our neural agent improves strong baselines on the ***** seen ***** environments and shows competitive performance on the un***** seen ***** environments. | ||
| 2021.emnlp-main.746 However, as most of the existing methods do not achieve effective knowledge transfer to the target domain, they just fit the distribution of the ***** seen ***** slot and show poor performance on un***** seen ***** slot in the target domain. | ||
| 2021.naacl-main.272 In this paper, we formulate the zero-shot relation extraction problem by incorporating the text description of ***** seen ***** and un***** seen ***** relations. | ||
| C18-1185 Our approach can recognize previously un***** seen ***** NE categories while preserving the knowledge of the ***** seen ***** data. | ||
| 2020.acl-main.39 However, they are often unable to extrapolate patterns beyond the ***** seen ***** data, even when the abstractions required for such patterns are simple | ||
| IPA | 9 | |
| C16-1328 PanPhon is a database relating over 5,000 ***** IPA ***** segments to 21 subsegmental articulatory features. | ||
| 1997.mtsummit-papers.11 And other linguistic data recently developed in Japan, which includes the RWC text database and the simple sentence data by the CRL and ***** IPA *****. | ||
| 2020.lrec-1.369 ENGLAWI contains 752,769 articles encoding the full body of information included in Wiktionary: simple words, compounds and multiword expressions, lemmas and inflectional paradigms, etymologies, phonemic transcriptions in ***** IPA *****, definition glosses and usage examples, translations, semantic and morphological relations, spelling variants, etc. | ||
| L14-1131 Finally, we implement Arabic transcription technology (Brierley et al under review; Sawalha et al forthcoming) to create a qalqalah pronunciation guide where each word is transcribed phonetically in ***** IPA ***** and mapped to its chapter-verse ID. | ||
| D19-6121 We hypothesize that the sequences of ***** IPA ***** characters used to represent pronunciation do not capture its full nuance, especially when cleaned to facilitate machine learning | ||
| enriched | 9 | |
| P17-1118 Taken together, the results indicate that complex networks ***** enriched ***** with embedding is promising for detecting MCI in large-scale assessments. | ||
| L16-1203 The Nederlab project aims to bring together all digitized texts relevant to the Dutch national heritage, the history of the Dutch language and culture (circa 800 – present) in one user friendly and tool ***** enriched ***** open access web interface. | ||
| 2020.lrec-1.886 Second, we ***** enriched ***** the PCMEP and PLAEME, which adopted the annotation format of the PPCME2, with verb lemmas to undertake studies that fill the well-known data gap in the subperiod (1250–1350) of the PPCME2. | ||
| L10-1265 We show how we ***** enriched ***** the Lefff syntactic lexicon so that it provides an account for quotation verbs heading a quotation parenthetical, especially those extracted from a news wire corpus. | ||
| 2020.lrec-1.627 For the purposes of this study, we ***** enriched ***** the existing annotation in the treebank, with a further level that includes irony activators | ||
| complement | 9 | |
| D19-5315 We show these proxy evaluation methods ***** complement ***** each other regarding error handling, coverage, interpretability, and scope, and thus altogether contribute to the observation of the relative strength of existing models. | ||
| W19-2902 found that a preceding infinitival “to” increases the use of following optional “to”, but the use of “to” in the ***** complement ***** of help is reduced following “to help”. | ||
| 2020.emnlp-main.274 Recent works have shown that generative data augmentation, where synthetic samples generated from deep generative models ***** complement ***** the training dataset, benefit NLP tasks. | ||
| 2020.eamt-1.21 Furthermore, when we model this source- and target-language syntactic information together as the conditional context, both types ***** complement ***** each other and our fully syntax-informed INMT model statistically significantly reduces human efforts in a French–to–English translation task, achieving 4.30 points absolute (corresponding to 9.18% relative) improvement in terms of word prediction accuracy (WPA) and 4.84 points absolute (corresponding to 9.01% relative) reduction in terms of word stroke ratio (WSR) over the baseline. | ||
| D19-1420 Moreover, we explicitly control the rationale ***** complement ***** via an adversary so as not to leave any useful information out of the selection | ||
| randomized | 9 | |
| N19-1336 On a case-study of the negotiation agent developed by (Lewis et al., 2017), our attacks reduced the average advantage of rewards between the attacker and the trained RL-based agent from 2.68 to -5.76 on a scale from -10 to 10 for ***** randomized ***** goals. | ||
| P17-1097 The new algorithm guards against spurious programs by combining the systematic search traditionally employed in MML with the ***** randomized ***** exploration of RL, and by updating parameters such that probability is spread more evenly across consistent programs. | ||
| W19-2606 Choosing from these, they can access ***** randomized ***** control trials (RCTs) describing individual studies. | ||
| P17-2080 However, these latent variables are highly ***** randomized *****, leading to uncontrollable generated responses. | ||
| 2021.emnlp-main.830 It approximates the softmax attention with ***** randomized ***** or heuristic feature maps, but can be difficult to train and may yield suboptimal accuracy | ||
| subsumption | 9 | |
| L06-1267 Unlike other methods, such as those reported in (Brants, 1995) or (Tufis & Dragomirescu, 2004), which assume a ***** subsumption ***** relation between the considered tagsets, and as such they aim at minimizing the tagsets by eliminating the feature-value redundancy, this method is applicable for completely unrelated tagsets. | ||
| Q14-1006 To compute these denotational similarities, we construct a denotation graph, i.e. a ***** subsumption ***** hierarchy over constituents and their denotations, based on a large corpus of 30K images and 150K descriptive captions. | ||
| D17-3001 This tutorial examines the theoretical foundations of ***** subsumption *****, and its practical embodiment through IsA relations compiled manually or extracted automatically. | ||
| W89-0241 allows us to take full advantage of both BU and TD aspects of a unification-based grammar without incurring prohibitive overheads such as feature-structure comparison or ***** subsumption ***** checking. | ||
| Q17-1032 In contrast to the standard approach of simple ranking by association measure, in our model n-grams are arranged in a lattice structure based on ***** subsumption ***** and overlap relationships, with nodes inhibiting other nodes in their vicinity when they are selected as a lexical item | ||
| aka | 9 | |
| D19-1391 Semantic parsing aims to map natural language utterances onto machine interpretable meaning representations, ***** aka ***** programs whose execution against a real-world environment produces a denotation. | ||
| 2020.iwslt-1.13 In this paper, we demonstrate our machine translation system applied for the Chinese-Japanese bidirectional translation task (***** aka *****. | ||
| P19-1412 Inferring speaker commitment (***** aka ***** event factuality) is crucial for information extraction and question answering. | ||
| 2020.acl-main.41 We use the norm (***** aka ***** length or module) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence. | ||
| P18-1143 We show that when training examples are sampled appropriately from this synthetic data and presented in certain order (***** aka ***** training curriculum) along with monolingual and real CM data, it can significantly reduce the perplexity of an RNN-based language model | ||
| degenerate | 9 | |
| 2021.naacl-main.121 We find that existing fact checking systems that perform well on claims in formal style significantly ***** degenerate ***** on colloquial claims with the same semantics. | ||
| 2020.emnlp-main.257 In this work, we ask whether non-isomorphism is also crucially a sign of ***** degenerate ***** word vector spaces. | ||
| 2021.naacl-main.209 We explore the possibility of training autoregressive machine translation models with latent alignment objectives, and observe that, in practice, this approach results in ***** degenerate ***** models. | ||
| 2021.acl-short.8 We formally prove that — save for the ***** degenerate ***** case — attention weights and leave-one-out values cannot be Shapley Values. | ||
| 2021.americasnlp-1.14 Word-level annotation results in ***** degenerate ***** trees for some Yupik sentences and often fails to capture syntactic relations that can be manifested at the morpheme level | ||
| harmonized | 9 | |
| L14-1474 This paper discusses a trial to build a multilingual ***** harmonized ***** dictionary that contains more than 40 languages, with special reference to Arabic which represents about 20% of the whole size of the dictionary. | ||
| 2020.findings-emnlp.6 To alleviate the imbalance issue, we extend the gradient ***** harmonized ***** mechanism used in object detection to the aspect-based sentiment analysis by adjusting the weight of each label dynamically. | ||
| W18-4917 We have evaluated and ***** harmonized ***** each annotation part to obtain a high annotated-quality corpus. | ||
| C16-1039 Similarly, we learn cross-register word embeddings from the ***** harmonized ***** Hindi and Urdu corpora to nullify their lexical divergences. | ||
| L14-1415 Future work will be dedicated to the comparison of the ***** harmonized ***** lexicon with German corpora annotated with polarity information | ||
| biomedicine | 9 | |
| 2021.emnlp-main.429 Extracting relations across large text spans has been relatively underexplored in NLP, but it is particularly important for high-value domains such as ***** biomedicine *****, where obtaining high recall of the latest findings is crucial for practical applications. | ||
| W17-2344 We recently developed Olelo, a QA system for ***** biomedicine ***** which includes various NLP components, such as question processing, document and passage retrieval, answer processing and multi-document summarization. | ||
| P17-5001 The tutorial will provide an accessible overview of ***** biomedicine *****, and does not presume knowledge in biology or healthcare. | ||
| L04-1155 In the field of ***** biomedicine *****, there is a critical need for automatic text processing. | ||
| W19-5006 Inspired by the success of the General Language Understanding Evaluation benchmark, we introduce the Biomedical Language Understanding Evaluation (BLUE) benchmark to facilitate research in the development of pre-training language representations in the ***** biomedicine ***** domain | ||
| scope | 9 | |
| 2020.lrec-1.704 Our model, referred to as NegBERT, achieves a token level F1 score on ***** scope ***** resolution of 92.36 on the Sherlock dataset, 95.68 on the BioScope Abstracts subcorpus, 91.24 on the BioScope Full Papers subcorpus, 90.95 on the SFU Review Corpus, outperforming the previous state-of-the-art systems by a significant margin. | ||
| 2021.naacl-main.227 In this paper, we propose a multi-task learning method to incorporate information from syntactic and semantic auxiliary tasks, including negation and speculation ***** scope ***** detection, to create English-language models that are more robust to these phenomena. | ||
| 2020.emnlp-main.356 The size, ***** scope ***** and detail of RxR dramatically expands the frontier for research on embodied language agents in photorealistic simulated environments. | ||
| 2021.isa-1.3 Literary texts feature a rich variety in expressing quantification, including a broad range of lexemes to express quantifiers and complex sentence structures to express the restrictor and the nuclear ***** scope ***** of a quantification. | ||
| 2020.emnlp-main.684 For example, the semantic capacity of artificial intelligence is higher than that of linear regression since artificial intelligence possesses a broader meaning ***** scope *****. | ||
| clustered | 9 | |
| 2021.rocling-1.29 For concept expansion of a given topic, related posts are collected from social media and ***** clustered ***** by word embeddings. | ||
| L10-1211 These resources can be deployed as local services or web services, even possible to be hosted in ***** clustered ***** machines to increase the performance, while users do not need to be aware of such differences. | ||
| 2021.ccl-1.88 Having realized this we propose a novel method that utilizes the topic knowledge implied by the ***** clustered ***** messages to aid in the comprehension of those short messages. | ||
| 2020.acl-main.695 Then, we bootstrap a neural transducer on top of the ***** clustered ***** data to predict words to realize the empty paradigm slots. | ||
| D18-1536 Then, the annotators are asked to assign the ***** clustered ***** questions into different intent categories | ||
| uncovering | 9 | |
| P18-1198 We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, ***** uncovering ***** intriguing properties of both encoders and training methods. | ||
| C18-1135 We build on sociolinguistic theories and focus on the relation between the spread of a novel term and the social role of the individuals who use it, ***** uncovering ***** characteristics of innovators and adopters. | ||
| P19-1039 We reveal subtle signs of concealing information in speech and text, compare and contrast them with those in deception detection literature, ***** uncovering ***** the link between concealing information and deception. | ||
| 2020.emnlp-main.171 Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously ***** uncovering ***** linguistic insights about the data. | ||
| 2020.lrec-1.234 Empirical results suggest that the proposed methodology can be meaningfully applied to parsing into graph-structured target representations, ***** uncovering ***** hitherto unknown properties of the different systems that can inform future development and cross-fertilization across approaches | ||
| tag | 9 | |
| 2020.findings-emnlp.325 However, these approaches have used heuristics or off-the-shelf models to first ***** tag ***** training stories with the desired type of plan, and then train generation models in a supervised fashion. | ||
| N19-1028 For each collected question-answer pair, we first ***** tag ***** all entities in each question and search for relevant predicates that bridge a ***** tag *****ged entity with the answer in Freebase. | ||
| L14-1565 In addition, they have discussed alternative ways of ***** tag ***** assignment in terms of bipartite ***** tag *****s (stem, token) for historical texts and tripartite ***** tag *****s (lexicon, morphology, distribution) for learner texts. | ||
| L14-1633 A non-negligible advantage of the first strategy for generating SD annotated texts is that semi-automatic extensions of the training resource are more easily and consistently carried out with respect to a reduced dependency ***** tag ***** set. | ||
| Q16-1018 These HMMs, which we call anchor HMMs, assume that each ***** tag ***** is associated with at least one word that can have no other ***** tag *****, which is a relatively benign condition for POS ***** tag *****ging (e.g., “the” is a word that appears only under the determiner ***** tag *****) | ||
| factored | 9 | |
| 2016.iwslt-1.3 Compared to the standard NMT system, ***** factored ***** architecture increases significantly the vocabulary coverage while decreasing the number of unknown words. | ||
| 2021.splurobonlp-1.4 Both our ***** factored ***** models and black-box baseline models perform quite well, but the ***** factored ***** models will enable reasoned explanations of spatial relation judgements. | ||
| P19-1149 LRN uses input and forget gates to handle long-range dependencies as well as gradient vanishing and explosion, with all parameter related calculations ***** factored ***** outside the recurrence. | ||
| L14-1205 Three MT systems are compared: (1) a baseline phrase-based SMT; (2) a tense-aware SMT system using the above predictions within a ***** factored ***** translation model; and (3) a system using oracle predictions from the aligned VPs. | ||
| 2008.amta-papers.3 Two string-to-chunks translation models are proposed: a ***** factored ***** model, which augments phrase-based SMT with layered dependencies, and a joint model, that extends the phrase translation table with microtags, i.e. per-word projections of chunk labels | ||
| observed | 9 | |
| 2020.acl-main.684 By leveraging the idea of inverse semantics from program synthesis to reason backwards from ***** observed ***** demonstrations, we ensure that all considered interpretations are consistent with executable actions in any context, thus simplifying the problem of search over logical forms. | ||
| 2021.semeval-1.27 Additionally, we perform a thorough ablative analysis and analyze our ***** observed ***** results. | ||
| 2021.emnlp-main.408 Notably, although prior work has emphasized the use of clever augmentation techniques including back-translation, we find that enforcing consistency between predictions assigned to ***** observed ***** and randomly substituted words often yields comparable (or greater) benefits compared to these more complex perturbation models. | ||
| D19-1287 This behavior is highly regular and even sensitive to local syntactic context, however it differs crucially from ***** observed ***** human behavior. | ||
| 2021.acl-long.403 Abductive reasoning aims at inferring the most plausible explanation for ***** observed ***** events, which would play critical roles in various NLP applications, such as reading comprehension and question answering | ||
| propagate | 9 | |
| 2021.emnlp-main.820 All datasets for training and evaluating models for EL consist of convenience samples, such as news articles and tweets, that ***** propagate ***** the prior probability bias of the entity distribution towards more frequently occurring entities. | ||
| 2021.wassa-1.26 We include three separate transcription tools and show that while all automated transcriptions ***** propagate ***** errors that substantially impact downstream performance, the open-source tools fare worse than the paid tool, though not always straightforwardly, and word error rates do not correlate well with downstream performance. | ||
| 2021.acl-long.301 Pipelines are conceptually simple, but errors ***** propagate ***** from one component to the next, without later components being able to revise earlier decisions. | ||
| 2021.cl-4.26 Various deep neural networks have been proposed to jointly perform entity extraction and relation prediction, which only ***** propagate ***** information implicitly via representation learning. | ||
| 2020.findings-emnlp.350 To ***** propagate ***** and integrate information beyond the scope of the target function, we design a novel learning framework based on the bidirectional gated recurrent unit and a graph attention network with a pointer mechanism | ||
| premise | 9 | |
| N18-2017 Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (***** premise *****), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. | ||
| 2021.naacl-main.104 Our findings show that: (1) the relatively shorter length of ***** premise *****s in traditional NLI datasets is the primary challenge prohibiting usage in downstream applications (which do better with longer contexts); (2) this challenge can be addressed by automatically converting resource-rich reading comprehension datasets into longer-***** premise ***** NLI datasets; and (3) models trained on the converted, longer-***** premise ***** datasets outperform those trained using short-***** premise ***** traditional NLI datasets on downstream tasks primarily due to the difference in ***** premise ***** lengths. | ||
| 2021.emnlp-main.504 We tackle the task of generating the implicit ***** premise ***** in an enthymeme, which requires not only an understanding of the stated conclusion and ***** premise ***** but also additional inferences that could depend on commonsense knowledge. | ||
| 2021.argmining-1.9 In this paper, we address (A) the identification of argumentative discourse units and (B) their classification as major position or ***** premise ***** in German public participation processes. | ||
| 2014.lilt-9.10 The model contains a formally defined interpreted lexicon, which specifies the inventory of symbols and the supported semantic operators, and an informally defined annotation scheme that instructs annotators in which way to bind words and constructions from a given pair of ***** premise ***** and hypothesis to the interpreted lexicon | ||
| DeftEval | 9 | |
| 2020.semeval-1.93 Our proposed model produces better results than BERT and achieves comparable results to BERT with fine tuned language model in ***** DeftEval ***** (Task 6 of SemEval 2020), a shared task of classifying whether a sentence contains a definition or not (Subtask 1). | ||
| 2020.semeval-1.91 This paper describes participation in ***** DeftEval ***** 2020 (part of SemEval sharing task competition), and is focused on the sentence classification. | ||
| 2020.semeval-1.59 ***** DeftEval ***** was split into three subtasks: sentence classification, sequence labeling and relation classification. | ||
| 2020.semeval-1.92 In this paper we describe our submissions to the ***** DeftEval ***** shared task (SemEval-2020 Task 6), which is evaluated on an English textbook corpus. | ||
| 2020.semeval-1.41 In this work, we present ***** DeftEval *****, a SemEval shared task in which participants must extract definitions from free text using a term-definition pair corpus that reflects the complex reality of definitions in natural language | ||
| Finnish | 9 | |
| W18-6410 We participate in the multilingual subtrack with a system trained under the constrained condition to translate from English to both ***** Finnish ***** and Estonian. | ||
| 2020.lrec-1.735 We use synchronized 120 fps motion capture and 50 fps eye tracking data from two native signers to investigate the temporal order in which the dominant hand, the head, the chest and the eyes start producing overt constructed action from regular narration in seven short ***** Finnish ***** Sign Language stories. | ||
| 2021.nodalida-main.42 The exception to this is ***** Finnish *****, which we assume is due to inferior translation quality. | ||
| L16-1574 Beyond the specific task at hand the approach will also be useful for the analysis of other types of spaceless text such as Twitter hashtags and texts in agglutinative or spaceless languages like ***** Finnish ***** or Chinese | ||
| 2020.lrec-1.224 We use state-of-the-art neural machine translation models trained on the Opusparcus corpus to generate paraphrases in six languages: German, English, *****Finnish*****, French, Russian, and Swedish. | ||
| monologue | 9 | |
| C16-1037 Evaluation proves that inter-annotator agreement reaches satisfactory values, from 0.60 to 0.80 Cohen's kappa, while the prosody tagger achieves acceptable recall and f-measure figures for five spontaneous samples used in the evaluation of ***** monologue ***** and dialogue formats in English and Spanish. | ||
| P18-1052 Our model achieves state of the art results on standard coherence assessment tasks in ***** monologue ***** and conversations outperforming existing models. | ||
| L10-1079 The corpus was constructed as a resource for extracting rules for automated generation of dialogue from ***** monologue *****. | ||
| L06-1056 Recently, *****monologue***** data such as lecture and commentary by professionals have been considered as valuable intellectual resources, and have been gathering attention. | ||
| 2020.acl-main.133 Recent dialogue coherence models use the coherence features designed for *****monologue***** texts, e.g. | ||
| 2D | 9 | |
| 2021.naacl-main.419 We argue that it's crucial to take advantage of both ***** 2D ***** and 3D signals to resolve challenging occlusion issues. | ||
| C16-1329 To integrate the features on both dimensions of the matrix, this paper explores applying ***** 2D ***** max pooling operation to obtain a fixed-length representation of the text. | ||
| L10-1116 By now, our formalism only uses hand ***** 2D ***** locations, we finally discuss about the way of integrating other parameters as hand shape or facial expression in our framework. | ||
| 2020.findings-emnlp.408 SR-Bert can decode both explicit and implicit language to ***** 2D ***** spatial arrangements, generalizes to out-of-sample data to a reasonable extent and can generate complete abstract scenes if paired with a clip-arts predictor. | ||
| L14-1096 Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on *****2D***** modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success. | ||
| multinomial | 9 | |
| I17-1055 We first apply two state-of-the-art lightly-supervised classification models, generalized expectation (GE) criteria (Druck et al., 2008) and ***** multinomial ***** naive Bayes (MNB) with priors (Settles, 2011) to one-class classification where the user only needs to provide a small list of labeled words for the target class. | ||
| 2021.emnlp-main.315 Based on the original token embeddings, we construct a ***** multinomial ***** mixture for augmenting virtual data embeddings, where a masked language model guarantees the semantic relevance and the Gaussian noise provides the augmentation diversity. | ||
| 2021.cmcl-1.4 We propose an analysis method based on ***** multinomial ***** processing tree models (Batchelder and Riefer, 1999) which can correct for this bias and allows for a separation of parameters of theoretical importance from nuisance parameters. | ||
| D19-5621 Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than ***** multinomial ***** distributions over a vocabulary) attain faster training with fewer learnable parameters. | ||
| D18-1160 Unsupervised learning of syntactic structure is typically performed using generative models with discrete latent variables and ***** multinomial ***** parameters | ||
| intention | 9 | |
| L06-1053 This corpus with speech ***** intention ***** tag could be widely used from basic research to applications of spoken dialogue. | ||
| W18-5026 Thus, this mechanism can be effective to express the system's ***** intention ***** to make social distance to the user closer; however, an actual effect of this method is not investigated enough when introduced to the dialog system. | ||
| 2020.nlp4convai-1.5 In hope of facilitating and democratizing research focused on ***** intention ***** detection, we release our code, as well as a new challenging single-domain intent detection dataset comprising 13,083 annotated examples over 77 intents. | ||
| 2020.coling-main.310 Experiments on two SLU benchmark datasets, including two tasks (***** intention ***** detection and slot filling) and federated learning settings (horizontal federated learning, vertical federated learning and federated transfer learning), demonstrate the effectiveness and universality of our approach. | ||
| L12-1515 Compared to prior models, ours is a novel synthesis of the notions of goal, plan, ***** intention *****, outcome, affect and time that is amenable to corpus annotation | ||
| morphophonological | 9 | |
| 2021.naacl-main.435 Despite the performance, the opacity of neural models makes it difficult to determine whether complex generalizations are learned, or whether a kind of separate rote memorization of each ***** morphophonological ***** process takes place. | ||
| 2020.sigmorphon-1.18 We conclude that a significant development effort by trained linguists to analyze and model ***** morphophonological ***** patterns are required in order to surpass the accuracy of neural models. | ||
| W18-5819 The goal of this paper is to explore possible interactions between information-theoretic methods and deterministic linguistic knowledge and to examine some ways in which both can be used in tandem to extract phonological and ***** morphophonological ***** patterns from a small annotated dataset. | ||
| L14-1149 This paper reports on the design and implementation of a *****morphophonological***** analyzer for Lakota, a member of the Siouan language family. | ||
| L14-1686 We describe a morphological analyzer for the Swahili language, written in an extension of XFST/LEXC intended for the easy declaration of *****morphophonological***** patterns and importation of lexical resources. | ||
| projection | 9 | |
| W19-5438 We present a very simple method for parallel text cleaning of low-resource languages, based on ***** projection ***** of word embeddings trained on large monolingual corpora in high-resource languages. | ||
| D19-1084 We evaluate the model extrinsically on data ***** projection ***** for Chinese NER, showing that our alignments lead to higher performance when used to project NER tags from English to Chinese. | ||
| L06-1081 To train the system, we used a semantically annotated corpus that was produced by ***** projection ***** across parallel corpora. | ||
| 2021.acl-long.60 In this paper, we propose a knowledge ***** projection ***** paradigm for event relation extraction: projecting discourse knowledge to narratives by exploiting the commonalities between them. | ||
| E17-2087 We present a new approach to extraction of hypernyms based on ***** projection ***** learning and word embeddings | ||
| suggestion | 9 | |
| S19-2208 This paper describes the participation of DBMS-KU team in the SemEval 2019 Task 9, that is, ***** suggestion ***** mining from online reviews and forums. | ||
| S19-2210 To that end, we classify sentences of a given review as ***** suggestion ***** or not ***** suggestion ***** so that readers of the reviews do not have to go through thousands of reviews but instead can focus on actionable items and applicable ***** suggestion *****s. | ||
| S19-2151 The dataset is made freely available to help advance the research in ***** suggestion ***** mining, and reproduce the systems submitted under this task. | ||
| S19-2216 This paper describes our system participated in Task 9 of SemEval-2019: the task is focused on *****suggestion***** mining and it aims to classify given sentences into suggestion and non-suggestion classes in domain specific and cross domain training setting respectively. | ||
| L12-1161 The layers of annotation provide: 1) quality assessments for 830 correction suggestions for translations into English, at the segment level, and 2) 814 usefulness assessments for English-Spanish and English-French translation suggestions, a *****suggestion***** being useful if it contains at least local clues that can be used to improve translation quality. | ||
| Winograd Schema | 9 | |
| 2020.acl-main.671 We propose a self-supervised method to solve Pronoun Disambiguation and ***** Winograd Schema ***** Challenge problems. | ||
| N19-1094 We propose two neural network models based on the Deep Structured Semantic Models (DSSM) framework to tackle two classic commonsense reasoning tasks, ***** Winograd Schema ***** challenges (WSC) and Pronoun Disambiguation (PDP). | ||
| P19-1477 We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and ***** Winograd Schema ***** Challenge. | ||
| 2020.acl-main.679 Large-scale pretrained language models are the major driving force behind recent improvements in performance on the *****Winograd Schema***** Challenge, a widely employed test of commonsense reasoning ability. | ||
| W18-4105 The *****Winograd Schema***** Challenge targets pronominal anaphora resolution problems which require the application of cognitive inference in combination with world knowledge. | ||
| EmotionX | 9 | |
| W18-3508 This paper describes the participation of SmartDubai_NLP team in ***** EmotionX ***** shared task and our investigation to detect the emotions from utterance using Neural networks and Natural language understanding. | ||
| W18-3509 This paper presents our system submitted to the ***** EmotionX ***** challenge. | ||
| W18-3506 In this paper, we propose a self-attentive bidirectional long short-term memory (SA-BiLSTM) network to predict multiple emotions for the ***** EmotionX ***** challenge. | ||
| W18-3505 This paper describes an overview of the Dialogue Emotion Recognition Challenge, ***** EmotionX *****, at the Sixth SocialNLP Workshop, which recognizes the emotion of each utterance in dialogues | ||
| W18-3510 This paper addresses the problem of automatic recognition of emotions in conversational text datasets for the *****EmotionX***** challenge. | ||
| morphologically annotated | 9 | |
| L16-1361 TermoPL accepts as input ***** morphologically annotated ***** and disambiguated domain texts and creates a list of terms, the top part of which comprises domain terminology. | ||
| L16-1207 The resources include corpora for each dialect which have been ***** morphologically annotated *****, and morphological analyzers for each dialect which are derived from these corpora. | ||
| 2020.udw-1.21 This treebank aims to create a syntactically and ***** morphologically annotated ***** resource for further research. | ||
| L12-1083 The system needs information about parts of speech and grammatical categories coded in the word-forms, i.e. it takes ***** morphologically annotated ***** text as input, but requires no information about the syntactic structure of the sentence. | ||
| D19-6207 The paper addresses experiments to expand ad hoc ambiguous abbreviations in medical notes on the basis of ***** morphologically annotated ***** texts, without using additional domain resources | ||
| ranked | 9 | |
| C18-1019 Despite individual users' differences in vocabulary knowledge, current systems do not consider these variations; rather, they are trained to find one optimal substitution or ***** ranked ***** list of substitutions for all users. | ||
| D19-1256 We have published the ***** ranked ***** documents so that they can be used off-the-shelf to improve downstream decision models. | ||
| 2021.semeval-1.18 In our experiments, we investigated the possibility of using an all-words fine-grained word sense disambiguation system trained purely on sense-annotated data in English and draw predictions on the semantic equivalence of words in context based on the similarity of the ***** ranked ***** lists of the (English) WordNet synsets returned for the target words decisions had to be made for. | ||
| R17-1086 The experimental results are evaluated, and the efficacy of the ***** ranked ***** features discussed. | ||
| 2019.jeptalnrecital-tia.3 Using the ***** ranked ***** sentences, we propose two approaches to embed documents and show their performances with respect to two baselines | ||
| Training | 9 | |
| 2012.amta-tutorials.2 It covers managing bilingual and monolingual data using Corpus Manager, training hybrid or statistical translation models with ***** Training ***** Manager, and evaluating quality using automatic scoring and side-by-side translation comparison. | ||
| N18-1050 *****Training***** data for sentiment analysis are abundant in multiple domains, yet scarce for other domains. | ||
| 2020.acl-main.213 *****Training***** objectives based on predictive coding have recently been shown to be very effective at learning meaningful representations from unlabeled speech. | ||
| I17-2046 *****Training***** efficiency is one of the main problems for Neural Machine Translation (NMT). | ||
| 2020.acl-main.690 *****Training***** data for NLP tasks often exhibits gender bias in that fewer sentences refer to women than to men. | ||
| sexist | 9 | |
| W18-5114 In this paper, we use ConceptNet and Wikidata to improve ***** sexist ***** tweet classification by two methods (1) text augmentation and (2) text generation. | ||
| 2020.coling-main.552 While extensive popularity of online social media platforms has made information dissemination faster, it has also resulted in widespread online abuse of different types like hate speech, offensive language, ***** sexist ***** and racist opinions, etc. | ||
| 2020.acl-main.373 In a context of offensive content mediation on social media now regulated by European laws, it is important not only to be able to automatically detect ***** sexist ***** content but also to identify if a message with a ***** sexist ***** content is really ***** sexist ***** or is a story of sexism experienced by a woman. | ||
| W19-3638 In the midst of a generation widely exposed to and influenced by media entertainment, the NLP research community has shown relatively little attention on the ***** sexist ***** comments in popular TV series. | ||
| W18-5101 The advent of social media in recent years has fed into some highly undesirable phenomena such as proliferation of offensive language, hate speech, ***** sexist ***** remarks, etc. on the Internet | ||
| lightweight | 9 | |
| 2021.emnlp-main.192 FLiText introduces an inspirer network together with the consistency regularization framework, which leverages a generalized regular constraint on the ***** lightweight ***** models for efficient SSL. | ||
| K17-3025 The parser is fast, ***** lightweight ***** and effective on big treebanks. | ||
| E17-3006 Together with this article we offer a very fast, ***** lightweight *****, open source parser with support for various output formats. | ||
| 2020.coling-main.432 Extensive experiments and analyses on the ***** lightweight ***** models show that our proposed methods achieve significantly higher scores and substantially improve the robustness of both intent detection and slot filling. | ||
| 2020.codi-1.13 Our new tree self-attention is based on document-level discourse information, extending the recently proposed “Synthesizer” framework with another ***** lightweight ***** alternative | ||
| uniform information density | 9 | |
| 2021.emnlp-main.74 The ***** uniform information density ***** (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal. | ||
| 2020.emnlp-main.170 We find that beam search enforces ***** uniform information density ***** in text, a property motivated by cognitive science. | ||
| 2021.acl-long.404 The ***** uniform information density ***** (UID) hypothesis, which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal, has gained traction in psycholinguistics as an explanation for certain syntactic, morphological, and prosodic choices. | ||
| 2021.acl-long.405 Moreover, this discrepancy between English and Japanese is further explored from the perspective of (non-)***** uniform information density *****. | ||
| W16-4120 Our results show that surprisal does not predict the word order choice by itself, but is a significant predictor when used in a measure of ***** uniform information density ***** (UID). | ||
| semantic types | 9 | |
| W17-2406 We show how English FrameNet and other Frame Semantic resources can be represented as sets of interconnected graphs of frames, frame elements, ***** semantic types *****, and annotated instances of them in text. | ||
| W17-5102 Less emphasis has been placed on analyzing the ***** semantic types ***** of argument components. | ||
| 2020.lrec-1.598 It follows a three step projection, validation with alignment, completion methodology consisting on the manual validation and expansion of the outcome of an automatic projection procedure of synsets and their hypernym relations, followed by another automatic procedure that transferred the relations of remaining ***** semantic types ***** across wordnets of different languages. | ||
| L14-1499 This paper focuses on our efforts at defining the ***** semantic types ***** and varieties of caused motion constructions (CMCs) through an iterative annotation process and establishing annotation guidelines based on these criteria to aid in the production of a consistent and reliable annotation. | ||
| C16-1266 CSP models the narrative consistency between the predicate and preceding contexts of its arguments, in addition to the conventional SP based on ***** semantic types *****. | ||
| user satisfaction | 9 | |
| 2020.findings-emnlp.347 Current automated methods to estimate turn and dialogue level ***** user satisfaction ***** employ hand-crafted features and rely on complex annotation schemes, which reduce the generalizability of the trained models. | ||
| L12-1581 Another outcome of the experiment was the preliminary evaluation of the pronunciation learning service in terms of ***** user satisfaction *****, which would be difficult to conduct before integrating the HCI part. | ||
| L10-1398 In this paper, we propose an estimation method of ***** user satisfaction ***** for a spoken dialog system using an N-gram-based dialog history model. | ||
| 2020.aacl-main.89 Explainable recommendation is a good way to improve ***** user satisfaction *****. | ||
| 2021.emnlp-main.767 Unlike prior efforts in dialog mining, by utilizing local ***** user satisfaction ***** as a bridge, global satisfaction detector and handoff predictor can effectively exchange critical information. | ||
| lexical constraints | 9 | |
| 2020.emnlp-main.701 Lexically constrained generation requires the target sentence to satisfy some ***** lexical constraints *****, such as containing some specific words or being the paraphrase to a given sentence, which is very important in many real-world natural language generation applications. | ||
| 2021.naacl-main.339 Conditional text generation often requires ***** lexical constraints *****, i.e., which words should or shouldn't be included in the output text. | ||
| Q13-1034 We also investigate the use of rule-based morphological analyzers to provide hard or soft ***** lexical constraints ***** and the use of word clusters to tackle the sparsity of lexical features. | ||
| P17-1141 We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified ***** lexical constraints *****. | ||
| 2021.emnlp-main.413 Based on the observations, we then propose a simple yet effective framework to automatically extract, denoise, and enforce important input concepts as ***** lexical constraints *****. | ||
| complex natural language | 9 | |
| N18-2111 We examine whether computational models can capture this interaction, when both character attributes and actions are expressed as ***** complex natural language ***** descriptions. | ||
| 2020.findings-emnlp.292 Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various ***** complex natural language ***** tasks. | ||
| S17-1020 Semantic parsing shines at analyzing ***** complex natural language ***** that involves composition and computation over multiple pieces of evidence. | ||
| Q19-1012 Neural Program Induction (NPI) is a pragmatic approach toward modularizing the reasoning process by translating a ***** complex natural language ***** query into a multi-step executable program. | ||
| D18-1234 To address that, in this paper, we propose a State Transition-based approach to translate a ***** complex natural language ***** question N to a semantic query graph (SQG), which is used to match the underlying knowledge graph to find the answers to question N. In order to generate SQG, we propose four primitive operations (expand, fold, connect and merge) and a learning-based state transition approach. | ||
| creating | 9 | |
| 2020.wmt-1.51 Finally, we make use of additional monolingual data by ***** creating ***** synthetic parallel data through back-translation. | ||
| W17-1320 We describe the process of ***** creating ***** NUDAR, a Universal Dependency treebank for Arabic. | ||
| W17-8105 The E-platform integrates: 1/ an environment for ***** creating *****, organizing and maintaining electronic text archives, for extracting text corpora and aligning corpora; 2/ a linguistic database; 3/ a concordancer; 4/ a set of modules for the generation and editing of practice exercises for each text or corpus; 5/ functionalities for export from the platform and import to other educational platforms. | ||
| L16-1065 The paper describes the process of collecting, cleaning and ***** creating ***** the corpus. | ||
| 2021.nodalida-main.1 In this paper, we introduce a simple, fully automated pipeline for ***** creating ***** language-specific BERT models from Wikipedia data and introduce 42 new such models, most for languages up to now lacking dedicated deep neural language models. | ||
| learning semantic | 9 | |
| D17-2007 We introduce “KnowYourNyms?”, a web-based game for ***** learning semantic ***** relations. | ||
| S18-2032 Tree-structured LSTMs have shown advantages in ***** learning semantic ***** representations by exploiting syntactic information. | ||
| 2021.naacl-main.315 Training with CQA pairs helps our model ***** learning semantic ***** QA relevance and distant supervision enables learning of syntactic features as well as the nuances of user querying language. | ||
| K18-1051 Methods for ***** learning semantic ***** spaces, however, are mostly aimed at modelling similarity. | ||
| K19-1042 In this work, we examine LTAL for ***** learning semantic ***** representations, such as QA-SRL. | ||
| coherence modeling | 9 | |
| 2020.coling-main.594 We investigate this question in ***** coherence modeling *****. | ||
| 2021.eacl-main.308 Although ***** coherence modeling ***** has come a long way in developing novel models, their evaluation on downstream applications for which they are purportedly developed has largely been neglected. | ||
| 2020.lrec-1.210 Results from these evaluations show that except for certain extreme conditions, the recurrent graph neural network-based model is an optimal choice for ***** coherence modeling *****. | ||
| W18-5040 We also empirically examine two variants of ***** coherence modeling *****: order-oriented and topic-oriented negative sampling, showing that, of the two, topic-oriented negative sampling tends to be more effective. | ||
| 2020.lrec-1.134 In this paper, we focus on ***** coherence modeling ***** at the intra-discursive level and describe our approach to build a corpus of incoherent pairs of sentences. | ||
| unsupervised neural | 9 | |
| 2020.loresmt-1.10 In this work, we devise an ***** unsupervised neural ***** machine translation (UNMT) system consisting of a transformer based shared encoder and language specific decoders using denoising autoencoder and backtranslation with an additional Manipuri side multiple test reference. | ||
| P19-1119 Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped ***** unsupervised neural ***** machine translation (UNMT) achieve remarkable results in several language pairs. | ||
| K19-1027 In this paper, we alleviate the local optimality of back-translation by learning a policy (takes the form of an encoder-decoder and is defined by its parameters) with future rewarding under the reinforcement learning framework, which aims to optimize the global word predictions for ***** unsupervised neural ***** machine translation. | ||
| D19-6123 With that in mind, we train instances of the PRPN architecture (Shen et al., 2018)—one of these ***** unsupervised neural ***** network parsers—for Arabic, Chinese, English, and German. | ||
| 2020.wmt-1.128 Our core ***** unsupervised neural ***** machine translation (UNMT) system follows the strategy of Chronopoulou et al. | ||
| related work | 9 | |
| L10-1476 After a short introduction and a description of ***** related work *****, we illustrate the annotation process, including a description of the annotation methodology and the developed tool for the annotation process. | ||
| C16-1073 It differs from most of the ***** related work ***** in that it learns one semantic center embedding and one context bias instead of training multiple embeddings per word type. | ||
| 2020.emnlp-main.326 We empirically validate our claims by applying Spot The Bot to three domains, evaluating several state-of-the-art chat bots, and drawing comparisons to ***** related work *****. | ||
| I17-1024 However, most ***** related work ***** ignores the relatedness among word senses which actually plays an important role. | ||
| P19-1571 Most ***** related work *****s focus on using complicated compositionality functions to model SC while few works consider external knowledge in models. | ||
| reason | 9 | |
| 2020.pam-1.2 We use this model to develop a socio-semantic theory of conventionalised ***** reason *****ing patterns, known as topoi. | ||
| 2021.sigdial-1.9 For this ***** reason *****, we fully annotated only the test data and left the annotation of the training data incomplete. | ||
| C18-1227 For this ***** reason *****, we tackle the task of semantic searches of FE dictionaries. | ||
| 2020.acl-main.119 At each time step, our model performs multiple rounds of attention, ***** reason *****ing, and composition that aim to answer two critical questions: (1) which part of the input sequence to abstract; and (2) where in the output graph to construct the new concept. | ||
| W19-2505 Our experiments show that after applying a ***** reason *****able amount of semi-automatic postprocessing we can obtain high-quality aligned and annotated resources for a new language. | ||
| expansion | 9 | |
| W18-2313 Our method, evaluated using the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query ***** expansion *****, but also over baselines using human expert–assigned concept tags for the queries, run on top of a standard Okapi BM25–based document retrieval system. | ||
| L14-1031 These two methods (along with a naive hybrid approach combining the two) have been shown to significantly outperform a state-of-the-art resource ***** expansion ***** system at our pilot task of imageability ***** expansion *****. | ||
| 2020.isa-1.3 The code and its analytic ***** expansion *****s represent a cross-linguistically wide range of phenomena of languages and language structures. | ||
| C18-1221 We learn word embeddings for all words in the corpus and compare the averaged context vector of the words in the ***** expansion ***** of an acronym with the weighted average vector of the words in the context of the acronym. | ||
| 2021.emnlp-main.313 Extensive evaluations exhibit that TEMP outperforms prior state-of-the-art taxonomy ***** expansion ***** approaches by 14.3% in accuracy and 15.8% in mean reciprocal rank on three public benchmarks. | ||
| text features | 9 | |
| C18-1314 Most of the existing research explores different ***** text features ***** of reply comments on word level and ignores interactions between participants. | ||
| W17-1718 Lexical and syntactic con***** text features ***** derived from vector representations are shown to be more effective over traditional statistical measures to identify tokens of MWEs. | ||
| P19-1239 We treat ***** text features *****, image features and image attributes as three modalities and propose a multi-modal hierarchical fusion model to address this task. | ||
| Q14-1015 We present a probabilistic language model that captures temporal dynamics and conditions on arbitrary non-linguistic con***** text features *****. | ||
| W19-5913 While there has been much work in the language learning and assessment literature on human and automated scoring of essays and short constructed responses, there is little to no work examining ***** text features ***** for scoring of dialog data, particularly interactional aspects thereof, to assess conversational proficiency over and above constructed response skills. | ||
| unsupervised bilingual lexicon | 9 | |
| P17-1179 We carry out evaluation on the ***** unsupervised bilingual lexicon ***** induction task. | ||
| D17-1207 We demonstrate the success on the ***** unsupervised bilingual lexicon ***** induction task. | ||
| 2021.naacl-main.465 In this paper, we exploit ***** unsupervised bilingual lexicon ***** induction (BLI) to map training questions in source language into those in target language as augmented training data, which circumvents language inconsistency between training and inference. | ||
| 2021.naacl-main.39 Our experiments on ***** unsupervised bilingual lexicon ***** induction and cross-lingual document classification show that this method improves performance over previous single-mapping methods, especially for distant languages. | ||
| P19-1308 The task of ***** unsupervised bilingual lexicon ***** induction (UBLI) aims to induce word translations from monolingual corpora in two languages. | ||
| minimum description length | 9 | |
| 2013.iwslt-papers.15 In total, we see a jump in BLEU score, from 17.53 for a standalone ***** minimum description length ***** baseline with no category learning, to 20.93 when incorporating category induction on a Chinese–English translation task. | ||
| 2020.emnlp-main.14 Instead, we propose an alternative to the standard probes, information-theoretic probing with ***** minimum description length ***** (MDL). | ||
| 2021.blackboxnlp-1.29 Instead, we adopt an alternative information-theoretic probing with ***** minimum description length *****, which has recently been proven to provide more reliable and informative results. | ||
| W17-1004 In this paper we introduce a new unsupervised approach for query-based extractive summarization, based on the ***** minimum description length ***** (MDL) principle that employs Krimp compression algorithm (Vreeken et al., 2011). | ||
| 2021.emnlp-main.748 We categorize common NLP tasks according to their causal direction and empirically assay the validity of the ICM principle for text data using ***** minimum description length *****. | ||
| sentence segmentation | 9 | |
| K18-2016 We introduce a complete neural pipeline system that takes raw text as input, and performs all tasks required by the shared task, ranging from tokenization and ***** sentence segmentation *****, to POS tagging and dependency parsing. | ||
| 2020.emnlp-main.207 Most publicly available parallel corpora for Bengali are not large enough; and have rather poor quality, mostly because of incorrect sentence alignments resulting from erroneous ***** sentence segmentation *****, and also because of a high volume of noise present in them. | ||
| 2020.lt4hala-1.8 Tasks such as lexical analysis need to be based on ***** sentence segmentation ***** because of the reason that a plenty of ancient books are not punctuated. | ||
| 2014.iwslt-papers.16 Syntactic parsing is a fundamental natural language processing technology that has proven useful in machine translation, language modeling, ***** sentence segmentation *****, and a number of other applications related to speech translation. | ||
| K18-2009 We also present a new ***** sentence segmentation ***** neural architecture based on Stack-LSTMs that was the 4th best overall. | ||
| wassa | 9 | |
| 2021.***** wassa *****-1.26 We explicitly examine the impact of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and personality detection. | ||
| 2021.***** wassa *****-1.10 This paper presents the results that were obtained from the WASSA 2021 shared task on predicting empathy and emotions. | ||
| 2021.***** wassa *****-1.8 We release an open-source Python library, so researchers can use a model trained on FEEL-IT for inferring both sentiments and emotions from Italian text. | ||
| 2021.***** wassa *****-1.22 Therefore, in this paper, we propose an approach using weighted k Nearest Neighbours (kNN), a simple, easy to implement, and explainable machine learning model. | ||
| 2021.***** wassa *****-1.30 Over the past few years, computational understanding and detection of emotional aspects in language have been vital in advancing human-computer interaction. | ||
| dependency annotation | 9 | |
| L08-1571 The texts are linguistically annotated using different layers from part of speech tags and morphological features to ***** dependency annotation *****. | ||
| L08-1150 The target corpus for the word-level ***** dependency annotation ***** is a large spontaneous Japanese-speech corpus, the Corpus of Spontaneous Japanese (CSJ). | ||
| Q19-1022 The auxiliary tasks provide syntactic information that is specific to semantic role labeling and are learned from training data (***** dependency annotation *****s) without relying on existing dependency parsers, which can be noisy (e.g., on out-of-domain data or infrequent constructions). | ||
| L10-1459 The syntactic annotation comprises POS annotation, Penn Treebank style constituent annotations as well as ***** dependency annotation *****s based on the dependencies of pennconverter. | ||
| 2020.framenet-1.9 We propose an approach for generating an accurate and consistent PropBank-annotated corpus, given a FrameNet-annotated corpus which has an underlying ***** dependency annotation ***** layer, namely, a parallel Universal Dependencies (UD) treebank. | ||
| targeted sentiment | 9 | |
| C16-1146 This work extends both aspect-based sentiment analysis – that assumes a single entity per document — and ***** targeted sentiment ***** analysis — that assumes a single sentiment towards a target entity. | ||
| E17-2091 However, they do not explicitly model the contribution of each word in a sentence with respect to ***** targeted sentiment ***** polarities. | ||
| 2021.naacl-main.227 The majority of work in ***** targeted sentiment ***** analysis has concentrated on finding better methods to improve the overall results. | ||
| P19-1051 Open-domain ***** targeted sentiment ***** analysis aims to detect opinion targets along with their sentiment polarities from a sentence. | ||
| C16-1233 We address the task of ***** targeted sentiment ***** as a means of understanding the sentiment that students hold toward courses and instructors, as expressed by students in their comments. | ||
| impact | 9 | |
| 2021.nlp4pos***** impact *****-1.14 We also release the first publicly available data set at the intersection of geopolitical relations and a raging pandemic in the context of India and Pakistan. | ||
| 2020.gebnlp-1.6 Furthermore, we analyze the effect of the debiasing techniques on downstream tasks which show a negligible ***** impact ***** on traditional embeddings and a 2% decrease in performance in contextualized embeddings. | ||
| 2020.acl-main.744 We start with a strong baseline (RoBERTa) to validate the ***** impact ***** of our approach, and show that our framework outperforms the baseline by learning to comply with declarative constraints. | ||
| 2021.wassa-1.26 We explicitly examine the ***** impact ***** of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and personality detection. | ||
| 2021.emnlp-main.386 We propose RDI, a context-aware methodology which takes into account the ***** impact ***** of secondary attributes on the model's predictions and increases sensitivity for secondary attributes over reweighted counterfactually augmented data. | ||
| statistical analysis | 9 | |
| L14-1153 Detailed ***** statistical analysis ***** have been done to compute the information about entropy, perplexity, vocabulary growth rate etc. | ||
| W19-6127 Read-aloud speech of Finnish-speaking and Russian-speaking parent-child pairs was subject to perceptual and multi-step instrumental and ***** statistical analysis *****. | ||
| 2021.nodalida-main.20 We perform manual error analysis and perform a ***** statistical analysis ***** of factors which affect how difficult specific tags are. | ||
| L14-1105 We interpret these findings as evidence for the claim that human association acquisition must be based on the ***** statistical analysis ***** of perceived language and that when producing associations the detected statistical regularities are replicated. | ||
| W19-8910 The system also performs the ***** statistical analysis ***** of the evaluation results and provides different visualization charts. | ||
| blocks world | 9 | |
| W18-1403 We implemented our models in a 3D ***** blocks world ***** and a room world in a computer graphics setting, and found that true/false judgments based on these models do not differ much more from human judgments than the latter differ from one another. | ||
| 2020.sigdial-1.16 A physical ***** blocks world *****, despite its relative simplicity, requires (in fully interactive form) a rich set of functional capabilities, ranging from vision to natural language understanding. | ||
| W18-1402 We demonstrate a system for understanding natural language utterances for structure description and placement in a situated ***** blocks world ***** context. | ||
| N19-1195 Over the last few years, there has been growing interest in learning models for physically grounded language understanding tasks, such as the popular ***** blocks world ***** domain. | ||
| 2021.emnlp-main.85 To enable theory of mind modeling in situated interactions, we introduce a fine-grained dataset of collaborative tasks performed by pairs of human subjects in the 3D virtual ***** blocks world ***** of Minecraft. | ||
| hybrid machine | 9 | |
| L12-1129 The taraXÜ project paves the way for wide usage of ***** hybrid machine ***** translation outputs through various feedback loops in system development. | ||
| 2012.amta-government.5 While we have made strides from rule based, to statistical and ***** hybrid machine ***** translation engines, we cannot rely solely on machine translation to overcome the language barrier and accomplish the mission. | ||
| W16-4504 This is useful for instance in ***** hybrid machine ***** translation systems which are usually more dependent on high-quality translation dictionaries. | ||
| 2008.amta-govandcom.21 The recognized utterances are normalized into Modern Standard Arabic and the output of this Modern Standard Arabic interlingua is then translated by a ***** hybrid machine ***** translation system, combining statistical and rule-based features. | ||
| L12-1231 In recent years, machine translation (MT) research has focused on investigating how ***** hybrid machine ***** translation as well as system combination approaches can be designed so that the resulting hybrid translations show an improvement over the individual component translations. | ||
| automatic scoring | 9 | |
| 2021.acl-long.96 We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of ***** automatic scoring ***** systems. | ||
| W17-4609 We consider the ***** automatic scoring ***** of a task for which both the content of the response as well its spoken fluency are important. | ||
| L06-1034 The project also aims to create quality-controlled resources such as domain-specific corpora, ***** automatic scoring ***** tool, etc. | ||
| 2012.amta-tutorials.2 It covers managing bilingual and monolingual data using Corpus Manager, training hybrid or statistical translation models with Training Manager, and evaluating quality using ***** automatic scoring ***** and side-by-side translation comparison. | ||
| W18-0550 We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an ***** automatic scoring ***** task are from two different languages. | ||
| effects | 9 | |
| 2020.emnlp-main.52 Specifically, we devise two components, prototype enhanced retrospection and hierarchical distillation, to mitigate the adverse ***** effects ***** of semantic ambiguity and class imbalance, respectively. | ||
| 2020.emnlp-main.615 We present an analysis of the ***** effects ***** that different methods have on the distributions of the TM. | ||
| 2021.sigdial-1.49 But, the ***** effects ***** of minimizing an alternate training objective that fosters a model to generate alternate response and score it on semantic similarity has not been well studied. | ||
| 2021.gebnlp-1.4 Technology companies have produced varied responses to concerns about the ***** effects ***** of the design of their conversational AI systems. | ||
| 2020.insights-1.11 The web offers a wealth of discourse data that help researchers from various fields analyze debates about current societal issues and gauge the ***** effects ***** on society of important phenomena such as misinformation spread. | ||
| synthetic data | 9 | |
| P19-1025 Our experiments on ***** synthetic data ***** confirm this observation. | ||
| D18-1040 In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of ***** synthetic data *****. | ||
| 2021.emnlp-main.492 We study six negative sampling strategies and apply them to the fine-tuning stage and, as a noteworthy novelty, to the ***** synthetic data ***** that we use for pre-training. | ||
| 2020.wmt-1.11 We then fine-tune the model with parallel data and in-domain ***** synthetic data *****, generated with iterative back-translation. | ||
| 2021.acl-long.266 To make the most of authentic and ***** synthetic data *****, we combine these complementary approaches as a new training strategy for further boosting NAT performance. | ||
| inference time | 9 | |
| W19-4814 The models are architecturally identical at ***** inference time *****, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has shown to be an effective method for encouraging more compositional solutions. | ||
| 2020.acl-main.537 To improve their efficiency with an assured model performance, we propose a novel speed-tunable FastBERT with adaptive ***** inference time *****. | ||
| 2020.findings-emnlp.232 During testing, BlockBERT saves 27.8% ***** inference time *****, while having comparable and sometimes better prediction accuracy, compared to an advanced BERT-based model, RoBERTa. | ||
| 2021.naacl-main.77 We propose a simple yet highly effective approach, LightningDOT that accelerates the ***** inference time ***** of ITR by thousands of times, without sacrificing accuracy. | ||
| N19-1236 For ***** inference time *****, we describe a method for selecting high-quality text plans for new inputs. | ||
| statistics | 9 | |
| 2020.lrec-1.856 We first draw on a small set of annotated data to compute spelling error ***** statistics *****. | ||
| L12-1108 Frequency lists and/or lexicons contain information about the words and their ***** statistics *****. | ||
| P18-1003 While we may similarly expect that co-occurrence ***** statistics ***** can be used to capture rich information about the relationships between different words, existing approaches for modeling such relationships are based on manipulating pre-trained word vectors. | ||
| 2005.mtsummit-papers.18 At the same time we also apply a ***** statistics *****-based approach, the well-known toolkit GIZA++, to the same test data. | ||
| Q16-1028 Many use nonlinear operations on co-occurrence ***** statistics *****, and have hand-tuned hyperparameters and reweighting methods. | ||
| key information | 9 | |
| P19-2035 Meanwhile, it is less stressed that attention mechanism is likely to “over-focus” on particular parts of a sentence, while ignoring positions which provide ***** key information ***** for judging the polarity. | ||
| S18-1110 To retrieve the majority of the relevant documents, we carefully designed our system to extract ***** key information ***** from each question and document pair. | ||
| C16-1309 The BINets are time-aware, efficient and can be easily analyzed for identifying ***** key information ***** (centroids). | ||
| C16-1021 Our approach involves finding a sequence of sentences that best represent the ***** key information ***** in a coherent way. | ||
| P18-1204 Interrogatives lexicalize the pattern of questioning, topic words address the ***** key information ***** for topic transition in dialogue, and ordinary words play syntactical and grammatical roles in making a natural sentence. | ||
| provide | 9 | |
| 2021.naacl-main.353 We ***** provide ***** a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional methods applied to this task so far. | ||
| L14-1466 Answers to this questionnaire are ***** provide *****d with DINASTI. | ||
| 2021.naacl-main.258 We ***** provide ***** supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which ***** provide *****s necessary but not sufficient conditions for aspectual com- position), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. | ||
| 2020.sltu-1.7 Overall, we show that the proposed multilingual graphemic hybrid ASR with various data augmentation can not only recognize any within training set languages, but also ***** provide ***** large ASR performance improvements. | ||
| track | 9 | |
| 2020.acl-main.54 The goal-oriented dialogue system needs to be optimized for ***** track *****ing the dialogue flow and carrying out an effective conversation under various situations to meet the user goal. | ||
| 2020.wmt-1.15 This paper describes Tilde's submission to the WMT2020 shared task on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained ***** track *****s. | ||
| 2021.naacl-main.342 In the pursuit of natural language understanding, there has been a long standing interest in ***** track *****ing state changes throughout narratives. | ||
| W17-1212 Our system reached the 7th position in the ***** track *****. | ||
| N18-5019 Borrowing from the scientific field of Decision Analysis, its essential role is to identify alternatives and criteria associated with a given decision, to keep ***** track ***** of who proposed them and of the expressed sentiment towards them. | ||
| objects | 9 | |
| 2020.lrec-1.710 People choose particular names for ***** objects *****, such as dog or puppy for a given dog. | ||
| K18-1051 Crucially, our method is fully unsupervised, requiring only a bag-of-words representation of the ***** objects ***** as input. | ||
| 2018.gwc-1.44 The goal is to establish a dataset that helps us to understand how people categorize everyday common ***** objects ***** via their parts, attributes, and context. | ||
| 2021.crac-1.1 We conduct extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications (e.g., all SOTA models struggle on correctly resolving pronouns to infrequent ***** objects *****). | ||
| L10-1397 This paper draws a distinction between discourse context ―other entities that have been mentioned in the dialogue― and visual context ―visually available ***** objects ***** near the intended referent. | ||
| multilingual representations | 9 | |
| W19-5202 Such experiments shed light on the ability of NMT encoders to learn ***** multilingual representations *****, in general. | ||
| 2020.emnlp-main.358 It has been shown that multilingual BERT (mBERT) yields high quality ***** multilingual representations ***** and enables effective zero-shot transfer. | ||
| 2021.naacl-main.42 Moreover, recent work shows that ***** multilingual representations ***** are surprisingly disjoint across languages, bringing additional challenges for transfer onto extremely low-resource languages. | ||
| D19-6106 We release a subset of the XNLI dataset translated into an additional 14 languages at https://www.github.com/salesforce/xnli_extension to assist further research into ***** multilingual representations *****. | ||
| 2021.emnlp-main.470 A simple but highly effective method “Language Information Removal (LIR)” factors out language identity information from semantic related components in ***** multilingual representations ***** pre-trained on multi-monolingual data. | ||
| success | 9 | |
| W19-4324 The most recent ***** success *****es are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. | ||
| R19-1021 The ability to produce high-quality publishable material is critical to academic ***** success ***** but many Post-Graduate students struggle to learn to do so. | ||
| C16-1121 We present a ***** success *****ful collaboration of word embeddings and co-training to tackle in the most difficult test case of semantic role labeling: predicting out-of-domain and unseen semantic frames. | ||
| 2003.mtsummit-papers.19 Divergence imposes a great challenge to the ***** success ***** of EBMT. | ||
| W18-5419 We find that, contrary to earlier results, disfluencies have very little impact on the task ***** success ***** of seq-to-seq models with attention. | ||
| assessing | 9 | |
| C18-1283 The recently increased focus on misinformation has stimulated research in fact checking, the task of ***** assessing ***** the truthfulness of a claim. | ||
| 2020.emnlp-main.381 Besides, we present the first results of comparing multilingual models in the translated diagnostic test set and offer the first steps to further expanding or ***** assessing ***** State-of-the-art models independently of language. | ||
| 2012.amta-wptp.4 This has created the need for a formal method of ***** assessing ***** the performance of post-editors in terms of whether they are able to produce post-edited target texts that follow project specifications. | ||
| 2021.eval4nlp-1.1 While such scores provide a general idea of the behavior of these systems, they ignore a key piece of information that can be useful for ***** assessing ***** progress and discerning remaining challenges: the relative difficulty of test instances. | ||
| C18-1027 Although it is known that conceptual complexity plays a significant role in text understanding, no attempts have been made at ***** assessing ***** it automatically. | ||
| temporal knowledge | 9 | |
| 2021.naacl-main.202 Moreover, we investigate the effect of the temporal dataset's time granularity on ***** temporal knowledge ***** graph completion. | ||
| 2020.coling-main.139 However, the recent availability of ***** temporal knowledge ***** graphs (TKGs) that contain time information for each fact created the need for reasoning over time in such TKGs. | ||
| 2021.naacl-main.451 Recently, ***** temporal knowledge ***** graph (TKG) embedding (TKGE) has emerged. | ||
| 2020.emnlp-main.593 There has recently been increasing interest in learning representations of ***** temporal knowledge ***** graphs (KGs), which record the dynamic relationships between entities over time. | ||
| 2020.emnlp-main.305 Research on ***** temporal knowledge ***** bases, which associate a relational fact (s, r, o) with a validity time period (or time instant), is in its early days. | ||
| numerical reasoning | 9 | |
| 2021.emnlp-main.300 In contrast to existing tasks on general domain, the finance domain includes complex ***** numerical reasoning ***** and understanding of heterogeneous representations. | ||
| 2021.acl-long.115 The pre-trained models are fine-tuned to produce fluent text that is enriched with ***** numerical reasoning *****. | ||
| 2021.iwpt-1.14 This talk will describe work that relies on compositionality in semantic parsing and in reading comprehension requiring ***** numerical reasoning *****. | ||
| 2020.acl-main.89 Consequently, existing models for ***** numerical reasoning ***** have used specialized architectures with limited flexibility. | ||
| 2021.emnlp-main.557 Specialized number representations in NLP have shown improvements on ***** numerical reasoning ***** tasks like arithmetic word problems and masked number prediction. | ||
| misinformation detection | 9 | |
| 2021.acl-srw.28 To access the performance of the CMTA multilingual model, we performed a comparative analysis of 8 monolingual model and CMTA for the ***** misinformation detection ***** task. | ||
| 2020.rdsm-1.3 The existing studies on ***** misinformation detection ***** hypothesise that the initial message is fake. | ||
| 2021.wanlp-1.8 Tweets were manually-annotated by veracity to support research on ***** misinformation detection *****, which is one of the major problems faced during a pandemic. | ||
| W18-5502 Therefore collecting well-balanced and carefully-assessed training data is a priority for developing robust ***** misinformation detection ***** systems. | ||
| 2021.nlp4if-1.18 This paper describes the TOKOFOU system, an ensemble model for ***** misinformation detection ***** tasks based on six different transformer-based pre-trained encoders, implemented in the context of the COVID-19 Infodemic Shared Task for English. | ||
| multilingual evaluation | 9 | |
| D17-1009 We perform experiments on question answering against Freebase and provide German and Spanish translations of the WebQuestions and GraphQuestions datasets to facilitate ***** multilingual evaluation *****. | ||
| 2012.amta-commercial.1 To understand the performance of MT, we have developed HeMT: an integrated ***** multilingual evaluation ***** platform (http://txcdk-v10.unt.edu/HeMT/) to facilitate human evaluation of machine translation. | ||
| 2021.acl-long.309 We also establish a new 23-language ***** multilingual evaluation ***** set. | ||
| L14-1240 In this paper, we describe a publicly available ***** multilingual evaluation ***** corpus for phrase-level Sentiment Analysis that can be used to evaluate real world applications in an industrial context. | ||
| 2021.emnlp-main.571 However, existing ***** multilingual evaluation ***** datasets that evaluate lexical semantics “in-context” have various limitations. | ||
| dense passage retrieval | 9 | |
| 2021.emnlp-main.227 Recent work has shown that ***** dense passage retrieval ***** techniques achieve better ranking accuracy in open-domain question answering compared to sparse retrieval techniques such as BM25, but at the cost of large space and memory requirements. | ||
| 2021.emnlp-main.148 In this paper, we present a novel approach to zero-shot slot filling that extends ***** dense passage retrieval ***** with hard negatives and robust training procedures for retrieval augmented generation models. | ||
| 2021.mrqa-1.4 Finally, we demonstrate the wide range of applications of GermanQuAD by adapting it to GermanDPR, a training dataset for ***** dense passage retrieval ***** (DPR), and train and evaluate one of the first non-English DPR models. | ||
| 2021.emnlp-main.224 In this paper, we propose a novel joint training approach for ***** dense passage retrieval ***** and passage reranking. | ||
| 2021.naacl-main.466 In open-domain question answering, ***** dense passage retrieval ***** has become a new paradigm to retrieve relevant passages for finding answers. | ||
| toponym resolution | 9 | |
| S19-2156 This paper describes DM-NLP's system for ***** toponym resolution ***** task at Semeval 2019. | ||
| S19-2155 We present the SemEval-2019 Task 12 which focuses on ***** toponym resolution ***** in scientific articles. | ||
| S19-2229 In order to facilitate the study on ***** toponym resolution *****, the SemEval 2019 task 12 is proposed, which contains three subtasks, i.e., toponym detection, toponym disambiguation and ***** toponym resolution *****. | ||
| R19-1106 We evaluate different approaches in terms of machine learning classifiers, text pre-processing and location extraction on the SemEval-2019 Task 12 dataset, compiled for ***** toponym resolution ***** in the bio-medical domain. | ||
| S19-2230 The SemEval-2019 Task 12 is ***** toponym resolution ***** in scientific papers. | ||
| offline speech | 9 | |
| 2020.iwslt-1.2 This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, ***** offline speech ***** translation and simultaneous speech translation. | ||
| 2020.iwslt-1.8 This paper describes FBK's participation in the IWSLT 2020 ***** offline speech ***** translation (ST) task. | ||
| 2020.iwslt-1.6 This paper describes the system that was submitted by DiDi Labs to the ***** offline speech ***** translation task for IWSLT 2020. | ||
| 2021.iwslt-1.6 We participate in the ***** offline speech ***** translation and text-to-text simultaneous translation tracks. | ||
| 2020.iwslt-1.10 This paper describes the University of Helsinki Language Technology group's participation in the IWSLT 2020 ***** offline speech ***** translation task, addressing the translation of English audio into German text. | ||
| release | 9 | |
| 2021.acl-long.334 Code will be ***** release *****d. | ||
| 2021.nlp4posimpact-1.14 We also ***** release ***** the first publicly available data set at the intersection of geopolitical relations and a raging pandemic in the context of India and Pakistan. | ||
| L06-1343 In this paper, we describe the second ***** release ***** of a suite of language analysers, developed over the last five years, called wraetlic, which includes tools for several partial parsing tasks, both for English and Spanish. | ||
| P19-1038 Nowadays, firm CEOs communicate information not only verbally through press ***** release *****s and financial reports, but also nonverbally through investor meetings and earnings conference calls. | ||
| C18-1073 We conduct our experiments on a recently ***** release *****d cloze-test dataset CLOTH (Xie et al., 2017), which consists of nearly 100k questions designed by professional teachers. | ||
| syntactic knowledge | 9 | |
| D19-1541 Inspired by the strong correlation between syntax and semantics, previous works pay much attention to improve SRL performance on exploiting ***** syntactic knowledge *****, achieving significant results. | ||
| N18-1123 This effectively injects semantic and/or ***** syntactic knowledge ***** into the translation model, which would otherwise require a large amount of training bitext to learn from. | ||
| L14-1109 In addition to the usual information on part-of-speech, gender, and number for nouns, offered by most dictionaries currently available, OpenLogos bilingual dictionaries have some distinctive features that make them unique: they contain cross-language morphological information (inflectional and derivational), semantico-***** syntactic knowledge *****, indication of the head word in multiword units, information about whether a source word corresponds to an homograph, information about verb auxiliaries, alternate words (i.e., predicate or process nouns), causatives, reflexivity, verb aspect, among others. | ||
| L14-1176 Taking multiword units into account, we propose an effective method to achieve MT hybridization based on the integration of semantico-***** syntactic knowledge ***** into SMT. | ||
| 2020.coling-main.116 It focuses on the contribution from ***** syntactic knowledge *****, exploiting linguistic resources where syntax is annotated according to the Universal Dependencies scheme. | ||
| bilingual word embedding | 9 | |
| C16-1300 We remove this constraint by introducing the Earth Mover's Distance into the training of ***** bilingual word embedding *****s. | ||
| P19-1312 State-of-the-art methods for unsupervised ***** bilingual word embedding *****s (BWE) train a mapping function that maps pre-trained monolingual word embeddings into a bilingual space. | ||
| N19-1188 Recent research has discovered that a shared ***** bilingual word embedding ***** space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. | ||
| 2018.iwslt-1.2 We present a simple approach relying on ***** bilingual word embedding *****s trained in an unsupervised fashion. | ||
| 2020.bucc-1.4 In this paper, we show how to use ***** bilingual word embedding *****s (BWE) to automatically create a corresponding table of meaning tags from two dictionaries in one language and examine the effectiveness of the method. | ||
| bilingual text | 9 | |
| 2020.lrec-1.325 We also investigate how to exploit additional data, such as ***** bilingual text ***** harvested from the web, or user dictionaries; we find that NMT can significantly improve in performance with the use of these additional data. | ||
| L14-1602 We provide SwissAdmin in three versions: (i) plain texts of approximately 6 to 8 million words per language; (ii) sentence-aligned ***** bilingual text *****s for each language pair; (iii) a part-of-speech-tagged version consisting of annotations in both the Universal tagset and the richer Fips tagset, along with grammatical functions, verb valencies and collocations. | ||
| Q16-1029 In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from ***** bilingual text *****s and a small amount of manual simplifications with multiple references. | ||
| L16-1436 Then tags such as speaker and discourse boundary from the script data are projected to its subtitle data via an information retrieval approach in order to map monolingual discourse to ***** bilingual text *****s. | ||
| P18-1084 We present a deep neural network that leverages images to improve *****bilingual text***** embeddings. | ||
| shown | 9 | |
| L06-1015 That is, the retrieved documents from both systems are ***** shown ***** to the judges without any information about the search techniques. | ||
| 2021.calcs-1.20 Multilingual language models have ***** shown ***** decent performance in multilingual and cross-lingual natural language understanding tasks. | ||
| R17-1061 We have ***** shown ***** that if component words of a phrase have each other as frequent associations, then this phrase can be considered as conventionalized. | ||
| L16-1360 Previous studies, based on the analysis of poetic texts, have ***** shown ***** that synaesthetic transfers tend to go from the lower toward the higher senses (e.g., sweet music vs. musical sweetness). | ||
| L14-1031 These two methods (along with a naive hybrid approach combining the two) have been ***** shown ***** to significantly outperform a state-of-the-art resource expansion system at our pilot task of imageability expansion. | ||
| text coherence | 9 | |
| 2020.lrec-1.210 In this paper, to evaluate ***** text coherence *****, we propose the paragraph ordering task as well as conducting sentence ordering. | ||
| L14-1199 Our main interest lies in opening the possibility to observe how ***** text coherence ***** is realized in different types (in the genre sense) of language data and, in the future, in exploring the ways of using genres as a feature for multi-sentence-level language technologies. | ||
| C16-2042 For ***** text coherence *****, we use a measure of agreement between a given and consecutive paragraph by tree kernel learning of their discourse trees. | ||
| W18-5040 We introduce an unsupervised learning method on ***** text coherence ***** that could produce numerical representations that improve implicit discourse relation recognition in a semi-supervised manner. | ||
| L08-1313 In this paper we present and discuss the results of a ***** text coherence ***** experiment performed on a small corpus of Romanian text from a number of alternative high school manuals. | ||
| address | 9 | |
| 2020.nlpcss-1.9 While this task has been closely associated with emotion prediction, we argue and show that identifying worry needs to be ***** address *****ed as a separate task given the unique challenges associated with it. | ||
| 2021.acl-long.363 To ***** address ***** the challenge that free-text relations are ambiguous, previous methods exploit neighbor entities and relations for additional context. | ||
| W19-4002 We ***** address ***** the non-trivial problem of evaluating the extractions produced by systems against the reference tuples, and share our evaluation script. | ||
| 2020.emnlp-main.352 To ***** address ***** this issue, we propose an alternative to the end-to-end classification on vocabulary. | ||
| N18-1124 To ***** address ***** this limitation, we propose a target-side-attentive residual recurrent network for decoding, where attention over previous words contributes directly to the prediction of the next word. | ||
| data sparseness | 9 | |
| J18-4008 Conventional topic models are ineffective for topic extraction from microblog messages, because the ***** data sparseness ***** exhibited in short messages lacking structure and contexts results in poor message-level word co-occurrence patterns. | ||
| W19-6147 At NODALIDA 2019 we demonstrate the method (called SHARP) online, showing how a traditional lexical-phonetic dictionary (with a very rich phone inventory) is transformed into an ASR-friendly database (with reduced phonetics, preventing ***** data sparseness *****). | ||
| L12-1026 Among these, there is the issue of ***** data sparseness *****, a problem that is particularly evident in cases such as our target language - Brazilian Portuguese - which is not only morphologically-rich, but relatively poor in NLP resources such as large, publicly available corpora. | ||
| L04-1259 Highly inflectional/agglutinative languages like Hungarian typically feature possible word forms in such a magnitude that automatic methods that provide morphosyntactic annotation on the basis of some training corpus often face the problem of ***** data sparseness *****. | ||
| 2006.jeptalnrecital-recitalposter.7 There are, however, two major problems with this approach : computational complexity and ***** data sparseness *****. | ||
| semantic graph | 9 | |
| 2019.gwc-1.32 An effective conversion method was proposed in the literature to obtain a lexical semantic space from a lexical ***** semantic graph *****, thus permitting to obtain WordNet embeddings from WordNets. | ||
| P17-1112 We propose a neural encoder-decoder transition-based parser which is the first full-coverage ***** semantic graph ***** parser for Minimal Recursion Semantics (MRS). | ||
| P17-1077 To build a ***** semantic graph ***** for a given sentence, we design new Maximum Subgraph algorithms to generate noncrossing graphs on each page, and a Lagrangian Relaxation-based algorithm to combine pages into a book. | ||
| 2021.emnlp-main.200 Particularly, we first build a document-level path for each output text with each sentence embedding as its node, and a revised self-organising map (SOM) is proposed to cluster similar nodes of a family of document-level paths to construct the directed ***** semantic graph *****. | ||
| 2021.emnlp-main.13 To capture the *****semantic graph***** structure from raw text, most existing summarization approaches are built on GNNs with a pre-trained model. | ||
| large pretrained | 9 | |
| 2021.ranlp-srw.3 With the recent success of ***** large pretrained ***** language models, we explore the possibility of using multilingual pretrained transformers like mBART and mT5 for exploring one such task of code-mixed Hinglish to English machine translation. | ||
| D19-6114 However, when such data is scarce, there remains a significant performance gap between ***** large pretrained ***** LMs and smaller task-specific models, even when training via distillation. | ||
| W19-8665 Recent advances in transfer-learning from ***** large pretrained ***** language models give rise to alternative approaches that do not rely on copy-attention and instead learn to generate concise and abstractive summaries. | ||
| 2021.nlp4convai-1.21 We experiment in detail with various controlled generation methods for ***** large pretrained ***** language models: specifically, conditional training, guided fine-tuning, and guided decoding. | ||
| 2020.emnlp-main.586 The success of ***** large pretrained ***** language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. | ||
| offense | 9 | |
| S19-2099 Our results show 85.12% accuracy and 80.57% F1 scores in Subtask A (offensive language identification), 87.92% accuracy and 50% F1 scores in Subtask B (categorization of ***** offense ***** types), and 69.95% accuracy and 50.47% F1 score in Subtask C (***** offense ***** target identification). | ||
| 2020.semeval-1.280 We achieve the second place (2nd) in sub-task B: Automatic categorization of ***** offense ***** types and are ranked 55th with a macro F1-score of 90.59 in sub-task A: Offensive language identification. | ||
| 2021.semeval-1.9 SemEval 2021 Task 7, HaHackathon, was the first shared task to combine the previously separate domains of humor detection and ***** offense ***** detection. | ||
| 2021.semeval-1.155 This paper utilizes and compares transformers models; BERT base and Large, BERTweet, RoBERTa base and Large, and RoBERTa base irony, for detecting and rating humor and ***** offense *****. | ||
| S19-2131 Finally, for the OffensEval the classifier performed well (F1=0.74), proving to have a better performance for ***** offense ***** detection than for hate speech. | ||
| multilingual setting | 9 | |
| 2021.nodalida-main.16 This article studies register classification of documents from the unrestricted web, such as news articles or opinion blogs, in a ***** multilingual setting *****, exploring both the benefit of training on multiple languages and the capabilities for zero-shot cross-lingual transfer. | ||
| 2021.emnlp-main.676 We show that (i) neural architectures outperform other approaches by more than 20 accuracy points, with the BERT-based model performing the best in both the monolingual and ***** multilingual setting *****s; (ii) while many individual hand-crafted translationese features correlate with neural model predictions, feature importance analysis shows that the most important features for neural and classical architectures differ; and (iii) our multilingual experiments provide empirical evidence for translationese universals across languages. | ||
| 2021.semeval-1.15 XLMR performs better than mBERT in the cross-lingual setting both with fine-tuning and feature extraction, whereas these two models give a similar performance in the ***** multilingual setting *****. | ||
| 2020.acl-main.493 To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the ***** multilingual setting *****. | ||
| 2021.semeval-1.92 In this paper, we apply our system to the English and Chinese ***** multilingual setting ***** and the experimental results show that our method has certain advantages. | ||
| strategy | 9 | |
| 2021.acl-long.96 We also carry out multiple experiments to measure how much each augmentation ***** strategy ***** improves the performance of automatic scoring systems. | ||
| D19-6203 The experimental results suggest that dependency-based pooling is the best pooling ***** strategy ***** for RE in the biomedical domain, yielding the state-of-the-art performance on two benchmark datasets for this problem. | ||
| K18-1044 However, rarely do editorials change anyone's stance on an issue completely, nor do they tend to argue explicitly (but rather follow a subtle rhetorical ***** strategy *****). | ||
| D19-5402 As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the ***** strategy ***** of extracting salient sentences from a document first and then paraphrasing the selected ones to generate a summary. | ||
| 2020.coling-main.296 This highlights the fact that focusing on the identification of seen VMWEs could be a ***** strategy ***** to improve VMWE identification in general. | ||
| toxicity detection | 9 | |
| 2021.semeval-1.26 Semeval-2021, Task 5 - Toxic Spans Detection is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset and is the first language ***** toxicity detection ***** task dedicated to identifying the toxicity-level spans. | ||
| 2020.acl-main.396 We investigate this assumption by focusing on two questions: (a) does context affect the human judgement, and (b) does conditioning on context improve performance of ***** toxicity detection ***** systems? | ||
| S19-2102 PERSPECTIVE is an API, that serves multiple machine learning models for the improvement of conversations online, as well as a ***** toxicity detection ***** system, trained on a wide variety of comments from platforms across the Internet. | ||
| W19-3501 In this paper, we explore various aspects of sentiment detection and their correlation to toxicity, and use our results to implement a ***** toxicity detection ***** tool. | ||
| 2021.emnlp-main.386 By implementing RDI in the context of ***** toxicity detection *****, we find that accounting for secondary attributes can significantly improve robustness, with improvements in sliced accuracy on the original dataset up to 7% compared to existing robustness methods. | ||
| lexical analysis | 9 | |
| 2020.lt4hala-1.8 Tasks such as ***** lexical analysis ***** need to be based on sentence segmentation because of the reason that a plenty of ancient books are not punctuated. | ||
| I17-5001 These deep learning models have been successfully used for ***** lexical analysis ***** and parsing. | ||
| L14-1130 Conclusions and recommendations from ***** lexical analysis ***** of localized terms are provided. | ||
| L12-1268 If tokens are ambiguous, ***** lexical analysis ***** must provide all possible sets of annotation for later (syntactic) disambiguation, be it tagging, or full parsing. | ||
| 2021.emnlp-demo.6 We introduce N-LTP, an open-source neural language technology platform supporting six fundamental Chinese NLP tasks: ***** lexical analysis ***** (Chinese word segmentation, part-of-speech tagging, and named entity recognition), syntactic parsing (dependency parsing), and semantic parsing (semantic dependency parsing and semantic role labeling). | ||
| memories | 9 | |
| 2021.naacl-main.157 We then propose a memory network to generate personalized responses in dialogue that utilizes a novel mechanism of splitting ***** memories *****: one for user profile meta attributes and the other for user-generated information like comment histories. | ||
| 2001.mtsummit-ebmt.5 This system takes advantage of the huge and underused resources available in existing translation ***** memories ***** and develops a traditional TM into a sophisticated example-based machine translation engine which when integrated into a hybrid MT solution can yield significant improvements in translation quality. | ||
| 2020.acl-main.178 Finally, our measures reveal the effect of narrativization of ***** memories ***** in stories (e.g., stories about frequently recalled ***** memories ***** flow more linearly; Bartlett, 1932). | ||
| D18-1280 To this end, we propose Interactive COnversational memory Network (ICON), a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global ***** memories *****. | ||
| 1997.iwpt-1.12 This system uses techniques from Part-of-Speech tagging in order to build a constituent structure and uses other techniques from dependency grammar in an original framework of *****memories***** in order to build a functional structure. | ||
| exist | 9 | |
| 2020.coling-main.278 Our proposed LaAP-Net outperforms ***** exist *****ing approaches on three benchmark datasets for the text VQA task by a noticeable margin. | ||
| D19-1566 Experimental results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various ***** exist *****ing state-of-the-art systems. | ||
| 2021.emnlp-main.777 At the script level, most ***** exist *****ing studies only consider a single event sequence corresponding to one common protagonist. | ||
| D19-5817 Our study suggests that while current metrics may be suitable for ***** exist *****ing QA datasets, they limit the complexity of QA datasets that can be created. | ||
| 2021.emnlp-main.702 Despite achieving good performance on some public benchmarks, we observe that ***** exist *****ing text-to-SQL models do not generalize when facing domain knowledge that does not frequently appear in the training data, which may render the worse prediction performance for unseen domains. | ||
| web page | 9 | |
| L06-1104 Search engines on the web and most existing question-answering systems provide the user with a set of hyperlinks and/or ***** web page ***** extracts containing answer(s) to a question. | ||
| 2020.lrec-1.866 It is designed as a building kit-like application that fetches data from different sources and compiles them into a single, comprehensible and structured ***** web page *****. | ||
| L10-1556 The topic of cleaning arbitrary ***** web page *****s with the goal of extracting a corpus from web data, suitable for linguistic and language technology research and development, has attracted significant research interest recently. | ||
| W17-8006 We describe three working prototypes along these lines: NEW/S/LEAK, which was developed for investigative journalists who need a quick overview of large leaked document collections; STORYFINDER, which is a personalized organizer for information found in ***** web page *****s that allows adding entities as well as relations, and is capable of personalized information management; and adaptive annotation capabilities of WEBANNO, which is a general-purpose linguistic annotation tool. | ||
| L08-1559 Since the visual structure of a ***** web page ***** is very important and often informs the user before he has even read the text, a semiotic study is also presented in this paper. | ||
| tables | 9 | |
| W19-3701 It can be applied to other low-resourced inflectional languages which have internet corpora and linguistic descriptions of their inflection system, following the example of inflection ***** tables ***** for Ukrainian. | ||
| 2020.lrec-1.738 LIS sentences have been transcribed with Italian words into ***** tables ***** on simultaneous layers, each of which contains specific linguistic or non-linguistic pieces of information. | ||
| 2020.acl-main.745 In this paper we present TaBERT, a pretrained LM that jointly learns representations for NL sentences and (semi-)structured ***** tables *****. | ||
| N18-2098 To this end, we propose a mixed hierarchical attention based encoder-decoder model which is able to leverage the structure in addition to the content of the ***** tables *****. | ||
| 2020.coling-main.5 Different from plain text passages in Web documents, Web ***** tables ***** and lists have inherent structures, which carry semantic correlations among various elements in ***** tables ***** and lists. | ||
| conditional variational autoencoder | 9 | |
| D18-1423 Towards filling the gap, in this paper, we propose a ***** conditional variational autoencoder ***** with adversarial training for classical Chinese poem generation, where the autoencoder part generates poems with novel terms and a discriminator is applied to adversarially learn their thematic consistency with their titles. | ||
| D18-1432 We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response generator using a ***** conditional variational autoencoder ***** model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity. | ||
| D19-1186 Unlike past work that has focused on diversifying the output at word-level or discourse-level with a flat model to alleviate this problem, we propose a hierarchical generation model to capture the different levels of diversity using the ***** conditional variational autoencoder *****s. | ||
| P17-1061 Unlike past work that has focused on diversifying the output of the decoder from word-level to alleviate this problem, we present a novel framework based on ***** conditional variational autoencoder *****s that capture the discourse-level diversity in the encoder. | ||
| 2020.acl-main.7 This paper proposes to adopt the personality-related characteristics of human conversations into variational response generators, by designing a specific ***** conditional variational autoencoder ***** based deep model with two new regularization terms employed to the loss function, so as to guide the optimization towards the direction of generating both persona-aware and relevant responses. | ||
| sentence matching | 9 | |
| 2020.acl-main.547 To address this problem, we propose neural graph matching networks, a novel ***** sentence matching ***** framework capable of dealing with multi-granular input information. | ||
| P19-1314 We benchmark neural word-based models which rely on word segmentation against neural char-based models which do not involve word segmentation in four end-to-end NLP benchmark tasks: language modeling, machine translation, ***** sentence matching *****/paraphrase and text classification. | ||
| D19-1157 Results from both are merged and optimized as a video-***** sentence matching ***** problem. | ||
| D19-1267 In this paper, we present an original semantics-oriented attention and deep fusion network (OSOA-DFN) for ***** sentence matching *****. | ||
| C18-2033 Qutr is a real-time messaging app that automatically translates conversations while supporting keyword-to-***** sentence matching *****. | ||
| free word | 9 | |
| 2004.jeptalnrecital-long.24 Tree Adjoining Grammars (TAG) are known not to be powerful enough to deal with scrambling in ***** free word ***** order languages. | ||
| 2000.iwpt-1.38 This technique is based on a cascade of finite state machines, adding to them a characteristic very crucial in the parsing of words with ***** free word ***** order: the simultaneous examination of part of speech and grammatical feature information, which are deemed equally important during the parsing procedure, in contrast with other methodologies. | ||
| N19-3017 Pregroup calculus has been used for the representation of ***** free word ***** order languages (Sanskrit and Hungarian), using a construction called precyclicity. | ||
| W19-1404 We explore how well a sequence labeling approach, namely, recurrent neural network, is suited for the task of resource-poor and POS tagging ***** free word ***** stress detection in the Russian, Ukranian, Belarusian languages. | ||
| L10-1342 This finding is novel, for Czech, with its ***** free word ***** order and rich morphology, is typologically different than languages analyzed with (R)MRS to date. | ||
| measuring semantic | 9 | |
| C16-1289 Estimating similarities at different levels of linguistic units, such as words, sub-phrases and phrases, is helpful for ***** measuring semantic ***** similarity of an entire bilingual phrase. | ||
| S17-2040 In this paper, we describe our proposed method for ***** measuring semantic ***** similarity for a given pair of words at SemEval-2017 monolingual semantic word similarity task. | ||
| 2021.inlg-1.25 For each sentence we evaluate, we select a subset of facts which are relevant by ***** measuring semantic ***** similarity to the sentence in question. | ||
| P19-1264 The most common automatic metrics, like BLEU and ROUGE, depend on exact word matching, an inflexible approach for ***** measuring semantic ***** similarity. | ||
| L10-1203 In this paper, we propose a random walk model based approach to ***** measuring semantic ***** relatedness between words or concepts, which seamlessly integrates various features extracted from Wikipedia to compute semantic relatedness. | ||
| parsing natural language | 9 | |
| 1995.iwpt-1.15 In this paper we present a robust parsing algorithm based on the link grammar formalism for ***** parsing natural language *****s. | ||
| 2019.gwc-1.10 The proposed approach does not use rules for ***** parsing natural language ***** queries but experiments showed that the embeddings model is tolerant enough for correctly predicting relation types that do not match known patterns exactly. | ||
| L10-1031 We will show that the valency data can be used for accurately ***** parsing natural language ***** with a rule-based approach by integrating it into a Left-Associative Grammar. | ||
| R17-1059 In this paper, we propose a novel supervised model for ***** parsing natural language ***** sentences into their formal semantic representations. | ||
| W17-6311 PP-attachments are an important source of errors in ***** parsing natural language *****. | ||
| Machine translation | 9 | |
| L06-1033 *****Machine translation***** systems, whether rule-based, example-based, or statistical, all rely on dictionaries that are in essence mappings between individual words of the source and the target language. | ||
| R17-1049 *****Machine translation***** systems are very sensitive to the domains they were trained on. | ||
| 2020.iwslt-1.22 *****Machine translation***** systems perform reasonably well when the input is well-formed speech or text. | ||
| L10-1023 *****Machine translation***** systems can be classified into rule-based and corpus-based approaches, in terms of their core technology. | ||
| 2021.emnlp-main.576 *****Machine translation***** models have discrete vocabularies and commonly use subword segmentation techniques to achieve an 'open vocabulary.' | ||
| Spoken Language Understanding (SLU | 9 | |
| D18-1417 *****Spoken Language Understanding (SLU*****), which typically involves intent determination and slot filling, is a core component of spoken dialogue systems. | ||
| N18-3023 *****Spoken Language Understanding (SLU*****), which extracts semantic information from speech, is not flawless, especially in practical applications. | ||
| 2020.nlp4convai-1.11 Slot Filling (SF) is one of the sub-tasks of *****Spoken Language Understanding (SLU*****) which aims to extract semantic constituents from a given natural language utterance. | ||
| D19-1097 *****Spoken Language Understanding (SLU*****) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works. | ||
| W17-5514 *****Spoken Language Understanding (SLU*****) is a key component of goal-oriented dialogue systems that would parse user utterances into semantic frame representations. | ||
| using | 9 | |
| L10-1227 It seems that human-beings assign appropriate word senses to the given ambiguous word in the sentence depending on the words which followed the ambiguous word when they could not disambiguate by *****using***** the previous contextual information. | ||
| 2021.ranlp-1.107 The most frequently used sentence representation for neural-based NLP methods is a sequence of subwords that is different from the sentence representation of non-neural methods that are created *****using***** basic NLP technologies, such as part-of-speech (POS) tagging, named entity (NE) recognition, and parsing. | ||
| 2014.iwslt-papers.6 We consider the following practical situation: given a large scale, state-of-the-art SMT system containing a CSTM, the task is to adapt the CSTM to a new domain *****using***** a (relatively) small in-domain parallel corpus. | ||
| 2021.emnlp-main.252 MISO consists of (1) a semantic fusion module that learns entangled semantics among difficult and majority samples with an adaptive multi-head attention mechanism, (2) a mutual information loss that forces our model to learn new representations of entangled semantics in the non-overlapping region of the minority class, and (3) a coupled adversarial encoder-decoder that fine-tunes disentangled semantic representations to remain their correlations with the minority class, and then *****using***** these disentangled semantic representations to generate anchor instances for each difficult sample. | ||
| W17-1810 It identifies negation cues and their corresponding scope in either raw or parsed text *****using***** maximum-margin classification. | ||
| its | 9 | |
| 2020.lrec-1.248 Although existing approaches employ human-written ground-truth answers for answering conversational questions at test time, in a realistic scenario, the CoQA model will not have any access to ground-truth answers for the previous questions, compelling the model to rely upon *****its***** own previously predicted answers for answering the subsequent questions. | ||
| Q19-1028 This architecture considers raw words as its main input, but internally captures text structure and informs *****its***** word attention process using other syntax- and morphology-related datapoints, known to be of great importance to readability. | ||
| W19-4814 The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on *****its***** attention mechanism (Attentive Guidance), which has shown to be an effective method for encouraging more compositional solutions. | ||
| L14-1098 The Votter Corpus is novel in its use of the mobile application format and novel in *****its***** coverage of specific demographics. | ||
| 2020.wmt-1.40 We introduced MUCOW at WMT 2019 to measure the ability of MT systems to perform word sense disambiguation (WSD), i.e., to translate an ambiguous word with *****its***** correct sense. | ||
| Implicit discourse relation | 9 | |
| W19-2703 *****Implicit discourse relation***** classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues. | ||
| 2021.acl-short.116 *****Implicit discourse relation***** classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s). | ||
| W19-0416 *****Implicit discourse relation***** classification is one of the most difficult steps in discourse parsing. | ||
| 2020.acl-main.14 *****Implicit discourse relation***** recognition is a challenging task due to the lack of connectives as strong linguistic clues. | ||
| P17-1093 *****Implicit discourse relation***** classification is of great challenge due to the lack of connectives as strong linguistic cues, which motivates the use of annotated implicit connectives to improve the recognition. | ||
| character-level | 9 | |
| D19-5506 Contemporary machine translation systems achieve greater coverage by applying subword models such as BPE and *****character-level***** CNNs, but these methods are highly sensitive to orthographical variations such as spelling mistakes. | ||
| N18-1107 We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and *****character-level***** CNNs. | ||
| D18-1278 When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level, and it has been claimed that this is because *****character-level***** models learn morphology. | ||
| 2020.sigmorphon-1.21 The Transformer model has been shown to outperform other neural seq2seq models in several *****character-level***** tasks. | ||
| W18-2905 Languages with logographic writing systems present a difficulty for traditional *****character-level***** models. | ||
| transformer-based | 9 | |
| 2020.wnut-1.68 In this system paper, we present a *****transformer-based***** approach to the detection of informativeness in English tweets on the topic of the current COVID-19 pandemic. | ||
| 2021.gwc-1.26 Neural language models, including *****transformer-based***** models, that are pre-trained on very large corpora became a common way to represent text in various tasks, including recognition of textual semantic relations, e.g. | ||
| 2021.acl-long.554 We present ReadOnce Transformers, an approach to convert a *****transformer-based***** model into one that can build an information-capturing, task-independent, and compressed representation of text. | ||
| 2021.ranlp-1.128 In this paper, we aim at improving Czech sentiment with *****transformer-based***** models and their multilingual versions. | ||
| 2020.trac-1.9 Modern *****transformer-based***** models with hundreds of millions of parameters, such as BERT, achieve impressive results at text classification tasks. | ||
| word sense disambiguation (WSD | 9 | |
| E17-1086 In this paper, we present a novel unsupervised algorithm for *****word sense disambiguation (WSD*****) at the document level. | ||
| L06-1383 The paper presents advances in the use of semantic features and interlingua relations for *****word sense disambiguation (WSD*****) as part of unification-based deep processing grammars. | ||
| 2021.ranlp-1.57 Acquisition of multilingual training data continues to be a challenge in *****word sense disambiguation (WSD*****). | ||
| D18-1517 Event detection (ED) and *****word sense disambiguation (WSD*****) are two similar tasks in that they both involve identifying the classes (i.e. | ||
| W19-0422 As opposed to word sense induction, *****word sense disambiguation (WSD*****) has the advantage of using interpretable senses, but requires annotated data, which are quite rare for most languages except English (Miller et al. | ||
| text-to-SQL | 9 | |
| D19-1624 Most deep learning approaches for *****text-to-SQL***** generation are limited to the WikiSQL dataset, which only supports very simple queries over a single table. | ||
| 2021.acl-long.198 This work aims to tackle the challenging heterogeneous graph encoding problem in the *****text-to-SQL***** task. | ||
| 2020.emnlp-main.561 In Natural Language Interfaces to Databases systems, the *****text-to-SQL***** technique allows users to query databases by using natural language questions. | ||
| 2020.emnlp-main.563 On the WikiSQL benchmark, state-of-the-art *****text-to-SQL***** systems typically take a slot-filling approach by building several dedicated models for each type of slots. | ||
| D18-1425 We present Spider, a large-scale complex and cross-domain semantic parsing and *****text-to-SQL***** dataset annotated by 11 college students. | ||
| Question answering (QA | 9 | |
| 2021.emnlp-main.758 *****Question answering (QA*****) primarily descends from two branches of research: (1) Alan Turing's investigation of machine intelligence at Manchester University and (2) Cyril Cleverdon's comparison of library card catalog indices at Cranfield University. | ||
| 2020.acl-main.19 *****Question answering (QA*****) is an important aspect of open-domain conversational agents, garnering specific research focus in the conversational QA (ConvQA) subtask. | ||
| P19-1225 *****Question answering (QA*****) using textual sources for purposes such as reading comprehension (RC) has attracted much attention. | ||
| 2021.emnlp-tutorials.4 *****Question answering (QA*****) is one of the most challenging and impactful tasks in natural language processing. | ||
| 2021.acl-short.79 *****Question answering (QA*****) in English has been widely explored, but multilingual datasets are relatively new, with several methods attempting to bridge the gap between high- and low-resourced languages using data augmentation through translation and cross-lingual transfer. | ||
| Neural sequence-to-sequence | 9 | |
| D18-1313 *****Neural sequence-to-sequence***** models have proven very effective for machine translation, but at the expense of model interpretability. | ||
| C18-1068 *****Neural sequence-to-sequence***** models have been successfully extended for summary generation. However, existing frameworks generate a single summary for a given input and do not tune the summaries towards any additional constraints/preferences. | ||
| 2021.eacl-main.118 *****Neural sequence-to-sequence***** models are currently the predominant choice for language generation tasks. | ||
| N19-1262 *****Neural sequence-to-sequence***** models have been successfully applied to text compression. | ||
| D19-5625 *****Neural sequence-to-sequence***** models, particularly the Transformer, are the state of the art in machine translation. | ||
| big | 9 | |
| 2020.lrec-1.867 Making corpora accessible and usable for linguistic research is a huge challenge in view of (too) *****big***** data, legal issues and a rapidly evolving methodology. | ||
| L08-1520 Some *****big***** languages like English are spoken by a lot of people whose mother tongues are different from. | ||
| D17-1176 We study the impact of *****big***** models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction. | ||
| 2021.bionlp-1.4 The accelerating growth of *****big***** data in the biomedical domain, with an endless amount of electronic health records and more than 30 million citations and abstracts in PubMed, introduces the need for automatic structuring of textual biomedical data. | ||
| L14-1175 Like many other research fields, linguistics is entering the age of *****big***** data. | ||
| recurrent neural networks (RNNs | 9 | |
| D18-1109 We introduce a class of convolutional neural networks (CNNs) that utilize *****recurrent neural networks (RNNs*****) as convolution filters. | ||
| P17-1062 End-to-end learning of *****recurrent neural networks (RNNs*****) is an attractive solution for dialog systems; however, current techniques are data-intensive and require thousands of dialogs to learn simple behaviors. | ||
| W18-6127 While *****recurrent neural networks (RNNs*****) are widely used for text classification, they demonstrate poor performance and slow convergence when trained on long sequences. | ||
| D18-1503 Recent work has shown that *****recurrent neural networks (RNNs*****) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks (Blevins et al., 2018) such as language modeling (Linzen et al., 2016; Gulordava et al., 2018) and neural machine translation (Shi et al., 2016). | ||
| Q18-1047 In NLP, convolutional neural networks (CNNs) have benefited less than *****recurrent neural networks (RNNs*****) from attention mechanisms. | ||
| inter-annotator | 9 | |
| 2020.acl-srw.24 Recent humor classification shared tasks have struggled with two issues: either the data comprises a highly constrained genre of humor which does not broadly represent humor, or the data is so indiscriminate that the *****inter-annotator***** agreement on its humor content is drastically low. | ||
| S18-2028 While there have been many proposals for theories of semantic roles over the years, these models are mostly justified by intuition and the only evaluation methods have been *****inter-annotator***** agreement. | ||
| 2021.conll-1.18 This work describes an analysis of *****inter-annotator***** disagreements in human evaluation of machine translation output. | ||
| L10-1093 The annotation of causal relations in natural language texts can lead to a low *****inter-annotator***** agreement. | ||
| C18-1281 Human evaluations are broadly thought to be more valuable the higher the *****inter-annotator***** agreement. | ||
| morpho-syntactic | 9 | |
| L14-1332 Abstract Meaning Representations (AMRs) are rooted, directional and labeled graphs that abstract away from *****morpho-syntactic***** idiosyncrasies such as word category (verbs and nouns), word order, and function words (determiners, some prepositions). | ||
| W19-8706 We use a range of *****morpho-syntactic***** features inspired by research in register studies (e.g. | ||
| 2006.amta-papers.16 This paper presents our study of exploiting *****morpho-syntactic***** information for phrase-based statistical machine translation (SMT). | ||
| R19-1148 In this paper we present a *****morpho-syntactic***** tagger dedicated to Computer-mediated Communication texts in Polish. | ||
| L14-1331 The study provides an original standpoint of the speech transcription errors by focusing on the *****morpho-syntactic***** features of the erroneous chunks and of the surrounding left and right context. | ||
| Linguistic Data | 9 | |
| L08-1460 The Arabic Treebank team at the *****Linguistic Data***** Consortium has significantly revised and enhanced its annotation guidelines and procedure over the past year. | ||
| L12-1526 *****Linguistic Data***** Consortium and the National Institute of Standards and Technology are collaborating to create a large, heterogeneous annotated multimodal corpus to support research in multimodal event detection and related technologies. | ||
| L10-1587 This paper describes recent efforts at *****Linguistic Data***** Consortium at the University of Pennsylvania to create manual transcripts as a shared resource for human language technology research and evaluation. | ||
| 2020.lrec-1.806 We present a multimodal corpus for sentiment analysis based on the existing Switchboard-1 Telephone Speech Corpus released by the *****Linguistic Data***** Consortium. | ||
| L12-1245 The *****Linguistic Data***** Consortium and Georgetown University Press are collaborating to create updated editions of bilingual dictionaries that had originally been published in the 1960's for English-speaking learners of Moroccan, Syrian and Iraqi Arabic. | ||
| word-level | 9 | |
| 2021.conll-1.23 The most straightforward approach to joint word segmentation (WS), part-of-speech (POS) tagging, and constituent parsing is converting a *****word-level***** tree into a char-level tree, which, however, leads to two severe challenges. | ||
| 2021.rocling-1.47 In this paper, we proposed a BERT-based dimensional semantic analyzer, which is designed by incorporating with *****word-level***** information. | ||
| K19-1086 We propose a simple and effective method to inject *****word-level***** information into character-aware neural language models. | ||
| 2020.socialnlp-1.7 Chinese word segmentation is necessary to provide *****word-level***** information for Chinese named entity recognition (NER) systems. | ||
| D18-1510 Neural machine translation (NMT) models are usually trained with the *****word-level***** loss using the teacher forcing algorithm, which not only evaluates the translation improperly but also suffers from exposure bias. | ||
| annotation of | 9 | |
| L16-1187 This paper presents a framework and methodology for the *****annotation of***** perspectives in text. | ||
| L06-1197 It was originally developed for the annotation of semantic roles in the frame semantics paradigm, but can be used for graphical *****annotation of***** treebanks with general relational information in a simple drag-and-drop fashion. | ||
| L14-1337 Interoperability of annotation schemes is one of the key words in the discussions about *****annotation of***** corpora. | ||
| W17-1806 In this paper we present a complete framework for the *****annotation of***** negation in Italian, which accounts for both negation scope and negation focus, and also for language-specific phenomena such as negative concord. | ||
| L16-1632 To improve and facilitate language documentation of endangered languages, we attempt to use corpus linguistic methods and speech and language technologies to reduce the time needed for transcription and *****annotation of***** audio and video language recordings. | ||
| random | 9 | |
| 2021.eacl-main.156 (CITATION) argued for using *****random***** splits rather than standard splits in NLP experiments. | ||
| 2021.iwslt-1.23 Data augmentation, which refers to manipulating the inputs (e.g., adding *****random***** noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning. | ||
| 2021.naacl-main.258 Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to *****random***** seeds. | ||
| 2021.starsem-1.23 We suggest to model human-annotated Word Usage Graphs capturing fine-grained semantic proximity distinctions between word uses with a Bayesian formulation of the Weighted Stochastic Block Model, a generative model for *****random***** graphs popular in biology, physics and social sciences. | ||
| C18-1142 The variational encoder-decoder (VED) encodes source information as a set of *****random***** variables using a neural network, which in turn is decoded into target data using another neural network. | ||
| neural sequence-to-sequence | 9 | |
| W19-5917 Recent advances in *****neural sequence-to-sequence***** models have led to promising results for several language generation-based tasks, including dialogue response generation, summarization, and machine translation. | ||
| D17-1145 The input to a *****neural sequence-to-sequence***** model is often determined by an up-stream system, e.g. | ||
| D17-1074 The generation of complex derived word forms has been an overlooked problem in NLP; we fill this gap by applying *****neural sequence-to-sequence***** models to the task. | ||
| P18-1014 We present a new *****neural sequence-to-sequence***** model for extractive summarization called SWAP-NET (Sentences and Words from Alternating Pointer Networks). | ||
| D18-1441 Recent *****neural sequence-to-sequence***** models have shown significant progress on short text summarization. | ||
| grammatical error correction (GEC) | 9 | |
| 2020.acl-srw.5 Recently, several studies have focused on improving the performance of *****grammatical error correction (GEC)***** tasks using pseudo data. | ||
| 2020.emnlp-main.680 Evaluation of *****grammatical error correction (GEC)***** systems has primarily focused on essays written by non-native learners of English, which however is only part of the full spectrum of GEC applications. | ||
| W18-6111 We develop a *****grammatical error correction (GEC)***** system for German using a small gold GEC corpus augmented with edits extracted from Wikipedia revision history. | ||
| N19-1132 This study explores the necessity of performing cross-corpora evaluation for *****grammatical error correction (GEC)***** models. | ||
| C18-2018 This paper presents a *****grammatical error correction (GEC)***** system that provides corrective feedback for essays. | ||
| end-to-end neural | 9 | |
| D17-1124 Idioms are peculiar linguistic constructions that impose great challenges for representing the semantics of language, especially in current prevailing *****end-to-end neural***** models, which assume that the semantics of a phrase or sentence can be literally composed from its constitutive words. | ||
| I17-1075 We propose an *****end-to-end neural***** network to predict the geolocation of a tweet. | ||
| 2021.gem-1.8 We present an *****end-to-end neural***** approach to generate English sentences from formal meaning representations, Discourse Representation Structures (DRSs). | ||
| D18-1060 We present *****end-to-end neural***** models for detecting metaphorical word use in context. | ||
| P19-3011 We present ConvLab, an open-source multi-domain end-to-end dialog system platform, that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to *****end-to-end neural***** models, in common environments. | ||
| transformer-based language | 9 | |
| 2021.ranlp-1.75 Modern *****transformer-based language***** models are revolutionizing NLP. | ||
| 2021.naacl-main.357 In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden state sizes of each layer within modern *****transformer-based language***** models, limiting the ability to effectively leverage transformers. | ||
| 2021.ranlp-1.153 We analyse how a *****transformer-based language***** model learns the rules of chess from text data of recorded games. | ||
| 2021.wanlp-1.46 Since their inception, *****transformer-based language***** models have led to impressive performance gains across multiple natural language processing tasks. | ||
| 2021.acl-short.18 Mechanisms for encoding positional information are central for *****transformer-based language***** models. | ||
| Aspect-based sentiment analysis (ABSA | 9 | |
| 2020.coling-main.14 *****Aspect-based sentiment analysis (ABSA*****) aims to determine the sentiment polarity of each specific aspect in a given sentence. | ||
| 2021.acl-short.64 *****Aspect-based sentiment analysis (ABSA*****) has received increasing attention recently. | ||
| 2021.emnlp-main.726 *****Aspect-based sentiment analysis (ABSA*****) has been extensively studied in recent years, which typically involves four fundamental sentiment elements, including the aspect category, aspect term, opinion term, and sentiment polarity. | ||
| 2020.acl-main.340 *****Aspect-based sentiment analysis (ABSA*****) involves three subtasks, i.e., aspect term extraction, opinion term extraction, and aspect-level sentiment classification. | ||
| N19-1035 *****Aspect-based sentiment analysis (ABSA*****), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA). | ||
| expert | 9 | |
| 2020.coling-main.539 Despite the recent advances in coherence modelling, most such models including state-of-the-art neural ones, are evaluated on either contrived proxy tasks such as the standard order discrimination benchmark, or tasks that require special *****expert***** annotation. | ||
| 2020.law-1.11 Prepositional supersense annotation is time-consuming and requires *****expert***** training. | ||
| W19-4405 We present a model for automatic scoring of coherence based on comparing the rhetorical structure (RS) of college student summaries in L2 (English) against *****expert***** summaries. | ||
| 2020.emnlp-demos.27 Coreference annotation is an important, yet expensive and time consuming, task, which often involved *****expert***** annotators trained on complex decision guidelines. | ||
| L12-1478 Use of language resources including annotated corpora and tools is not easy for users, as it requires *****expert***** knowledge to determine which resources are compatible and interoperable. | ||
| Korean | 9 | |
| W18-6013 In this paper, for the purpose of enhancing Universal Dependencies for the *****Korean***** language, we propose a modified method for mapping Korean Part-of-Speech (POS) tagset in relation to Universal Part-of-Speech (UPOS) tagset in order to enhance the Universal Dependencies for the Korean Language. | ||
| 2020.ccl-1.94 The manual labeling work for constructing the *****Korean***** corpus is too time-consuming and laborious. | ||
| D17-1075 We introduce a novel sub-character architecture that exploits a unique compositional structure of the *****Korean***** language. | ||
| W17-5705 Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and *****Korean***** translations. | ||
| 2020.iwpt-1.13 In this paper, we first open on important issues regarding the Penn Korean Universal Treebank (PKT-UD) and address these issues by revising the entire corpus manually with the aim of producing cleaner UD annotations that are more faithful to *****Korean***** grammar. | ||
| diagnostic | 9 | |
| D19-1662 Elazar and Goldberg (2018) showed that protected attributes can be extracted from the representations of a debiased neural network for mention detection at above-chance levels, by evaluating a *****diagnostic***** classifier on a held-out subsample of the data it was trained on. | ||
| 2021.acl-demo.23 Probing (or *****diagnostic***** classification) has become a popular strategy for investigating whether a given set of intermediate features is present in the representations of neural models. | ||
| 1999.mtsummit-1.91 A new *****diagnostic***** system has been developed for an interactive template-structured intelligent language tutoring system (ILTS) for Japanese-English translation where an efficient heaviest common sequence (HCS) matching algorithm and a 'part-of-speech tagged (POST) parser' play a key role. | ||
| 2020.eamt-1.30 In *****diagnostic***** interviews, elliptical utterances allow doctors to question patients in a more efficient and economical way. | ||
| 2021.acl-long.469 In order to deeply understand the capability of pretrained language models in text generation and conduct a *****diagnostic***** evaluation, we propose TGEA, an error-annotated dataset with multiple benchmark tasks for text generation from pretrained language models (PLMs). | ||
| minimal | 9 | |
| L06-1144 In this paper we describe a machine translation prototype in which we use only *****minimal***** resources for both the source and the target language. | ||
| Q14-1009 We propose a new method for unsupervised tagging that finds *****minimal***** models which are then further improved by Expectation Maximization training. | ||
| 2021.semspace-1.2 We define a linear pregroup parser, by applying some key modifications to the *****minimal***** parser defined in (Preller, 2007). | ||
| J17-4002 This article considers the problem of correcting errors made by English as a Second Language writers from a machine learning perspective, and addresses an important issue of developing an appropriate training paradigm for the task, one that accounts for error patterns of non-native writers using *****minimal***** supervision. | ||
| E17-5006 This must sufficiently enclose the event localization, while optionally including space enough for a frame of reference for the event (the viewer's perspective). We first describe the formal multimodal foundations for the modeling language, VoxML, which creates a *****minimal***** simulation from the linguistic input interpreted by the multimodal language, DITL. | ||
| often | 9 | |
| 2020.cogalex-1.16 Lexicographic resources (e.g., WordNet) capture only some of this context-dependent variation; for example, they *****often***** do not encode how closely senses, or discretized word meanings, are related to one another. | ||
| 2021.naacl-main.446 However, in knowledge-grounded conversations, current models still lack the fine-grained control over knowledge selection and integration with dialogues, which finally leads to the knowledge-irrelevant response generation problems: 1) knowledge selection merely relies on the dialogue context, ignoring the inherent knowledge transitions along with conversation flows; 2) the models *****often***** over-fit during training, resulting with incoherent response by referring to unrelated tokens from specific knowledge content in the testing phase; 3) although response is generated upon the dialogue history and knowledge, the models often tend to overlook the selected knowledge, and hence generates knowledge-irrelevant response. | ||
| C16-1021 Existing methods focus on the extraction of key information, but *****often***** neglect coherence. | ||
| D19-1133 The definition of illegal characters and the specific removal strategy depend on the task, language, domain, etc, which *****often***** lead to tiresome and repetitive scripting of rules. | ||
| D19-1309 Despite the effectiveness of previous work based on generative models, there remain problems with exposure bias in recurrent neural networks, and *****often***** a failure to generate realistic sentences. | ||
| Natural language inference (NLI | 9 | |
| W17-5309 *****Natural language inference (NLI*****) is a central problem in language understanding. | ||
| W18-3007 *****Natural language inference (NLI*****) is one of the most important tasks in NLP. | ||
| 2020.acl-main.768 *****Natural language inference (NLI*****) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. | ||
| R19-1144 *****Natural language inference (NLI*****) is a key part of natural language understanding. | ||
| 2020.findings-emnlp.39 *****Natural language inference (NLI*****) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). | ||
| Financial | 9 | |
| S17-2145 This paper describes the approach we used for SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on *****Financial***** Microblogs. | ||
| S17-2140 This paper presents the details of our system IBA-Sys that participated in SemEval Task: Fine-grained sentiment analysis on *****Financial***** Microblogs and News. | ||
| S17-2147 We present the system developed by the team DUTH for the participation in Semeval-2017 task 5 - Fine-Grained Sentiment Analysis on *****Financial***** Microblogs and News, in subtasks A and B. | ||
| S17-2142 This paper describes a supervised solution for detecting the polarity scores of tweets or headline news in the financial domain, submitted to the SemEval 2017 Fine-Grained Sentiment Analysis on *****Financial***** Microblogs and News Task. | ||
| 2020.fnp-1.3 We present the FinCausal 2020 Shared Task on Causality Detection in *****Financial***** Documents and the associated FinCausal dataset, and discuss the participating systems and results. | ||
| electronic health records ( EHRs | 9 | |
| L16-1598 This paper discusses the creation of a semantically annotated corpus of questions about patient data in *****electronic health records ( EHRs***** ) . | ||
| W19-5003 This paper proposes a dataset and method for automatically generating paraphrases for clinical questions relating to patient - specific information in *****electronic health records ( EHRs***** ) . | ||
| 2020.lrec-1.170 A crucial step within secondary analysis of *****electronic health records ( EHRs***** ) is to identify the patient cohort under investigation . | ||
| N19-5006 Rapid growth in adoption of *****electronic health records ( EHRs***** ) has led to an unprecedented expansion in the availability of large longitudinal datasets . | ||
| W16-4209 Importance of utilizing medical information is getting increased as *****electronic health records ( EHRs***** ) are widely used nowadays . | ||
| code | 9 | |
| 2021.nlp4prog-1.1 Automated source code summarization is a popular software engineering research topic wherein machine translation models are employed to translate *****code***** snippets into relevant natural language descriptions . | ||
| 2021.nlp4prog-1.10 The task of semantic code search is to retrieve *****code***** snippets from a source code corpus based on an information need expressed in natural language . | ||
| 2021.acl-long.394 Due to the great potential in facilitating software development , *****code***** generation has attracted increasing attention recently . | ||
| P18-5006 Semantic parsing , the study of translating natural language utterances into machine - executable programs , is a well - established research area and has applications in question answering , instruction following , voice assistants , and *****code***** generation . | ||
| 2021.naacl-main.211 Code summarization and generation empower conversion between programming language ( PL ) and natural language ( NL ) , while *****code***** translation avails the migration of legacy code from one PL to another . | ||
| Information Extraction ( IE | 9 | |
| W19-4605 Word Embeddings ( WE ) are getting increasingly popular and widely applied in many Natural Language Processing ( NLP ) applications due to their effectiveness in capturing semantic properties of words ; Machine Translation ( MT ) , Information Retrieval ( IR ) and *****Information Extraction ( IE***** ) are among such areas . | ||
| C16-2056 We present PolyglotIE , a web - based tool for developing extractors that perform *****Information Extraction ( IE***** ) over multilingual data . | ||
| L04-1245 We survey the evaluation methodology adopted in *****Information Extraction ( IE***** ) , as defined in the MUC conferences and in later independent efforts applying machine learning to IE . | ||
| 2020.lrec-1.243 Most of the current cross - lingual transfer learning methods for *****Information Extraction ( IE***** ) have been only applied to name tagging . | ||
| 2021.emnlp-main.763 *****Information Extraction ( IE***** ) aims to extract structural information from unstructured texts . | ||
| language models ( LMs | 9 | |
| 2021.privatenlp-1.1 Recent works have shown that *****language models ( LMs***** ) , e.g. , for next word prediction ( NWP ) , have a tendency to memorize rare or unique sequences in the training data . | ||
| 2021.emnlp-main.235 In computational linguistics , it has been shown that hierarchical structures make *****language models ( LMs***** ) more human - like . | ||
| 2021.mwe-1.1 In recent years , *****language models ( LMs***** ) have become almost synonymous with NLP . | ||
| 2021.eacl-main.155 Given the potential misuse of recent advances in synthetic text generation by *****language models ( LMs***** ) , it is important to have the capacity to attribute authorship of synthetic text . | ||
| 2021.emnlp-main.303 Recently , *****language models ( LMs***** ) have achieved significant performance on many NLU tasks , which has spurred widespread interest for their possible applications in the scientific and social area . | ||
| plain | 9 | |
| D17-1130 We present a transition - based AMR parser that directly generates AMR parses from *****plain***** text . | ||
| 2020.findings-emnlp.23 Joint entity and relation extraction aims to extract relation triplets from *****plain***** text directly . | ||
| D17-1186 Distantly supervised relation extraction has been widely used to find novel relational facts from *****plain***** text . | ||
| 2020.emnlp-main.129 Event detection ( ED ) , which means identifying event trigger words and classifying event types , is the first and most fundamental step for extracting event knowledge from *****plain***** text . | ||
| P19-1139 Neural language representation models such as BERT pre - trained on large - scale corpora can well capture rich semantic patterns from *****plain***** text , and be fine - tuned to consistently improve the performance of various NLP tasks . | ||
| exposure | 9 | |
| P19-2049 Scheduled sampling is a technique for avoiding one of the known problems in sequence - to - sequence generation : *****exposure***** bias . | ||
| 2021.blackboxnlp-1.16 This work focuses on relating two mysteries in neural - based text generation : *****exposure***** bias , and text degeneration . | ||
| D18-1510 Neural machine translation ( NMT ) models are usually trained with the word - level loss using the teacher forcing algorithm , which not only evaluates the translation improperly but also suffers from *****exposure***** bias . | ||
| 2020.acl-main.326 The standard training algorithm in neural machine translation ( NMT ) suffers from *****exposure***** bias , and alternative algorithms have been proposed to mitigate this . | ||
| D18-1396 Neural machine translation usually adopts autoregressive models and suffers from *****exposure***** bias as well as the consequent error propagation problem . | ||
| Natural language generation ( NLG | 9 | |
| C16-1105 *****Natural language generation ( NLG***** ) is the task of generating natural language from a meaning representation . | ||
| C16-1191 *****Natural language generation ( NLG***** ) is an important component of question answering ( QA ) systems which has a significant impact on system quality . | ||
| N18-2010 *****Natural language generation ( NLG***** ) is a critical component in spoken dialogue systems . | ||
| W18-5020 *****Natural language generation ( NLG***** ) is an important component in spoken dialog systems ( SDSs ) . | ||
| 2021.emnlp-main.599 *****Natural language generation ( NLG***** ) spans a broad range of tasks , each of which serves for specific objectives and desires different properties of generated text . | ||
| Medical | 9 | |
| 2021.nlpmc-1.4 *****Medical***** simulators provide a controlled environment for training and assessing clinical skills . | ||
| 2021.acl-long.459 *****Medical***** imaging plays a significant role in clinical practice of medical diagnosis , where the text reports of the images are essential in understanding them and facilitating later treatments . | ||
| P18-1240 *****Medical***** imaging is widely used in clinical practice for diagnosis and treatment . | ||
| 2021.nlpmc-1.2 *****Medical***** conversations from patient visits are routinely summarized into clinical notes for documentation of clinical care . | ||
| 2020.multilingualbio-1.3 *****Medical***** language exhibits great variations regarding users , institutions and language registers . | ||
| Quality Estimation ( QE | 9 | |
| 2021.wmt-1.93 *****Quality Estimation ( QE***** ) is an important component of the machine translation workflow as it assesses the quality of the translated output without consulting reference translations . | ||
| 2021.eval4nlp-1.15 *****Quality Estimation ( QE***** ) for Machine Translation has been shown to reach relatively high accuracy in predicting sentence - level scores , relying on pretrained contextual embeddings and human - produced quality scores . | ||
| 2020.wmt-1.118 In this paper , we describe the Bering Lab 's submission to the WMT 2020 Shared Task on *****Quality Estimation ( QE***** ) . | ||
| L16-1356 This paper presents our work towards a novel approach for *****Quality Estimation ( QE***** ) of machine translation based on sequences of adjacent words , the so - called phrases . | ||
| 2020.wmt-1.116 This paper presents our submission to the WMT2020 Shared Task on *****Quality Estimation ( QE***** ) . | ||
| job | 9 | |
| 2020.coling-main.513 We introduce a deep learning model to learn the set of enumerated *****job***** skills associated with a job description . | ||
| 2020.computerm-1.5 Machine learning plays an ever - bigger part in online recruitment , powering intelligent matchmaking and *****job***** recommendations across many of the world 's largest job platforms . | ||
| P19-3008 Over 60 % of Australian PhD graduates land their first job after graduation outside academia , but this *****job***** market remains largely hidden to these job seekers . | ||
| L14-1615 This paper presents a system for suggesting a ranked list of appropriate vacancy descriptions to *****job***** seekers in a job board web site . | ||
| 2021.nlp4posimpact-1.11 Understanding the gaps between *****job***** requirements and university curricula is crucial for improving student success and institutional effectiveness in higher education . | ||
| Commonly | 8 | |
| 2020.semeval-1.178 ***** Commonly ***** occurring in settings such as social media platforms, code-mixed content makes the task of identifying sentiment notably more challenging and complex due to the lack of structure and noise present in the data. | ||
| 2020.emnlp-main.251 ***** Commonly ***** posed as a conditional generation problem, the task aims to leverage earlier inputs from users in a search session to predict queries that they will likely issue at a later time. | ||
| 2021.hackashop-1.11 ***** Commonly ***** comprising hundreds of millions of parameters, these models offer state-of-the-art performance, but at the expense of interpretability. | ||
| 2020.acl-main.445 ***** Commonly ***** adopted metrics for extractive summarization focus on lexical overlap at the token level. | ||
| 2021.emnlp-main.632 ***** Commonly ***** found in academic and formal texts, nominalizations can be difficult to interpret because of ambiguous semantic relations between the deverbal noun and its arguments. | ||
| Computational Approaches | 8 | |
| W18-3219 In the third shared task of the ***** Computational Approaches ***** to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data. | ||
| W18-3214 In the 3rd Workshop on ***** Computational Approaches ***** to Linguistic Code-Switching Shared Task, we achieved second place with 62.76% harmonic mean F1-score for English-Spanish language pair without using any gazetteer and knowledge-based information. | ||
| 2021.calcs-1.5 This paper describes the system submitted by IITP-MT team to ***** Computational Approaches ***** to Linguistic Code-Switching (CALCS 2021) shared task on MT for English→Hinglish. | ||
| W17-5223 This paper describes the entry NUIG in the WASSA 2017 (8th Workshop on ***** Computational Approaches ***** to Subjectivity, Sentiment & Social Media Analysis) shared task on emotion recognition. | ||
| W18-3216 This paper describes the system for the Named Entity Recognition Shared Task of the Third Workshop on ***** Computational Approaches ***** to Linguistic Code-Switching (CALCS) submitted by the Bilingual Annotations Tasks (BATs) research group of the University of Texas | ||
| graded | 8 | |
| 2021.starsem-1.23 By providing a probabilistic model of ***** graded ***** word meaning we aim to approach the slippery and yet widely used notion of word sense in a novel way. | ||
| W17-5018 On the one hand, we describe the compilation of a learner corpus of short answers ***** graded ***** with CEFR levels by three certified Cambridge examiners. | ||
| 2021.ranlp-1.91 We compare this model with the ***** graded ***** approach, in which the system returns texts at the optimal grade. | ||
| W18-0514 In this paper, we introduce NT2Lex, a novel lexical resource for Dutch as a foreign language (NT2) which includes frequency distributions of 17,743 words and expressions attested in expert-written textbook texts and readers ***** graded ***** along the scale of the Common European Framework of Reference (CEFR). | ||
| 2020.acl-main.386 We call for discarding the binary notion of faithfulness in favor of a more ***** graded ***** one, which we believe will be of greater practical utility. | ||
| faceted | 8 | |
| 2021.internlp-1.5 Dynamic ***** faceted ***** search (DFS), an interactive query refinement technique, is a form of Human–computer information retrieval (HCIR) approach. | ||
| L10-1431 Keyword-based search, ***** faceted ***** search, question-answering, etc. are some of the automated methodologies that have been used to help analysts in their tasks. | ||
| D19-5317 In this work, we aim at building ***** faceted ***** concept hierarchies from scientific literature. | ||
| 2021.emnlp-demo.33 iFᴀᴄᴇᴛSᴜᴍ integrates interactive summarization together with ***** faceted ***** search, by providing a novel ***** faceted ***** navigation scheme that yields abstractive summaries for the user's selections. | ||
| D19-3030 It also employs various ***** faceted ***** views to group similar questions as well as filtering techniques to eliminate unanswerable questions. | ||
| detect | 8 | |
| W18-2508 It assembles numerous state-of-the-art NLP technologies into a fully automated media ingestion pipeline that can record live broadcasts, ***** detect ***** and transcribe spoken content, translate from several languages (original text or transcribed speech) into English, recognize Named Entities, ***** detect ***** topics, cluster and summarize documents across language barriers, and extract and store factual claims in these news items. | ||
| 2020.lrec-1.785 We find that Analor tends to divide speech into smaller segments and that CRF models ***** detect ***** larger segments rather than macro-syntactic periods. | ||
| D19-1230 For the sake of understanding a negated statement, it is critical to precisely ***** detect ***** the negative focus in context. | ||
| 2021.emnlp-main.726 In this work, we introduce the Aspect Sentiment Quad Prediction (ASQP) task, aiming to jointly ***** detect ***** all sentiment elements in quads for a given opinionated sentence, which can reveal a more comprehensive and complete aspect-level sentiment structure. | ||
| K19-1082 Our best models ***** detect ***** the presence of slang at the sentence level with an F1-score of 0.80 and identify its exact position at the token level with an F1-Score of 0.50. | ||
| investigate | 8 | |
| 2020.lrec-1.550 In an attempt to reconciliate the diverging needs of unconstrained raw data use and preservation of data privacy in digital communication, we here ***** investigate ***** the automatic recognition of privacy-sensitive stretches of text in UGC and provide an algorithmic solution for the protection of personal data via pseudonymization. | ||
| 2021.eacl-main.168 We conduct a systematic comparison of several meta-learning methods, ***** investigate ***** multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. | ||
| 2021.semeval-1.177 We present our system, ***** investigate ***** how architectural decisions affected model predictions, and conduct an error analysis. | ||
| W18-4203 In this position paper we systematically categorize human-created obfuscated language on various levels, ***** investigate ***** their basic mechanisms, give an overview on automated techniques needed to simulate human encoding. | ||
| 2020.aacl-main.11 The recently introduced pre-trained language model BERT advances the state-of-the-art on many NLP tasks through the fine-tuning approach, but few studies ***** investigate ***** how the fine-tuning process improves the model performance on downstream tasks. | ||
| adapted | 8 | |
| W19-5015 For statistical classifiers trained for each of these problems, context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model and FLAIR are better than Word2Vec, GloVe and the two ***** adapted ***** using the MESH ontology. | ||
| L12-1397 An iterative process is then applied, consisting in manually correcting errors found in the automatic annotations, retraining the linguistic models of the NLP tools on this corrected corpus, then checking the quality of the ***** adapted ***** models on the fully manual annotations of the GOLD corpus. | ||
| L10-1143 Named entity annotation guidelines for Dutch were developed, ***** adapted ***** from the MUC and ACE guidelines. | ||
| 2012.iwslt-papers.8 We show better performance for nearly all domain ***** adapted ***** systems, despite the fact that the domain ***** adapted ***** systems are trained on a fraction of the content of their general domain counterparts. | ||
| 2020.sltu-1.9 Our analysis also reveals that the speech recognition performance of the ***** adapted ***** acoustic model is more highly influenced by the relatedness (in a relative sense) between the source and the target languages than other considered factors (e.g. the quality of source models). | ||
| autoencoders | 8 | |
| 2020.coling-main.542 Furthermore, a cross-lingual data augmentation method is designed by building ***** autoencoders ***** to learn the text representations shared by both languages. | ||
| C18-1070 The two most prominent approaches to this problem are structural correspondence learning and ***** autoencoders *****. | ||
| 2020.emnlp-main.491 Text ***** autoencoders ***** are commonly used for conditional generation tasks such as style transfer. | ||
| C16-1051 We used a combination of two different neural models, i.e., deep belief nets and deep ***** autoencoders *****, for both titles and descriptions. | ||
| 2021.emnlp-main.137 This approach stands in contrast to ***** autoencoders *****, also trained on raw text, but with the objective of learning to encode each input as a vector that allows full reconstruction. | ||
| splitting | 8 | |
| 2020.eval4nlp-1.15 Motivated by the real-life necessity of applying machine learning models to different data distributions, we propose a clustering-based data ***** splitting ***** algorithm. | ||
| P18-2113 Different from MT, TS data comprises more elaborate transformations, such as sentence ***** splitting *****. | ||
| 2005.mtsummit-papers.33 Phrase alignment is viewed as a sentence ***** splitting ***** task. | ||
| 2020.emnlp-main.403 We show that NSP is detrimental to training due to its context ***** splitting ***** and shallow semantic signal. | ||
| W19-2717 The system encompasses three trainable component stacks: one for sentence ***** splitting *****, one for discourse unit segmentation and one for connective detection. | ||
| optimality | 8 | |
| L08-1527 This goal is classically achieved by greedy algorithms which however do not guarantee the ***** optimality ***** of the desired cover. | ||
| K19-1027 In this paper, we alleviate the local ***** optimality ***** of back-translation by learning a policy (takes the form of an encoder-decoder and is defined by its parameters) with future rewarding under the reinforcement learning framework, which aims to optimize the global word predictions for unsupervised neural machine translation. | ||
| D17-1227 However, in the neural generation setting, hypotheses can finish in different steps, which makes it difficult to decide when to end beam search to ensure ***** optimality *****. | ||
| L12-1192 The ***** optimality ***** of such a process is directly related to the descriptive features of the sentences of a reference corpus. | ||
| 2020.wnut-1.16 Notably enough, on dialogue clarity and ***** optimality *****, the two paraphrase sources' human-perceived quality does not differ significantly. | ||
| Multi30k | 8 | |
| D17-1105 We report new state-of-the-art results and our best models also significantly improve on a comparable phrase-based Statistical MT (PBSMT) model trained on the ***** Multi30k ***** data set according to all metrics evaluated. | ||
| W19-8620 In this work, we further investigate this hypothesis on a new large scale multimodal Machine Translation (MMT) dataset, How2, which has 1.57 times longer mean sentence length than ***** Multi30k ***** and no repetition. | ||
| W19-2307 We perform our experiments on Europarl and ***** Multi30k ***** datasets, on the English-French language pair, and document our performance using both supervised and unsupervised machine translation. | ||
| 2021.naacl-main.285 We show that our approach achieves SOTA performance in retrieval tasks on two multimodal multilingual image caption benchmarks: ***** Multi30k ***** with German captions and MSCOCO with Japanese captions. | ||
| D19-6402 Two multimodal multilingual datasets are used for evaluation: ***** Multi30k ***** with German and English captions and Microsoft-COCO with English and Japanese captions. | ||
| NER dataset | 8 | |
| 2020.emnlp-main.592 To further verify our conclusions, we also construct a new open ***** NER dataset ***** that focuses on entity types with weaker name regularity and lower mention coverage to verify our conclusion. | ||
| 2020.splu-1.1 Experimental results on the standard Japanese ***** NER dataset ***** show that the proposed method achieves a higher F1 value (89.67%) than a baseline method, demonstrating the effectiveness of using element-wise visual information. | ||
| S18-2021 We also establish a new benchmark on the I2B2 2010 Clinical ***** NER dataset ***** with 84.70 F-score. | ||
| 2020.coling-main.54 For the evaluation of our methods we built our own Chinese biomedical patents ***** NER dataset *****, and our optimized model achieved an F1 score of 0.54±0.15. | ||
| P18-2012 We find our architecture achieves state-of-the-art performance on the CoNLL 2003 ***** NER dataset *****. | ||
| biLSTM | 8 | |
| K18-2004 The system consists of jointly trained tagger, lemmatizer, and dependency parser which are based on features extracted by a ***** biLSTM ***** network. | ||
| W19-1307 We propose convolution neural network (CNN) and bidirectional long-short term memory (***** biLSTM *****) (with and without Attention) models which take the generated bilingual embeddings as input. | ||
| E17-2053 We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower level character ***** biLSTM ***** (Ballesteros et al., 2015; Plank et al., 2016). | ||
| S18-1193 Our approach is to build distributed word embedding of reason, warrant and claim respectively, meanwhile, we use a series of frameworks such as CNN model, LSTM model, GRU with attention model and ***** biLSTM ***** with attention model for processing word vector. | ||
| S18-1180 Experiments demonstrate the superior performance of ***** biLSTM ***** with attention framework compared to other models. | ||
| multilayer perceptron | 8 | |
| 2021.acl-long.232 Our model uniquely integrates BERT, K-Means embedding clustering, and ***** multilayer perceptron ***** to learn sentence embeddings, representation-explanations, and user-item interactions, respectively. | ||
| W19-5053 For the RQE task, we trained a traditional ***** multilayer perceptron ***** network based on embeddings generated by the universal sentence encoder. | ||
| E17-1012 We present a neural network architecture based on a combination of recurrent neural networks that are used to encode questions and answers, and a ***** multilayer perceptron *****. | ||
| W17-5044 The output of base classifiers, as probabilities for each class, are then fed into a ***** multilayer perceptron ***** to predict the native language of the author. | ||
| 2021.germeval-1.8 For this binary task, we propose three models: a German BERT transformer model; a ***** multilayer perceptron *****, which was first trained in parallel on textual input and 14 additional linguistic features and then concatenated in an additional layer; and a ***** multilayer perceptron ***** with both feature types as input. | ||
| norms | 8 | |
| E17-2084 In this paper we present the first metaphor identification method that uses representations constructed from property ***** norms *****. | ||
| R17-1028 The distributional analysis of Marcus' text reveals that the passing from the communist regime period to democracy is sharply marked by two complementary changes in Marcus' writing: in the pre-democracy period, the communist ***** norms ***** of writing style demanded on the one hand long phrases, long words and clichés, and on the other hand, a short list of preferred “official” topics; in democracy the tendency was towards shorter phrases and words while approaching a broader area of topics. | ||
| D17-1050 Concision requires wit to produce and wit to understand, which demands from each party knowledge of ***** norms *****, context and a speaker's mindset. | ||
| W17-5214 These differences in values can trigger reactions such as anger, disgust (contempt), sadness, etc., because these behaviors are evaluated by the public as being incompatible with their social/personal standards, ***** norms ***** or values. | ||
| 2020.lrec-1.366 We propose a semagram-based knowledge model composed of 26 semantic relationships which integrates features from a range of different sources, such as computational lexicons and property ***** norms *****. | ||
| recurrence | 8 | |
| 2020.acl-main.43 The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational ***** recurrence *****, defined as whether the recurrent update can be described by a weighted finite-state machine. | ||
| D17-1249 In this paper, we present a method combining standard topic modeling with signature mining for analyzing topic ***** recurrence ***** in speeches of Clinton and Trump during the 2016 American presidential campaign. | ||
| 2020.tacl-1.9 However, the only factor that consistently contributed a hierarchical bias across tasks was the use of a tree-structured model rather than a model with sequential ***** recurrence *****, suggesting that human-like syntactic generalization requires architectural syntactic structure. | ||
| C16-1022 We also induce embeddings to generalize over elementary tree structures and exploit a tree ***** recurrence ***** over the input structure to model long-distance influences between NLG choices. | ||
| 2021.emnlp-main.602 In this work, we present SRU++, a highly-efficient architecture that combines fast ***** recurrence ***** and attention for sequence modeling. | ||
| ELMO | 8 | |
| 2020.sustainlp-1.10 The second step identifies slot names only for slot tokens by using state-of-the-art pretrained contextual embeddings such as ***** ELMO ***** and BERT. | ||
| 2020.acl-srw.18 What do powerful models of word meaning created from distributional data (e.g. Word2vec (Mikolov et al., 2013), BERT (Devlin et al., 2019) and ***** ELMO ***** (Peters et al., 2018)) represent? | ||
| R19-1151 We compare wide range of methods including machine learning on bag-of-words representation, bidirectional recurrent neural networks as well as the most recent pre-trained architectures ***** ELMO ***** and BERT. | ||
| 2021.conll-1.7 We present a systematic study of the linear geometry of contextualized word representations in ***** ELMO ***** and BERT. | ||
| W19-4712 Future work will investigate how fine-tuning deep contextualized embedding models, such as ***** ELMO *****, might be used for similar tasks with greater contextual information. | ||
| differential | 8 | |
| L12-1266 The challenge is that emergency dialogues are more complex on many levels than standard information negotiation dialogues, different resources are needed for ***** differential ***** investigation, and resources for this kind of corpus are rare. | ||
| 2021.privatenlp-1.4 Secondly, we apply ***** differential ***** privacy (DP) while the models are being trained in each client instance. | ||
| 2020.privatenlp-1.2 While this allows the perturbation to admit the required metric ***** differential ***** privacy, often the utility of downstream tasks modeled on this perturbed data is low because the spherical noise does not account for the variability in the density around different words in the embedding space. | ||
| 2020.privatenlp-1.5 When training a language model on sensitive information, ***** differential ***** privacy (DP) allows us to quantify the degree to which our private data is protected. | ||
| 2021.emnlp-main.628 Specifically, CAPE firstly applies calibrated noise through ***** differential ***** privacy to maintain the privacy of text representations by preserving the encoded semantic links while obscuring sensitive information. | ||
| prompting | 8 | |
| L08-1359 As a result, the purposes of messages frequently end up not being fulfilled, ***** prompting ***** prolonged communication and stalling the disconnected workflow that is characteristic of email. | ||
| 2021.acl-long.353 Prefix-tuning draws inspiration from ***** prompting ***** for language models, allowing subsequent tokens to attend to this prefix as if it were “virtual tokens”. | ||
| W18-3503 Code-mixed content on social media is also on the rise, ***** prompting ***** the need for tools to automatically understand such content. | ||
| 2021.naacl-main.208 Results show that ***** prompting ***** is often worth 100s of data points on average across classification tasks. | ||
| 2021.blackboxnlp-1.20 We use a ***** prompting ***** methodology to simply ask BERT what the hypernym of a given word is. | ||
| 1b | 8 | |
| 2021.smm4h-1.6 For Task ***** 1b ***** and 1c, we utilized the previous year's best solution based on the EnDR-BERT model with additional corpora. | ||
| 2020.sdp-1.32 In Task ***** 1b *****, we use a logistic regression to classify the discourse facets. | ||
| 2021.smm4h-1.14 In the case of NER, our submissions scored F1-score of 0.50 and 0.82 on ADE Span Detection (Task ***** 1b *****) and Profession span detection (Task 7b) respectively. | ||
| 2021.smm4h-1.8 This paper describes our approach for six classification tasks (Tasks 1a, 3a, 3b, 4 and 5) and one span detection task (Task ***** 1b *****) from the Social Media Mining for Health (SMM4H) 2021 shared tasks. | ||
| W18-6442 For task ***** 1b *****, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker. | ||
| calibrated | 8 | |
| 2021.acl-long.103 We further find that the ***** calibrated ***** attention weights are more uniform at lower layers to collect multiple information while more concentrated on the specific inputs at higher layers. | ||
| 2020.emnlp-main.667 Finally, to motivate the utility of calibration for KGE from a practitioner's perspective, we conduct a unique case study of human-AI collaboration, showing that ***** calibrated ***** predictions can improve human performance in a knowledge graph completion task. | ||
| 2021.naacl-main.269 This work studies NER under a noisy labeled setting with ***** calibrated ***** confidence estimation. | ||
| 2021.ranlp-1.97 Specifically, we use the uncertainty of ***** calibrated ***** question answering models as a proxy of human-perceived difficulty. | ||
| 2020.acl-main.593 To achieve this, we add classifiers to different layers of BERT and use their ***** calibrated ***** confidence scores to make early exit decisions | ||
| AD | 8 | |
| W16-4211 Significant differences were confirmed for the usage of impersonal pronouns in the ***** AD ***** group. | ||
| D18-1304 Unfortunately, datasets for ***** AD ***** assessment are often sparse and incomplete. | ||
| 2020.clinicalnlp-1.19 We propose a method for DLB detection by using mental health record (MHR) documents from a (3-month) period before a patient has been diagnosed with DLB or ***** AD *****. | ||
| 2020.emnlp-main.313 Different from the advertising copywriting for a single product, an advertisement (***** AD *****) post includes an attractive topic that meets the customer needs and description copywriting about several products under its topic. | ||
| 2020.bionlp-1.14 The proposed technique also identifies the exact speeches that reflect linguistic biomarkers for early stage ***** AD ***** | ||
| Alongside | 8 | |
| L16-1549 ***** Alongside ***** audio and video recordings, our data-set consists of large amount of temporally aligned sensory data and system behavior provided by the environment and its interactive components. | ||
| 2020.lrec-1.462 ***** Alongside *****, we report on the methods of constructing such corpora using tools enabled by recent advances in machine translation and cross-lingual retrieval using deep neural network based methods. | ||
| L14-1122 ***** Alongside ***** these tasks, which were made easier through the adaptation and reuse of existing tools for closely related languages, a casting for voice talents among the speaking community was conducted and the first speech database for speech synthesis was recorded for Mirandese. | ||
| 2020.emnlp-tutorials.3 ***** Alongside ***** these descriptions, we will walk through source code that creates and visualizes interpretations for a diverse set of NLP tasks. | ||
| 2020.sltu-1.22 ***** Alongside ***** many textual features or representations,adjectives could be used in order to detect sentiment, even on a sentence or comment level | ||
| Basic | 8 | |
| L06-1067 The project focused on identifying the state of the art of LRs in the region, assessing priority requirements through consultations with language industry and communication players, and establishing a protocol for developing and identifying a ***** Basic ***** Language Resource Kit (BLARK) for Arabic, and to assess first priority requirements. | ||
| W19-6147 We conclude with a discussion on the benefits of a well-thought-out BLARK design (***** Basic ***** Language Resource Kit), making tools like SHARP possible. | ||
| L06-1457 We propose a ***** Basic ***** Language and Speech Kit (BLAST) as an extension to BLARK and suggest a strategy for integrating the kit into the Natural Language Toolkit (NLTK). | ||
| L06-1086 Additional topics include a strategy to incorporate OWL and RDFS semantics in one schema such that both RDF(S) infrastructure and OWL infrastructure can interpret the information correctly, problems encountered in understanding the Prolog source files and the description of the two versions that are provided (***** Basic ***** and Full) to accommodate different usages of WordNet | ||
| L16-1062 The aim of this paper is to study the effect that the use of ***** Basic ***** English versus common English has on information extraction from online resources. | ||
| searchable | 8 | |
| 2004.amta-papers.4 The adapted BWT embeds the necessary information to retrieve matched training instances without requiring any additional space and can be instantiated in a compressed form which reduces disk space and memory requirements by about 40% while still remaining ***** searchable ***** without decompression. | ||
| L10-1511 The annotation produces tree representations, in form of labelled parenthesis, that are integrally ***** searchable ***** with CorpusSearch, a search engine for parsed corpora (Randall, 2005-2007). | ||
| L12-1615 Projects CARDS and FLY have the main goal of making available an online electronic edition of each letter, which is completely open source, ***** searchable ***** and available. | ||
| 2021.naacl-main.151 In this work, we present an end-to-end framework that exploits compositionality to learn ***** searchable ***** hidden representations at intermediate stages of a sequence model using decomposed sub-tasks. | ||
| W18-4502 The dictionary encompasses comprehensive cross-referencing mechanisms, including linking entries to an online scanned edition of Crum's Coptic Dictionary, internal cross-references and etymological information, translated ***** searchable ***** definitions in English, French and German, and linked corpus data which provides frequencies and corpus look-up for headwords and multiword expressions | ||
| RSS | 8 | |
| L12-1359 Titles can be obtained through Web Search and ***** RSS ***** News feed collections so that download of the full documents is not needed. | ||
| R17-1096 A huge body of continuously growing written knowledge is available on the web in the form of social media posts, ***** RSS ***** feeds, and news articles. | ||
| L14-1196 Popularity index is based on the analysis of ***** RSS ***** streams. | ||
| 2008.amta-govandcom.13 We have modified an existing Open Source ***** RSS ***** reader, Sage, for cross-language use, permitting English-speakers to discover, subscribe to, update, and browse ***** RSS ***** feeds in ten languages. | ||
| 2005.mtsummit-ebmt.8 We describe our use of ***** RSS ***** news feeds to quickly assemble a parallel English-Japanese corpus | ||
| clausal | 8 | |
| W19-1201 To measure similarity between two DRSs, they are represented in a ***** clausal ***** form, i.e. as a set of tuples. | ||
| W17-2713 We present a formalization of a decompositional analysis of events in which each participant in a ***** clausal ***** event has their own temporally extended subevent, and the subevents are related through causal and other interactions. | ||
| P19-1384 We introduce here a collection of large, dependency-parsed written corpora in 17 languages, that allow us, for the first time, to capture ***** clausal ***** embedding through dependency graphs and assess their distribution. | ||
| N19-2022 We introduce a TOI pooling layer to replace traditional pooling layer for processing the nested phrasal or ***** clausal ***** elements in insurance policies | ||
| C18-1249 We extend the coverage of an existing grammar customization system to ***** clausal ***** modifiers, also referred to as adverbial clauses. | ||
| middleware | 8 | |
| 2020.cmlc-1.5 It takes the form of a ***** middleware ***** system between user front-ends and optional database or text indexing solutions as back-ends. | ||
| 2020.stoc-1.4 Specifically, this research is aimed at building a privacy preserving data publishing ***** middleware ***** for unstructured social media data without compromising the true analytical value of those data. | ||
| L06-1115 This work builds on the existing Heart of Gold ***** middleware ***** system, and previous work on Robust Minimal Recursion Semantics (RMRS) as part of an inter-component interface. | ||
| L04-1267 For all these components existing ***** middleware ***** seems to be available, however, it has to be checked how they can interact with each other | ||
| 2020.eamt-1.59 The MICE project (2018-2020) will deliver a ***** middleware ***** layer for improving the output quality of the eTranslation system of the EC's Connecting Europe Facility through additional services, such as domain adaptation and named entity recognition. | ||
| typing | 8 | |
| W18-1113 For user attribute prediction, the best approach is to combine the two, suggesting that extralinguistic factors are disclosed to a larger degree in written text, while author identity is better transmitted in ***** typing ***** behavior. | ||
| 2010.amta-government.3 Nevertheless, Google translation offers translators an option to work on its rough draft for the benefit of saving time and pain in ***** typing *****. | ||
| 2020.winlp-1.17 In the following, we present a system for assisted ***** typing ***** in LS whose accuracy and speed is largely due to the deployment of real time natural-language processing enabling efficient prediction and context-sensitive grammar support. | ||
| 2021.acl-srw.18 In order to improve entity and event ***** typing *****, we utilize context-aware representations aggregated from the detected mentions of the corresponding entities and events across the entire document | ||
| 2020.semeval-1.171 Multilingual people, who are well versed in their native languages and also English speakers, tend to code-mix using English-based phonetic ***** typing ***** and the insertion of anglicisms in their main language. | ||
| NAACL | 8 | |
| W18-0537 Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at ***** NAACL ***** 2018. | ||
| W19-2701 Co-located with ***** NAACL ***** 2019 in Minneapolis, the workshop's aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. | ||
| L14-1423 We have developed SWIFT Aligner, a free, portable software that allows for visual representation and editing of aligned corpora from several most commonly used formats: TALP, GIZA, and ***** NAACL *****. | ||
| W18-0907 We report on the shared task on metaphor identification on the VU Amsterdam Metaphor Corpus conducted at the ***** NAACL ***** 2018 | ||
| W19-3646 This paper has been accepted in ***** NAACL ***** 2019 | ||
| LightGBM | 8 | |
| 2021.semeval-1.71 The proposed system is made up of a ***** LightGBM ***** model fed with features obtained from many word frequency lists, published lexical norms and psychometric data. | ||
| 2020.lt4hala-1.19 A gradient boosting machine (***** LightGBM *****) is used for POS tagging, mainly fed with pre-computed word embeddings of a window of seven contiguous tokens—the token at hand plus the three preceding and following ones—per target feature value. | ||
| 2021.semeval-1.14 Our submission system stacks all previous models with a ***** LightGBM ***** at the top. | ||
| 2020.semeval-1.131 In the shared task of assessing the funniness of edited news headlines, which is a part of the SemEval 2020 competition, we preprocess datasets by replacing abbreviation, stemming words, then merge three models including Light Gradient Boosting Machine (***** LightGBM *****), Long Short-Term Memory (LSTM), and Bidirectional Encoder Representation from Transformer (BERT) by taking the average to perform the best | ||
| 2021.cmcl-1.10 A ***** LightGBM ***** model fed with target word lexical characteristics and features obtained from word frequency lists, psychometric data and bigram association measures has been optimized for the 2021 CMCL Shared Task on Eye-Tracking Data Prediction. | ||
| verbalizing | 8 | |
| Q16-1036 We propose two models for ***** verbalizing ***** numbers, a key component in speech recognition and synthesis systems. | ||
| R19-1095 We present LD2NL, a framework that allows ***** verbalizing ***** the three key languages of the Semantic Web, i.e., RDF, OWL, and SPARQL. | ||
| N19-1248 However, their presence is required for properly ***** verbalizing ***** Arabic and is hence essential for applications such as text to speech. | ||
| 2021.mmsr-1.2 Therefore, gestures are an inseparable part of the language system: they may add clarity to discourse, can be employed to facilitate lexical retrieval and retain a turn in conversations, assist in ***** verbalizing ***** semantic content and facilitate speakers in coming up with the words they intend to say. | ||
| I17-3003 The WiseReporter generates a text report of a specific topic which is usually given as a keyword by ***** verbalizing ***** knowledge base facts involving the topic | ||
| neuropsychological | 8 | |
| 2021.eacl-main.230 For this, we use resources from research on grapheme–color synesthesia – a ***** neuropsychological ***** phenomenon where letters are associated with colors –, which give us insight into which characters are similar for synesthetes and how characters are organized in color space. | ||
| L16-1331 This pilot study was conducted on a corpus composed of spontaneous speech sample collected from 39 subjects, who underwent a ***** neuropsychological ***** screening for visuo-spatial abilities, memory, language, executive functions and attention. | ||
| W19-3016 Verbal memory is affected by numerous clinical conditions and most ***** neuropsychological ***** and clinical examinations evaluate it. | ||
| W19-3012 The Semantic Verbal Fluency (SVF) task is a classical ***** neuropsychological ***** assessment where persons are asked to produce words belonging to a semantic category (e.g., animals) in a given time. | ||
| E17-1030 We present the first steps taken towards automatic ***** neuropsychological ***** evaluation based on narrative discourse analysis, presenting a new automatic sentence segmentation method for impaired speech | ||
| unweighted | 8 | |
| S19-2218 Each class in the dataset is represented as directed ***** unweighted ***** graphs. | ||
| W19-5933 In the present study, we distinguish addressees in two settings (a conversation between several people and a spoken dialogue system, and a conversation between several adults and a child) and introduce the first competitive baseline (***** unweighted ***** average recall equals 0.891) for the Voice Assistant Conversation Corpus that models the first setting. | ||
| S18-1095 Each class in the dataset is represented as directed ***** unweighted ***** graphs. | ||
| W18-3506 We achieve 59.6 and 55.0 ***** unweighted ***** accuracy scores in the Friends and the EmotionPush test sets, respectively. | ||
| W18-3509 The model achieved ***** unweighted ***** accuracy of 55.38% on Friends test dataset and 56.73% on EmotionPush test dataset | ||
| DS | 8 | |
| 2020.emnlp-main.300 However, the existing success of ***** DS ***** cannot be directly transferred to more challenging document-level relation extraction (DocRE), as the inevitable noise caused by ***** DS ***** may be even multiplied in documents and significantly harm the performance of RE. | ||
| 2021.naacl-main.199 We provide a ground truth for evaluation created by philosophy experts and a blueprint for using ***** DS ***** models in a sound methodological setup. | ||
| 2021.acl-long.483 The journey of reducing noise from distant supervision (***** DS *****) generated training data has been started since the ***** DS ***** was first introduced into the relation extraction (RE) task. | ||
| N19-1107 Distant supervision (***** DS *****) is an important paradigm for automatically extracting relations. | ||
| W19-3654 We introduce a shift on the ***** DS ***** method over the domain of crime-related news from Peru, attempting to find the culprit, victim and location of a crime description from an RE perspective. | ||
| dimension | 8 | |
| L14-1703 We employ the random indexing technique to model terms' surrounding words, which we call the context window, in a vector space at reduced ***** dimension *****. | ||
| L06-1253 We argue that being mutually exclusive is not a good criterion for a set of dialogue act types to constitute a ***** dimension *****, even though the description of an object in a multi***** dimension *****al space should never assign more than one value per ***** dimension *****. | ||
| 2020.lrec-1.67 Using the 17 English dialogs of the DialogBank as gold standard, our preliminary experiments have shown that including the mapped dialogs during the training phase leads to improved performance while recognizing communicative functions in the Task ***** dimension *****. | ||
| 2020.gamnlp-1.1 Both lexicons perform comparably well on our evaluation dialogues, but the game-specific extension performs slightly better on the dominance ***** dimension ***** for dialogue segments and the arousal ***** dimension ***** for full dialogues. | ||
| D19-1346 Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with ***** dimension ***** reduction techniques like LDA or with embedding-based neural approaches | ||
| MLP | 8 | |
| 2019.icon-1.2 The representations obtained from these two models are fed into a Multi-layer Perceptron Model (***** MLP *****) for the final classification. | ||
| 2020.semeval-1.52 According to the results, these embeddings can improve the performance of the typical ***** MLP ***** and LSTM classifiers as downstream models of both subtasks compared to regular tokenised statements. | ||
| 2021.semeval-1.151 We address the multi-modal multi-label classification of memes defined in subtask 3 by utilizing a ResNet50 based image model, DistilBERT based text model, and a multi-modal architecture based on multikernel CNN+LSTM and ***** MLP ***** model. | ||
| S18-1157 The ***** MLP ***** takes both the flatten CNN maps and inputs to predict the labels | ||
| D17-1117 Experimenting with a new dataset of 1.6M user comments from a news portal and an existing dataset of 115K Wikipedia talk page comments, we show that an RNN operating on word embeddings outperforms the previous state of the art in moderation, which used logistic regression or an ***** MLP ***** classifier with character or word n-grams. | ||
| formatted | 8 | |
| L10-1492 annotated with Selection and Coercion relations among verb-noun pairs ***** formatted ***** in XML according to the Generative Lexicon Mark-up Language (GLML) format (Pustejovsky et al., 2008). | ||
| L08-1169 The purpose of the GALE Distillation evaluation is to quantify the amount of relevant and non-redundant information a distillation engine is able to produce in response to a specific, ***** formatted ***** query; and to compare that amount of information to the amount of information gathered by a bilingual human using commonly available state-of-the-art tools. | ||
| L10-1067 Under the DARPA Global Autonomous Language Exploitation (GALE) program, Distillation provides succinct, direct responses to the ***** formatted ***** queries using the outputs of automated transcription and translation technologies. | ||
| 2020.wmt-1.138 The ability of machine translation (MT) models to correctly place markup is crucial to generating high-quality translations of ***** formatted ***** input. | ||
| 2021.emnlp-demo.4 For many use cases, it is required that MT does not just translate raw text, but complex ***** formatted ***** documents (e.g. websites, slides, spreadsheets) and the result of the translation should reflect the formatting | ||
| precedence | 8 | |
| 2000.iwpt-1.22 In our work the ***** precedence ***** relations and word order constraints are defined locally for each clause. | ||
| 1998.amta-papers.16 In this paper we employ linguistic knowledge such as subcategorization, linear ***** precedence ***** and lexical functions for the analysis and the transfer of the constructions of this sort. | ||
| W18-0801 This simple proposal requires minimal additional up-front costs for researchers; the lay summary, at least, has significant ***** precedence ***** in the medical literature and other areas of science; and the proposal is aimed to supplement, rather than replace, existing approaches for encouraging researchers to consider the ethical implications of their work, such as those of the Collaborative Institutional Training Initiative (CITI) Program and institutional review boards (IRBs). | ||
| W19-2915 Sentences like “Every child climbed a tree” have at least two interpretations depending on the ***** precedence ***** order of the universal quantifier and the indefinite. | ||
| W89-0247 A parser is described here based on the Cocke-Young-Kassami algorithm which uses immediate dominance and linear ***** precedence ***** rules together with various feature inheritance conventions | ||
| diatopic | 8 | |
| R19-2010 The paper describes three corpora of different varieties of BS that are currently being developed with the goal of providing data for the analysis of the ***** diatopic ***** and diachronic variation in non-standard Balkan Slavic. | ||
| 2020.lrec-1.114 First, from the linguistic point of view it gives account of the wide range of varieties in which Italian was articulated in that period, namely from a diastratic (educated vs. uneducated writers), diaphasic (low/informal vs. high/formal registers) and ***** diatopic ***** (regional varieties, dialects) points of view. | ||
| 2020.vardial-1.13 It has no official status in the country, it is not standardized and displays important ***** diatopic ***** variation resulting in a rich system of dialects. | ||
| L06-1469 The paper presents an on-line dialectal resource, ALT-Web, which gives access to the linguistic data of the Atlante Lessicale Toscano, a specially designed linguistic atlas in which lexical data have both a ***** diatopic ***** and diastratic characterisation. | ||
| 2020.lrec-1.369 Additional resources extracted from ENGLAWI, such as an inflectional lexicon, a lexicon of ***** diatopic ***** variants and the inclusion dates of headwords in Wiktionary's nomenclature are also provided | ||
| comparing | 8 | |
| D19-1251 To address this issue, we propose a numerical MRC model named as NumNet, which utilizes a numerically-aware graph neural network to consider the ***** comparing ***** information and performs numerical reasoning over numbers in the question and passage. | ||
| 2006.amta-papers.27 For the given experimental dataset, loss function analyses provided a clearer characterization of the engines' relative strength than did ***** comparing ***** the response rates to each other. | ||
| L12-1330 In this work we present our results ***** comparing ***** the influence of the different factors used. | ||
| 2021.emnlp-main.368 Because of this, ***** comparing ***** multiple versions of the same model during development leads to overestimation on the development data. | ||
| D19-1476 Experimental results show that the proposed model outperforms the ***** comparing ***** methods on all three datasets | ||
| fusional | 8 | |
| 2021.emnlp-main.793 We also evaluate the long-standing hypotheses that more frequent forms are more ***** fusional *****, and that paradigm size anticorrelates with degree of fusion. | ||
| 2021.bsnlp-1.1 Therefore, this paper presents the first ablation study focused on Polish, which, unlike the isolating English language, is a ***** fusional ***** language. | ||
| P17-1073 The designed procedure is verified on Polish, a ***** fusional ***** language with a relatively free word order, and contributes to building a Polish evaluation dataset. | ||
| W18-4808 Machine translation from polysynthetic to ***** fusional ***** languages is a challenging task, which gets further complicated by the limited amount of parallel text available. | ||
| 2020.lrec-1.879 We conduct an extensive quantitative and qualitative evaluation of this framework on 12 languages and show that the framework achieves state-of-the-art results across languages of different typologies (from ***** fusional ***** to polysynthetic and from high-resource to low-resource) | ||
| matched | 8 | |
| 2021.wmt-1.85 We used tags to mark and add the term translations into the ***** matched ***** sentences. | ||
| W17-5310 With this model we obtained test accuracies of 72.057% and 72.055% in the ***** matched ***** and mis***** matched ***** evaluation tracks respectively, outperforming the LSTM baseline, and obtaining performances similar to a model that relies on shared information between sentences (ESIM). | ||
| 2020.lrec-1.686 Our work present two new systems, CombiNMT995, which is a result of ***** matched ***** sentences with a cosine similarity of 0.995 or less, and CombiNMT98, which, similarly, runs on a cosine similarity of 0.98 or less. | ||
| 2002.amta-papers.3 The crux of the problem lies in greater variability of lengths and match types of the ***** matched ***** sentences. | ||
| D19-1019 After that, we propose a novel Context Clue Matching Mechanism (CCMM) to enhance the representations of all customer utterances with their ***** matched ***** context clues, i.e., sentiment and reasoning clues | ||
| ScienceIE | 8 | |
| S17-2168 This paper presents our relation extraction system for subtask C of SemEval-2017 Task 10: ***** ScienceIE *****. | ||
| S17-2164 This paper describes the system presented by the LABDA group at SemEval 2017 Task 10 ***** ScienceIE *****, specifically for the subtasks of identification and classification of keyphrases from scientific articles. | ||
| S17-2166 In this paper, we present MayoNLP's results from the participation in the ***** ScienceIE ***** share task at SemEval 2017. | ||
| D17-1279 Both inductive and transductive semi-supervised learning strategies outperform state-of-the-art information extraction performance on the 2017 SemEval Task 10 ***** ScienceIE ***** task. | ||
| S17-2097 This paper describes our submission for the ***** ScienceIE ***** shared task (SemEval- 2017 Task 10) on entity and relation extraction from scientific papers | ||
| Parseval | 8 | |
| 2020.lrec-1.128 Although ***** Parseval ***** is commonly used, variations of evaluation differ from three aspects: micro vs. macro F1 scores, binary vs. multiway ground truth, and left-heavy vs. right-heavy binarization. | ||
| 1995.iwpt-1.26 Of the parsed sentences (1,899), the percentage of no-crossing sentences is 33.9%, and ***** Parseval ***** recall and precision are 73.43% and 72.61%. | ||
| L08-1115 We directly evaluate parser output using both the ***** Parseval ***** and the Leaf Ancestor metrics. | ||
| D17-1136 We evaluate all these parsers with the standard ***** Parseval ***** procedure to provide a more accurate picture of the actual RST discourse parsers performance in standard evaluation settings. | ||
| L06-1060 The ***** Parseval ***** metrics are undefined when the words input to the parser do not match the words in the gold standard parse tree exactly, and word errors are unavoidable with automatic speech recognition (ASR) systems | ||
| cascading | 8 | |
| 2021.eacl-main.216 Our evaluation across a range of metrics capturing accuracy, latency, and consistency shows that our end-to-end models are statistically similar to ***** cascading ***** models, while having half the number of parameters. | ||
| 2020.emnlp-main.173 Furthermore, how to properly utilize the labels remains an issue due to the ***** cascading ***** errors between tasks. | ||
| W19-2803 The low precision and recall for this variable will lead to severe ***** cascading ***** errors. | ||
| E17-2088 Furthermore, We experiment with several neural models on the dataset and show that they are more effective in jointly modeling the overall position towards two related targets compared to independent predictions and other models of joint learning, such as ***** cascading ***** classification. | ||
| D19-1218 We introduce a learning approach focused on recovery from ***** cascading ***** errors between instructions, and modeling methods to explicitly reason about instructions with multiple goals | ||
| focal | 8 | |
| 2021.dravidianlangtech-1.21 Besides, we solved the class-imbalance problem existed in training data by class combination, class weights and ***** focal ***** loss. | ||
| 2021.emnlp-main.481 Our innovations are three-fold: (1) we utilize a deep convolution-based encoder with the squeeze-and-excitation networks and residual networks to aggregate the information across the document and learn meaningful document representations that cover different ranges of texts; (2) we explore multi-layer and sum-pooling attention to extract the most informative features from these multi-scale representations; (3) we combine binary cross entropy loss and ***** focal ***** loss to improve performance for rare labels. | ||
| 2021.acl-long.385 To alleviate this problem, ***** focal ***** loss penalty strategies are integrated into the loss functions. | ||
| 2020.emnlp-main.15 To enable intrinsic probing, we propose a novel framework based on a decomposable multivariate Gaussian probe that allows us to determine whether the linguistic information in word embeddings is dispersed or ***** focal *****. | ||
| W18-4412 Our model achieves competitive results over other strong baseline methods, which show its effectiveness and that ***** focal ***** loss exhibits significant improvement in such cases where class imbalance is a regular issue | ||
| residual | 8 | |
| W17-5025 We combine several approaches into an ensemble, based on spelling error features, a simple neural network using word representations, a deep ***** residual ***** network using word and character features, and a system based on a recurrent neural network. | ||
| P19-1001 The model performs matching by stacking multiple interaction blocks in which ***** residual ***** information from one time of interaction initiates the interaction process again. | ||
| D17-1151 Our experiments provide practical insights into the relative importance of factors such as embedding size, network depth, RNN cell type, ***** residual ***** connections, attention mechanism, and decoding heuristics. | ||
| W16-4816 The system uses only byte representations in a deep ***** residual ***** network (ResNet). | ||
| N18-1117 From the optimization perspective, ***** residual ***** connections are adopted to improve learning performance for both encoder and decoder in most of these deep architectures, and advanced attention connections are applied as well | ||
| compared | 8 | |
| W17-2326 In this paper, we present a system to automatically identify such comparative sentences and their components i.e. the ***** compared ***** entities, the scale of the comparison and the aspect on which the entities are being ***** compared *****. | ||
| 2021.acl-long.533 In MT, we identify two settings where metrics outperform humans due to a statistical advantage in variance: when the number of human judgments used is small, and when the quality difference between ***** compared ***** systems is small. | ||
| 2021.emnlp-main.546 The state-of-the-art graph neural network-based ED-GAT (Ma et al., 2020) only considers syntactic information while ignoring the critical semantic relations and the sentiments to the ***** compared ***** entities. | ||
| 2021.acl-demo.9 While automated MT evaluation metrics are commonly used to evaluate MT systems at a corpus-level, our platform supports fine-grained segment-level analysis and interactive visualisations that expose the fundamental differences in the performance of the ***** compared ***** systems. | ||
| 2021.semeval-1.123 We obtained 0.6251 f1 score with Waw-unet while 0.6390 and 0.6601 with the ***** compared ***** models respectively | ||
| demonstrative | 8 | |
| L08-1472 Based on the observation of the corpus, we proposed a multimodal behavior description for observation of ***** demonstrative ***** expressions. | ||
| 2020.coling-main.351 We discuss the challenges entailed in managing training input from languages without standard orthographies, we provide evidence of successful learning of Bribri grammar, and also examine the translations of structures that are infrequent in major Indo-European languages, such as positional verbs, ergative markers, numerical classifiers and complex ***** demonstrative ***** systems. | ||
| L14-1701 They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, ***** demonstrative *****, relative and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). | ||
| S17-1029 In this paper, we introduce the task of question answering using natural language demonstrations where the question answering system is provided with detailed ***** demonstrative ***** solutions to questions in natural language | ||
| L12-1082 This contribution explores the subgroup of text structuring expressions with the form preposition + ***** demonstrative ***** pronoun, thus it is devoted to an aspect of the interaction of coreference relations and relations signaled by discourse connectives (DCs) in a text. | ||
| abductive | 8 | |
| 2021.acl-long.114 Comprehensive empirical results demonstrate that Reflective Decoding outperforms strong unsupervised baselines on both paraphrasing and ***** abductive ***** text infilling, significantly narrowing the gap between unsupervised and supervised methods. | ||
| W19-2403 In this paper, we summarize advances in deriving the logical form of the text, encoding commonsense knowledge, and technologies for scalable ***** abductive ***** reasoning. | ||
| 2021.acl-long.403 To fill this gap, we propose a variational autoencoder based model ege-RoBERTa, which employs a latent variable to capture the necessary commonsense knowledge from event graph for guiding the ***** abductive ***** reasoning task. | ||
| 2020.emnlp-main.58 We demonstrate that our approach is general and applicable to two nonmonotonic reasoning tasks: ***** abductive ***** text generation and counterfactual story revision, where DeLorean outperforms a range of unsupervised and some supervised methods, based on automatic and human evaluation. | ||
| P19-1615 In this paper we address QA with respect to the OpenBookQA dataset and combine state of the art language models with ***** abductive ***** information retrieval (IR), information gain based re-ranking, passage selection and weighted scoring to achieve 72.0% accuracy, an 11.6% improvement over the current state of the art | ||
| undesirable | 8 | |
| 2021.ranlp-1.63 Applying basic Deep Learning models, however, leads to ***** undesirable ***** results due to the unbalanced nature of the data and the extreme number of classes. | ||
| 2020.findings-emnlp.383 Biased media can influence people in ***** undesirable ***** directions and hence should be unmasked as such. | ||
| P19-1166 Word embeddings are often criticized for capturing ***** undesirable ***** word associations such as gender stereotypes. | ||
| 2021.naacl-main.343 Since these plots are compact and structured, it is easier to manipulate them to generate text with targeted ***** undesirable ***** properties, while at the same time maintain the grammatical correctness and naturalness of the generated sentences. | ||
| 2020.emnlp-main.602 We formulate **Controllable Debiasing**, a new revision task that aims to rewrite a given text to correct the implicit and potentially ***** undesirable ***** bias in character portrayals | ||
| shuffling | 8 | |
| S19-2165 Finally, we find that randomizing the order of word pieces dramatically reduces validation accuracy (to approximately 60%), but that ***** shuffling ***** groups of four or more word pieces maintains an accuracy of about 80%, indicating the model mainly gains value from local context. | ||
| 2021.acl-short.27 We show that the token representations and self-attention activations within BERT are surprisingly resilient to ***** shuffling ***** the order of input tokens, and that for several GLUE language understanding tasks, ***** shuffling ***** only minimally degrades performance, e.g., by 4% for QNLI. | ||
| 2021.ranlp-1.1 The latter is created by (1) ***** shuffling ***** two rhyming end-of-the-line words, (2) ***** shuffling ***** two rhyming lines, (3) replacing end-of-the-line word by a non-rhyming synonym. | ||
| 2021.emnlp-main.232 These models typically corrupt the given sequences with certain types of noise, such as masking, ***** shuffling *****, or substitution, and then try to recover the original input. | ||
| 2021.naacl-main.20 Using parallel data, our method aligns embeddings on the word level through the recently proposed Translation Language Modeling objective as well as on the sentence level via contrastive learning and random input ***** shuffling ***** | ||
| SentiMix | 8 | |
| 2020.semeval-1.121 In this paper, we present the results that the team IIITG-ADBU (codalab username `abaruah') obtained in the ***** SentiMix ***** task (Task 9) of the International Workshop on Semantic Evaluation 2020 (SemEval 2020). | ||
| 2020.semeval-1.100 In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of Code-Mixed Tweets (***** SentiMix ***** 2020). | ||
| 2020.semeval-1.178 SemEval-2020 Task 9, ***** SentiMix *****, was organized with the purpose of detecting the sentiment of a given code-mixed tweet comprising Hindi and English. | ||
| 2020.semeval-1.176 We explore the task of sentiment analysis on Hinglish (code-mixed Hindi-English) tweets as participants of Task 9 of the SemEval-2020 competition, known as the ***** SentiMix ***** task. | ||
| 2020.semeval-1.163 This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: ***** SentiMix ***** | ||
| leakage | 8 | |
| 2020.acl-main.209 For example, facts can appear in different paraphrased textual variants, which can lead to test ***** leakage *****. | ||
| 2021.eacl-main.113 We identify ***** leakage ***** of training data into test data on several publicly available datasets used to evaluate NLP tasks, including named entity recognition and relation extraction, and study them to assess the impact of that ***** leakage ***** on the model's ability to memorize versus generalize. | ||
| 2021.acl-long.146 Furthermore, incorporating illustrative cases and external contexts improve knowledge prediction mainly due to entity type guidance and golden answer ***** leakage *****. | ||
| 2020.privatenlp-1.1 Privacy auditing tools for measuring ***** leakage ***** from sensitive datasets assess the total privacy ***** leakage ***** based on the adversary's predictions for datapoint membership. | ||
| 2021.eacl-main.207 We prove our algorithm's theoretical privacy guarantee and assess its privacy ***** leakage ***** under Membership Inference Attacks (MIA) on models trained with transformed data | ||
| TurkuNLP | 8 | |
| K18-2013 In this paper we describe the ***** TurkuNLP ***** entry at the CoNLL 2018 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies. | ||
| 2020.lrec-1.452 Our approach for the construction of the benchmark builds upon the wide-coverage multilingual sense inventory of BabelNet, the multilingual neural parsing pipeline ***** TurkuNLP *****, and the OPUS collection of translated texts from the web. | ||
| D19-5728 We present the approach taken by the ***** TurkuNLP ***** group in the CRAFT Structural Annotation task, a shared task on dependency parsing. | ||
| 2020.iwpt-1.17 We present the approach of the ***** TurkuNLP ***** group to the IWPT 2020 shared task on Multilingual Parsing into Enhanced Universal Dependencies | ||
| K17-3012 We present the ***** TurkuNLP ***** entry in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies. | ||
| handcrafting | 8 | |
| L16-1110 With the multi-layer architecture of the scoring function we can avoid ***** handcrafting ***** feature conjunctions. | ||
| I17-4031 In order to overcome this, we propose deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches that do not require ***** handcrafting ***** of features. | ||
| D18-1230 Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without ***** handcrafting ***** features. | ||
| W17-5403 Most g2p systems are monolingual: they require language-specific data or ***** handcrafting ***** of rules. | ||
| E17-1042 Currently, developing task-oriented dialogue systems requires creating multiple components and typically this involves either a large amount of ***** handcrafting *****, or acquiring costly labelled datasets to solve a statistical learning problem for each component | ||
| insights | 8 | |
| 1999.mtsummit-1.20 The project builds on ***** insights ***** and resources in large-scale development of parallel LFG grammars. | ||
| L14-1355 We feel that ***** insights ***** obtained from this analysis will provide guidelines for creating machine translation systems of specific Indian language pairs. | ||
| 2020.findings-emnlp.93 We study a range of datasets including recent tweets related to COVID-19 to illustrate the superior performance of our model and report ***** insights ***** on public emotions during the on-going pandemic. | ||
| 2021.emnlp-main.476 We discuss linguistic properties that are related to stability, drawing out ***** insights ***** about correlations with affixing, language gender systems, and other features. | ||
| 2020.wosp-1.2 However, through ***** insights ***** from the model, we demonstrate that these entities are identifiable with a small number of guesses primarily by using a combination of self-citations, social, and common citations | ||
| segment | 8 | |
| Q14-1014 A tradeoff must be found for ***** segment ***** sizes. | ||
| 2020.findings-emnlp.260 By leveraging the powerful ability of the Transformer encoder, the proposed unified model can ***** segment ***** Chinese text according to a unique criterion-token indicating the output criterion. | ||
| 2021.emnlp-main.714 To train most AMR parsers, one needs to ***** segment ***** the graph into subgraphs and align each such subgraph to a word in a sentence; this is normally done at preprocessing, relying on hand-crafted rules. | ||
| L14-1308 Some candidates for ***** segment ***** boundaries - where the topic continues - are irrelevant | ||
| 2021.mwe-1.6 In lexical semantics, full-sentence segmentation and ***** segment ***** labeling of various phenomena are generally treated separately, despite their interdependence. | ||
| cluster | 8 | |
| D18-1486 With the advantages of capsules for feature ***** cluster *****ing, proposed task routing algorithm can ***** cluster ***** the features for each task in the network, which helps reduce the interference among tasks. | ||
| W17-0901 We incorporate crowdsourced alignments as prior knowledge and show that exploiting a small number of alignments results in a substantial improvement in ***** cluster ***** quality over state-of-the-art models and provides an appropriate basis for the induction of temporal order. | ||
| D19-1089 We study two methods for language ***** cluster *****ing: (1) using prior knowledge, where we ***** cluster ***** languages according to language family, and (2) using language embedding, in which we represent each language by an embedding vector and ***** cluster ***** them in the embedding space. | ||
| 2020.lrec-1.123 For instance, in a bilingual Czech-German text collection containing parallel texts (originals and translations in both directions, along with Czech and German translations from other languages), authors would not ***** cluster ***** across languages, since frequency word lists for any Czech texts are obviously going to be more similar to each other than to a German text, and the other way round. | ||
| 2020.lrec-1.535 The analysis reveals that the articles in our data set ***** cluster ***** into seven categories related to different topical aspects of flooding, and that the images accompanying the articles ***** cluster ***** into five categories related to the content they depict | ||
| contrasting | 8 | |
| 2021.emnlp-main.584 Specifically, we normalize these scores across various neighborhoods of closely ***** contrasting ***** questions and/or answers, adding a cross entropy loss term in addition to traditional maximum likelihood estimation. | ||
| W17-5031 We present a very simple model for text quality assessment based on a deep convolutional neural network, where the only supervision required is one corpus of user-generated text of varying quality, and one ***** contrasting ***** text corpus of consistently high quality. | ||
| L16-1593 After discussing the properties of the BOLT IR corpus, we provide a detailed description of the query creation process, ***** contrasting ***** the summary query format presented to systems at run time with the full query format created by annotators. | ||
| 2021.acl-long.361 We also provide two specific implementations of the interventions based on entity ranking and context ***** contrasting *****. | ||
| 2021.nodalida-main.1 We assess the merits of these models using cloze tests and the state-of-the-art UDify parser on Universal Dependencies data, ***** contrasting ***** performance with results using the multilingual BERT (mBERT) model | ||
| insufficient | 8 | |
| 2020.emnlp-main.354 This is not surprising as the learning signal is likely ***** insufficient ***** for deriving all aspects of phrase-structure syntax and gradient estimates are noisy. | ||
| 2021.acl-long.178 We find that using the same time budget, HPO often fails to outperform grid search due to two reasons: ***** insufficient ***** time budget and overfitting. | ||
| N18-1111 In this work, we propose a multinomial adversarial network (MAN) to tackle this real-world problem of multi-domain text classification (MDTC) in which labeled data may exist for multiple domains, but in ***** insufficient ***** amounts to train effective classifiers for one or more of the domains. | ||
| W18-2405 The challenge of NER for tweets lie in the ***** insufficient ***** information available in a tweet. | ||
| 2018.gwc-1.2 Information extraction in the medical domain is laborious and time-consuming due to the ***** insufficient ***** number of domain-specific lexicons and lack of involvement of domain experts such as doctors and medical practitioners | ||
| computing | 8 | |
| L14-1299 The key ingredient is ***** computing ***** an alignment between letter strings and phoneme strings, a standard technique in pronunciation modeling. | ||
| 2003.mtsummit-papers.15 We describe an experiment in rapid development of a statistical machine translation (SMT) system from scratch, using limited resources: under this heading we include not only training data, but also ***** computing ***** power, linguistic knowledge, programming effort, and absolute time. | ||
| 2021.eacl-srw.20 Most works in food ***** computing ***** focus on generating new recipes from scratch. | ||
| W19-4805 In this work, we define multi-granular ngrams as basic units for explanation, and organize all ngrams into a hierarchical structure, so that shorter ngrams can be reused while ***** computing ***** longer ngrams. | ||
| 2021.ranlp-1.101 We question the use of supervised state-of-the-art models in such a context, where resources such as time, ***** computing ***** power and human annotators are limited | ||
| adapt | 8 | |
| 2021.rocling-1.23 Due to the increase of aspect categories, the model must be retrained frequently to fast ***** adapt ***** to the newly added aspect category data. | ||
| 1997.mtsummit-papers.3 In making MT work for them, however, SAP has also had to substantially ***** adapt ***** the products that they received from MT companies. | ||
| C18-1305 But more importantly, we show how to use reinforcement learning (RL) to further ***** adapt ***** the ***** adapt *****ed translator, where translated sentences with more proper slot tags receive higher rewards. | ||
| 2021.eacl-main.26 Experiments show that: (1) Our IR-based retrieval method is able to collect high-quality candidates efficiently, thus enables our method ***** adapt ***** to large-scale KBs easily; (2) the BERT model improves the accuracy across all three sub-tasks; and (3) benefiting from multi-task learning, the unified model obtains further improvements with only 1/3 of the original parameters. | ||
| 2020.aacl-main.30 By doing so, our model can directly ***** adapt ***** to the unseen emotions in any modality since we have their pre-trained embeddings and modality mapping functions | ||
| plausible | 8 | |
| 2020.lt4hala-1.12 We first carry out our experiments on ***** plausible ***** artificial languages, without noise, in order to study the role of each parameter on the algorithms respective performance under almost perfect conditions. | ||
| 2021.acl-long.237 Instead of directly scoring each answer choice, our method first generates a set of ***** plausible ***** answers with generative models (e.g., GPT-2), and then uses these ***** plausible ***** answers to select the correct choice by considering the semantic similarity between each ***** plausible ***** answer and each choice. | ||
| 2020.acl-main.723 Building on measures developed for resource-bounded document retrieval, we introduce a well founded evaluation paradigm, and demonstrate using an expert-annotated test collection that meaningful improvements over ***** plausible ***** cascade model baselines can be achieved using an approach that jointly ranks individuals and their social media posts. | ||
| W19-5105 The model captures generalisations over this data and learns what combinations give rise to ***** plausible ***** compounds and which ones do not. | ||
| 2020.lrec-1.526 In this task, a system learns ***** plausible ***** positions of images in a given document | ||
| truecasing | 8 | |
| D19-1650 In this work, we perform a systematic analysis of solutions to this problem, modifying only the casing of the train or test data using lowercasing and ***** truecasing ***** methods. | ||
| 2020.wnut-1.19 In this paper, we investigate ***** truecasing ***** as an intrinsic task and present several experiments on noisy user queries to a voice-controlled dialog system. | ||
| 2014.iwslt-evaluation.18 Individual systems employ training data selection for domain adaptation, ***** truecasing *****, compound word splitting (for GermanEnglish), interpolated n-gram language models, and hypotheses rescoring using recurrent neural network language models. | ||
| 2020.nlpmc-1.8 ASR output typically undergoes automatic punctuation to enable users to speak naturally, without having to vocalize awkward and explicit punctuation commands, such as “period”, “add comma” or “exclamation point”, while ***** truecasing ***** enhances user readability and improves the performance of downstream NLP tasks. | ||
| N18-3015 Post-processing entails a host of transformations including punctuation restoration, ***** truecasing *****, marking sections and headers, converting dates and numerical expressions, parsing lists, etc | ||
| EACL | 8 | |
| W17-1507 The CORBON 2017 Shared Task, organised as part of the Coreference Resolution Beyond OntoNotes workshop at ***** EACL ***** 2017, presented a new challenge for multilingual coreference resolution: we offer a projection-based setting in which one is supposed to build a coreference resolver for a new language exploiting little or even no knowledge of it, with our languages of interest being German and Russian. | ||
| W17-1412 It was organised in the context of the 6th Balto-Slavic Natural Language Processing Workshop, co-located with the ***** EACL ***** 2017 conference. | ||
| 2021.dravidianlangtech-1.19 In this work, we participated in the ***** EACL ***** task to detect offensive content in the code-mixed social media scenario. | ||
| W17-1001 In this brief report we present an overview of the MultiLing 2017 effort and workshop, as implemented within ***** EACL ***** 2017 | ||
| 2021.dravidianlangtech-1.31 This paper introduces the related content of the task Offensive Language Identification in Dravidian Languages - ***** EACL ***** 2021. | ||
| Neural generative | 8 | |
| P19-1004 ***** Neural generative ***** models have been become increasingly popular when building conversational agents. | ||
| W19-3413 ***** Neural generative ***** models have shown promising results for various text generation problems. | ||
| P19-1371 ***** Neural generative ***** models for open-domain chit-chat conversations have become an active area of research in recent years. | ||
| 2021.sigdial-1.1 ***** Neural generative ***** dialogue agents have shown an increasing ability to hold short chitchat conversations, when evaluated by crowdworkers in controlled settings. | ||
| 2020.acl-main.60 ***** Neural generative ***** models have achieved promising performance on dialog generation tasks if given a huge data set | ||
| standardized | 8 | |
| 1999.mtsummit-1.43 This study aims at ***** standardized ***** evaluation of MT for the WWW. | ||
| W16-5209 To address this issue, standardization is inevitable: ***** standardized ***** interfaces are necessary for language services as well as data format required for language resources. | ||
| L10-1311 It partially proves the value of an international standard like LAF/GrAF in the Web service context: an existing dependency parser can be, in a sense, ***** standardized *****, once wrapped by a data format conversion process. | ||
| 2020.lrec-1.97 At first glance, it looks too ***** standardized *****. | ||
| L16-1122 Voice quality concepts are fuzzily defined and poorly ***** standardized ***** however, which hinders scientific and clinical communication | ||
| prose | 8 | |
| W16-5108 However, most of these techniques focus on ***** prose *****, while much important biomedical data reside in tables. | ||
| W19-4702 Datasets of popular ***** prose ***** and poetry spanning across 1870-1920 and 1970-2019 have been created, and multiple experiments have been conducted to prove that ***** prose ***** and poetry in the latter period are more alike than they were in the former. | ||
| W19-2507 Using these features, we classify almost all surviving classical Greek literature as ***** prose ***** or verse with 97% accuracy and F1 score, and further classify a selection of the verse texts into the traditional genres of epic and drama. | ||
| P19-1111 The word ordering in a Sanskrit verse is often not aligned with its corresponding ***** prose ***** order. | ||
| D18-1099 The text in many web documents is organized into a hierarchy of section titles and corresponding ***** prose ***** content, a structure which provides potentially exploitable information on discourse structure and topicality. | ||
| sampling | 8 | |
| N18-1133 Because knowledge graphs typically only contain positive facts, ***** sampling ***** useful negative training examples is a nontrivial task. | ||
| 2021.wnut-1.4 We introduce negative ***** sampling ***** to adjust training loss, and conduct experiments under different scenarios. | ||
| P19-1278 By applying our method to various tasks, we also find that (1) our approach could effectively detect redundant relations extracted by open information extraction (Open IE) models, that (2) even the most competitive models for relational classification still make mistakes among very similar relations, and that (3) our approach could be incorporated into negative ***** sampling ***** and softmax classification to alleviate these mistakes. | ||
| 2021.sustainlp-1.11 Lastly, we show the trade-off between speed and performance for all ***** sampling ***** methods on three different datasets. | ||
| 2021.iwcs-1.10 We examine the performance of both models and discuss their adjustments, such as ***** sampling ***** of additional training instances from an unrelated domain and adding extra lexical and discourse features to input token representations | ||
| coordinate | 8 | |
| 2010.amta-government.7 The ADS translates large scale lists of names from foreign language to English and also pinpoints place names appearing in reports with their ***** coordinate ***** locations on maps. | ||
| 2020.acl-main.587 MAE is trained using a block ***** coordinate ***** descent algorithm that alternates between updating (1) the responsibilities of the experts and (2) their parameters. | ||
| N19-1343 For inference, we make use of probabilities of coordinators and conjuncts in the CKY parsing to find the optimal combination of ***** coordinate ***** structures. | ||
| W18-6009 This paper discusses the representation of ***** coordinate ***** structures in the Universal Dependencies framework for two head-final languages, Japanese and Korean. | ||
| 2021.eacl-main.67 In this paper, we address the representation of ***** coordinate ***** constructions in Enhanced Universal Dependencies (UD), where relevant dependency links are propagated from conjunction heads to other conjuncts | ||
| graphical models | 8 | |
| 2020.findings-emnlp.390 (2) Using the CoS-E and e-SNLI datasets, we evaluate two existing generative ***** graphical models ***** and two new approaches; one rationalizing method we introduce achieves roughly human-level LAS scores. | ||
| D17-1277 Our approach thereby combines benefits of deep learning with more traditional approaches such as ***** graphical models ***** and probabilistic mention-entity maps. | ||
| D17-1043 Advances in neural variational inference have facilitated the learning of powerful directed ***** graphical models ***** with continuous latent variables, such as variational autoencoders. | ||
| 2020.emnlp-main.406 More expressive ***** graphical models ***** are rarely used due to their prohibitive computational cost. | ||
| 2021.naacl-main.404 To explain the empirical success of these generic masks, we demonstrate a correspondence between the Masked Language Model (MLM) objective and existing methods for learning statistical dependencies in ***** graphical models *****. | ||
| automated extraction | 8 | |
| 2020.aespen-1.1 We describe our effort on ***** automated extraction ***** of socio-political events from news in the scope of a workshop and a shared task we organized at Language Resources and Evaluation Conference (LREC 2020). | ||
| L16-1218 This unique material opens way to the renewal of MRD-based methods, notably the ***** automated extraction ***** and acquisition of semantic relations. | ||
| L06-1216 In this paper we present on-going investigations on how complex syntactic annotation, combined with linguistic semantics, can possibly help in supporting the semi-automatic building of (shallow) ontologies from text by proposing an ***** automated extraction ***** of (possibly underspecified) semantic relations from linguistically annotated text. | ||
| L16-1059 The paper investigates the extent of the support semi-automatic analysis can provide for the specific task of assigning Hohfeldian relations of Duty, using the General Architecture for Text Engineering tool for the ***** automated extraction ***** of Duty instances and the bearers of associated roles. | ||
| 2021.acl-tutorials.2 This tutorial will provide audience with a systematic introduction of (i) knowledge representations of events, (ii) various methods for ***** automated extraction *****, conceptualization and prediction of events and their relations, (iii) induction of event processes and properties, and (iv) a wide range of NLU and commonsense understanding tasks that benefit from aforementioned techniques. | ||
| sentence length | 8 | |
| W19-3409 For genre identification, previous work had proposed three classes of features, viz., low-level (character-level and token counts), high-level (lexical and syntactic information) and derived features (type-token ratio, average word length or average ***** sentence length *****). | ||
| W19-4611 We investigate the effect of ***** sentence length ***** and embedding size on the learning process. | ||
| K19-1031 Further experiments on length-controlled training data reveal that absolute position actually causes overfitting to the ***** sentence length *****. | ||
| W19-8620 In this work, we further investigate this hypothesis on a new large scale multimodal Machine Translation (MMT) dataset, How2, which has 1.57 times longer mean ***** sentence length ***** than Multi30k and no repetition. | ||
| L12-1172 We examined several factors of text complexity (average ***** sentence length *****, Automated Readability Index, sentence complexity and passive voice) in the 20th century for two main English language varieties - British and American, using the `Brown family' of corpora. | ||
| completion | 8 | |
| 2021.naacl-industry.1 Accurate real-time phrase ***** completion ***** can save time and bolster productivity. | ||
| 2021.blackboxnlp-1.4 Unlike scoring-based methods for targeted syntactic evaluation, this technique makes it possible to explore ***** completion *****s that are not hypothesized in advance by the researcher. | ||
| D17-1074 We overview the theoretical motivation for a paradigmatic treatment of derivational morphology, and introduce the task of derivational paradigm ***** completion ***** as a parallel to inflectional paradigm ***** completion *****. | ||
| 2021.naacl-main.191 We propose a score to measure hurtful sentence ***** completion *****s in language models (HONEST). | ||
| 2021.naacl-main.202 Moreover, we investigate the effect of the temporal dataset's time granularity on temporal knowledge graph ***** completion *****. | ||
| records | 8 | |
| D19-1310 To address aforementioned problems, not only do we model each table cell considering other ***** records ***** in the same row, we also enrich table's representation by modeling each table cell in context of other cells in the same column or with historical (time dimension) data respectively. | ||
| W18-5621 Natural Language Processing (NLP) methods can be used to extract this data, in order to identify symptoms and treatments from mental health ***** records *****, and temporally anchor the first emergence of these. | ||
| L12-1427 This poster will present the Glottolog/Langdoc project, a comprehensive bibliography providing web access to 180k bibliographical ***** records ***** to (mainly) low visibility resources from low-density languages. | ||
| 2020.lrec-1.323 The audio ***** records ***** were transcribed using the written conventions specifically developed for the language and translated into French. | ||
| W19-3814 Bidirectional Encoder Representations from Transformers (BERT) has broken several NLP task ***** records ***** and can be used on GAP dataset. | ||
| integration | 8 | |
| W18-1401 The challenge for computational models of spatial descriptions for situated dialogue systems is the ***** integration ***** of information from different modalities. | ||
| W19-2912 Inspired by the literature on multisensory ***** integration *****, we develop a computational model to ground quantifiers in perception. | ||
| P17-1170 Despite its potential benefits, the ***** integration ***** of sense-level information into NLP systems has remained understudied. | ||
| W16-5201 Kathaa exposes an intuitive web based Interface for the users to interact with and modify complex NLP Systems; and a precise Module definition API to allow easy ***** integration ***** of new state of the art NLP components. | ||
| L10-1351 We describe an experimental Wizard-of-Oz setup for the ***** integration ***** of emotional strategies into spoken dialogue management. | ||
| multimodal interaction | 8 | |
| W18-6903 This work outlines the building-blocks for providing an individual, ***** multimodal interaction ***** experience by shaping the robot's humor with the help of Natural Language Generation and Reinforcement Learning based on human social signals. | ||
| 2021.emnlp-main.189 We conclude that (SPT) along with parameter sharing can capture ***** multimodal interaction *****s with reduced model size and improved sample efficiency. | ||
| 2021.wnut-1.11 While many multimodal neural techniques have been proposed to incorporate images into the MNER task, the model's ability to leverage ***** multimodal interaction *****s remains poorly understood. | ||
| L06-1401 Third generation (3G) services boost mobile ***** multimodal interaction ***** offering users richer communication alternatives for accessing different applications and information services. | ||
| 2020.acl-main.306 To tackle the first issue, we propose a ***** multimodal interaction ***** module to obtain both image-aware word representations and word-aware visual representations. | ||
| spell checking | 8 | |
| L10-1557 STeP-1 (Standard Text Preparation for Persian language) performs a combination of tokenization, ***** spell checking *****, morphological analysis and POS tagging. | ||
| L12-1344 Available open-source ***** spell checking ***** resources for Arabic are too small and inadequate. | ||
| D17-1288 Statistical Machine Translation and ***** spell checking *****, with the help of a ranking mechanism tremendously improves over single-handed approaches. | ||
| 2021.americasnlp-1.11 Our analyzer will serve both as a tool to better document the Yine language and as a component of natural language processing (NLP) applications such as ***** spell checking ***** and correction. | ||
| L12-1423 This paper presents some novel results on Chinese ***** spell checking *****. | ||
| text and speech | 8 | |
| K19-1083 The proposed method achieved state of the art performance in both ***** text and speech ***** related tasks. | ||
| 2020.iwslt-1.34 In our analysis we – (i) detail two non-invasive ways of detecting translationese and (ii) compare translationese across human and machine translations from ***** text and speech *****. | ||
| L12-1670 The work is a step forward in the direction of development of standards for mobile ***** text and speech ***** data collection for Indian languages. | ||
| L16-1145 The corpus produced is by far the largest multi-lingual, multi-level and multi-genre annotation corpus of informal ***** text and speech *****. | ||
| W16-4017 We employ language processing tools to align ***** text and speech *****, to generate a null-model of how the poem would be spoken by a naïve reader, and to extract contrastive prosodic features used by the poet. | ||
| causal inference | 8 | |
| 2021.cinlp-1.8 We use propensity score stratification, a ***** causal inference ***** method for observational data, and estimate whether the amount of comments —as a measure of social support— increases or decreases the likelihood of posting again on SW. One hypothesis is that receiving more comments may decrease the likelihood of the user posting in SW in the future, either by reducing symptoms or because comments from untrained peers may be harmful. | ||
| 2021.emnlp-main.763 This motivates us to propose counterfactual IE (CFIE), a novel framework that aims to uncover the main causalities behind data in the view of ***** causal inference *****. | ||
| 2021.emnlp-main.748 While this idea has led to fruitful developments in the field of ***** causal inference *****, it is not widely-known in the NLP community. | ||
| 2021.cinlp-1.7 We anticipate this work to provide useful insights about publication trends and behavior and raise the awareness about the potential for ***** causal inference ***** in the computational linguistics and a broader scientific community. | ||
| 2021.cinlp-1.6 We posit that a better understanding of this problem will require the use of ***** causal inference ***** frameworks. | ||
| open source software | 8 | |
| L10-1592 We also report on our plans for making our custom-built software resources available to the community as ***** open source software *****, and introduce an initiative to collaborate with software developers outside LDC. | ||
| L10-1366 The software written to create this corpus was designed in MATLAB with help of hardware specific software provided by the hardware manufacturers and freely available ***** open source software *****. | ||
| L16-1572 We distribute the focused crawler as ***** open source software *****. | ||
| C16-2049 Ambient search is available as ***** open source software *****. | ||
| L08-1328 BART has been released as ***** open source software ***** and is available from http://www.sfs.uni-tuebingen.de/~versley/BART | ||
| requirements | 8 | |
| 2010.amta-commercial.5 In this paper, we discuss some of our further use cases, and the varying ***** requirements ***** each use case has for quality, customization, cost, and other factors. | ||
| L16-1569 To meet these ***** requirements *****, we have adopted a highly modular microservice-based architecture. | ||
| E17-5002 The technical differences between NMT and the previously dominant phrase-based statistical approach require that practitioners learn new best practices for building MT systems, ranging from different hardware ***** requirements *****, new techniques for handling rare words and monolingual data, to new opportunities in continued learning and domain adaptation. This tutorial is aimed at researchers and users of machine translation interested in working with NMT. | ||
| D19-6102 We highlight the differences of these approaches in terms of unlabeled data ***** requirements ***** and capability to overcome additional domain shift in the data. | ||
| W19-8613 One of the key ***** requirements ***** of QG is to generate a question such that it results in a target answer. | ||
| contextual features | 8 | |
| 2020.coling-main.20 Our models extend the architecture of BERT by incorporating both affective and ***** contextual features *****. | ||
| 2020.emnlp-main.487 However, existing studies have made limited efforts to leverage ***** contextual features ***** except for applying powerful encoders (e.g., bi-LSTM). | ||
| W18-0517 We submitted two systems: one system modeled each word to evaluate as a numeric vector populated with a set of lexical, semantic and ***** contextual features ***** while the other system relies on a word embedding representation and a distance metric. | ||
| C16-1231 In particular, we use a bi-directional gated recurrent neural network to capture syntactic and semantic information over tweets locally, and a pooling neural network to extract ***** contextual features ***** automatically from history tweets. | ||
| W17-5104 We also developed a set of ***** contextual features ***** that further improves the state-of-the-art for this task. | ||
| proper nouns | 8 | |
| L08-1479 However recognition of ***** proper nouns ***** is commonly considered as a difficult task. | ||
| D19-1328 We study the composition and quality of the test sets for five diverse languages from this dataset, with concerning findings: (1) a quarter of the data consists of ***** proper nouns *****, which can be hardly indicative of BDI performance, and (2) there are pervasive gaps in the gold-standard targets. | ||
| L08-1551 Disambiguation was carried out for all nouns, ***** proper nouns ***** and adjectives in the sample, all of which were assigned EuroWordNet (EWN) synsets. | ||
| P18-2011 In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using ***** proper nouns *****, pronouns or noun phrases with common noun headword. | ||
| D19-1553 However, the second step bears both the target sentiment addition and content reconstruction, thus resulting in a lack of specific information like ***** proper nouns ***** in the generated text. | ||
| policies | 8 | |
| N18-1197 Can this linguistic background knowledge improve the generality and efficiency of learned classifiers and control ***** policies *****? | ||
| 2021.sigdial-1.52 Neural models implicitly memorize task-specific dialog ***** policies ***** from the training data. | ||
| 2021.emnlp-main.573 In this paper, we perform a large-scale empirical study to investigate the effect of various masking ***** policies ***** in intermediate pre-training with nine selected tasks across three categories. | ||
| L14-1629 The NOMAD project (Policy Formulation and Validation through non Moderated Crowd-sourcing) is a project that supports policy making, by providing rich, actionable information related to how citizens perceive different ***** policies *****. | ||
| 2020.lrec-1.561 A popular application for that purpose is named entity recognition (NER), but the annotation ***** policies ***** of existing clinical corpora have not been standardized across clinical texts of different types. | ||
| temporal ordering | 8 | |
| 2021.acl-long.555 On the ***** temporal ordering ***** task, we show that our model is able to unscramble event sequences from existing datasets without access to explicitly labeled temporal training data, outperforming both a BERT-based pairwise model and a BERT-based pointer network. | ||
| W19-2404 In this paper, we advocate the use of Message Sequence Chart (MSC) as a knowledge representation to capture and visualize multi-actor interactions and their ***** temporal ordering *****. | ||
| I17-1085 While this paper focuses on ***** temporal ordering *****, its results are applicable to other areas that use sieve-based architectures. | ||
| L10-1029 Narrative schemas contain sets of related events (edit and publish), a ***** temporal ordering ***** of the events (edit before publish), and the semantic roles of the participants (authors publish books). | ||
| P19-1433 Data-driven models have demonstrated state-of-the-art performance in inferring the ***** temporal ordering ***** of events in text. | ||
| linguistic patterns | 8 | |
| I17-1015 However, little is known about ***** linguistic patterns ***** of morphology, syntax and semantics learned during the training of NMT systems, and more importantly, which parts of the architecture are responsible for learning each of these phenomenon. | ||
| N19-1148 Further, the attention weights in the learned model confirm that the model finds expected ***** linguistic patterns ***** for each category. | ||
| L14-1511 First, keywords are extracted using a hybrid approach mixing ***** linguistic patterns ***** with statistical information. | ||
| P17-1037 By applying these ***** linguistic patterns ***** to a collection of tweets, we extract statements agreeing and disagreeing with various topics. | ||
| R17-1074 Evaluation shows that phrase level ***** linguistic patterns ***** as well as the adopted features are highly active in capturing characters and their adjectives. | ||
| selectional preference | 8 | |
| L10-1434 Our second interest lies in the actual comparison of the models: How does a very simple distributional model compare to much more complex approaches, and which representation of ***** selectional preference *****s is more appropriate, using (i) second-order properties, (ii) an implicit generalisation of nouns (by clusters), or (iii) an explicit generalisation of nouns by WordNet classes within clusters? | ||
| C16-1266 This paper proposes a novel problem setting of ***** selectional preference ***** (SP) between a predicate and its arguments, called as context-sensitive SP (CSP). | ||
| L14-1254 While the current version of the dictionary concentrates on syntax, it already contains some semantic features, including semantically defined arguments, such as locative, temporal or manner, as well as control and raising, and work on extending it with semantic roles and ***** selectional preference *****s is in progress. | ||
| 2020.acl-main.337 Our results on ***** selectional preference ***** and WordNet datasets show that the centroid-based model will fail to achieve good enough performance, the geometry of the distribution and the existence of subgroups will have limited impact, and also the negative instances need to be considered for adequate modeling of the distribution. | ||
| 2019.lilt-17.1 The analysis has two key components (i) an underspecified category for the nominal and (ii) combinatorial constraints on the noun and light verb to specify ***** selectional preference *****s. | ||
| ambiguous word | 8 | |
| 2020.emnlp-main.283 In this paper, we propose a simple method to provide annotations for most un***** ambiguous word *****s in a large corpus. | ||
| W18-6304 We hypothesize that attention mechanisms pay more attention to context tokens when translating ***** ambiguous word *****s. | ||
| 2021.acl-long.65 In this paper, we ask several questions: What contexts do human translators use to resolve ***** ambiguous word *****s? | ||
| 2021.acl-long.406 Meanwhile, we enhance the context embedding learning with selected sentences from the same document, rather than utilizing only the sentence where each ***** ambiguous word ***** appears. | ||
| P19-1574 Empirical analysis of embeddings of ***** ambiguous word *****s is currently limited by the small size of manually annotated resources and by the fact that word senses are treated as unrelated individual concepts. | ||
| machine translation shared | 8 | |
| 2020.wmt-1.22 In this paper, we introduced our joint team SJTU-NICT `s participation in the WMT 2020 ***** machine translation shared ***** task. | ||
| 2020.findings-emnlp.375 Recent ***** machine translation shared ***** tasks have shown top-performing systems to tie or in some cases even outperform human translation. | ||
| 2021.americasnlp-1.27 We present the submission of REPUcs to the AmericasNLP ***** machine translation shared ***** task for the low resource language pair Spanish–Quechua. | ||
| 2021.wmt-1.38 This paper describes DUT-NLP Lab's submission to the WMT-21 triangular ***** machine translation shared ***** task. | ||
| 2020.wmt-1.135 We describe NITS-CNLP's submission to WMT 2020 unsupervised ***** machine translation shared ***** task for German language (de) to Upper Sorbian (hsb) in a constrained setting i.e, using only the data provided by the organizers. | ||
| composition | 8 | |
| L12-1283 This work is part of a project for MWE extraction and characterization using different techniques aiming at measuring the properties related to idiomaticity, as institutionalization, non-***** composition *****ality and lexico-syntactic fixedness. | ||
| 1997.mtsummit-workshop.4 (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in (Olsen, 1994; Olsen, 1997) (which provides necessary but not sufficient conditions for aspectual composition), and a refinement of the verb classifications in (Levin, 1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and ***** composition *****al operations) of the precise meaning components responsible for Levin's classification. | ||
| 2020.repl4nlp-1.22 We introduce a novel metric, Polarity Sensitivity Scoring (PSS), which utilizes sentiment perturbations as a proxy for measuring ***** composition *****ality. | ||
| 2020.acl-main.119 At each time step, our model performs multiple rounds of attention, reasoning, and ***** composition ***** that aim to answer two critical questions: (1) which part of the input sequence to abstract; and (2) where in the output graph to construct the new concept. | ||
| W17-1713 We use word alignment variance as an indicator for the non-***** composition *****ality of German and English noun compounds. | ||
| opinion words | 8 | |
| 2021.emnlp-main.317 The aspect and ***** opinion words ***** are expected to be closer along such tree structure compared to the standard dependency parse tree. | ||
| 2020.acl-main.295 Extensive experiments are conducted on the SemEval 2014 and Twitter datasets, and the experimental results confirm that the connections between aspects and ***** opinion words ***** can be better established with our approach, and the performance of the graph attention network (GAT) is significantly improved as a consequence. | ||
| D19-1569 We propose a method based on neural networks to identify the sentiment polarity of ***** opinion words ***** expressed on a specific aspect of a sentence. | ||
| 2021.eacl-main.285 The intuition behind the posterior regularization is that if extracted ***** opinion words ***** from two documents are semantically similar, the posterior distributions of two documents should be similar. | ||
| N19-1259 Opinion target extraction and *****opinion words***** extraction are two fundamental subtasks in Aspect Based Sentiment Analysis (ABSA). | ||
| opinion target | 8 | |
| 2020.coling-main.70 Aspect-level sentiment classification (ASC) aims to detect the sentiment polarity of a given ***** opinion target ***** in a sentence. | ||
| D17-1047 The weighted-memory mechanism not only helps us avoid the labor-intensive feature engineering work, but also provides a tailor-made memory for different ***** opinion target *****s of a sentence. | ||
| D19-1465 Aspect words, indicating ***** opinion target *****s, are essential in expressing and understanding human opinions. | ||
| 2020.lrec-1.203 An annotation scheme has been proposed for the annotation of emotion-related information including the emotion type, the emotion cause, the emotion reaction, the use of rhetorical question, the ***** opinion target ***** (i.e. | ||
| D19-5517 In order to tackle such situations, we applied a model that is reported to handle context in many natural language processing areas, to the problem of extracting references to the ***** opinion target ***** from text. | ||
| informal | 8 | |
| 2020.peoples-1.11 Emojis are a widely used tool for encoding emotional content in *****informal***** messages such as tweets, and predicting which emoji corresponds to a piece of text can be used as a proxy for measuring the emotional content in the text. | ||
| 2020.lrec-1.765 Swearing plays an ubiquitous role in everyday conversations among humans, both in oral and textual communication, and occurs frequently in social media texts, typically featured by *****informal***** language and spontaneous writing. | ||
| W18-6538 The TL;DR challenge fosters research in abstractive summarization of *****informal***** text, the largest and fastest-growing source of textual data on the web, which has been overlooked by summarization research so far. | ||
| 2020.findings-emnlp.212 Formality style transfer is the task of converting *****informal***** sentences to grammatically-correct formal sentences, which can be used to improve performance of many downstream NLP tasks. | ||
| 2021.wanlp-1.47 Sarcasm detection is one of the top challenging tasks in text classification, particularly for *****informal***** Arabic with high syntactic and semantic ambiguity. | ||
| autonomous | 8 | |
| 2020.nl4xai-1.2 Algorithmic-based decision making powered via AI and (big) data has already penetrated into almost all spheres of human life, from content recommendation and healthcare to predictive policing and *****autonomous***** driving, deeply affecting everyone, anywhere, anytime. | ||
| W17-5544 We present the implementation of an *****autonomous***** chatbot, SHIHbot, deployed on Facebook, which answers a wide variety of sexual health questions on HIV/AIDS. | ||
| 2020.acl-main.229 Learning to follow instructions is of fundamental importance to *****autonomous***** agents for vision-and-language navigation (VLN). | ||
| 2020.emnlp-main.704 Text-based games present a unique challenge for *****autonomous***** agents to operate in natural language and handle enormous action spaces. | ||
| 2000.amta-papers.6 Research in computational linguistics, computer graphics and *****autonomous***** agents has led to the development of increasingly sophisticated communicative agents over the past few years, bringing new perspective to machine translation research. | ||
| Neural machine translation (NMT) | 8 | |
| W16-3717 *****Neural machine translation (NMT)***** models have recently been shown to be very successful in machine translation (MT). | ||
| 2020.coling-main.381 *****Neural machine translation (NMT)***** models usually suffer from catastrophic forgetting during continual training where the models tend to gradually forget previously learned knowledge and swing to fit the newly added data which may have a different distribution, e.g. | ||
| D18-1510 *****Neural machine translation (NMT)***** models are usually trained with the word-level loss using the teacher forcing algorithm, which not only evaluates the translation improperly but also suffers from exposure bias. | ||
| 2017.iwslt-1.17 *****Neural machine translation (NMT)***** systems have demonstrated promising results in recent years. | ||
| 2021.mtsummit-research.10 *****Neural machine translation (NMT)***** models are typically trained using a softmax cross-entropy loss where the softmax distribution is compared against the gold labels. | ||
| how | 8 | |
| W18-6543 We show (i) that rare items strongly impact performance; (ii) that combining delexicalisation and copying yields the strongest improvement; (iii) that copying underperforms for rare and unseen items and (iv) that the impact of these two mechanisms greatly varies depending on *****how***** the dataset is constructed and on how it is split into train, dev and test. | ||
| 2010.amta-government.14 Fully automatic alignment produces noisy data (e.g., containing OCR and alignment errors), and we are looking at the question of just *****how***** noisy noisy data can be and still produce translation improvements. | ||
| P19-1645 In contrast to previous segmentation models that treat word segmentation as an isolated task, our model unifies word discovery, learning how words fit together to form sentences, and, by conditioning the model on visual context, *****how***** words' meanings ground in representations of nonlinguistic modalities. | ||
| L12-1311 The recordings in this data collection are taken from a WOZ experiment that allows to investigate *****how***** users interact with a companion system in a mundane situation with the need for planning, re-planning and strategy change. | ||
| J19-1005 The task involves matching a response candidate with a conversation context, the challenges for which include how to recognize important parts of the context, and *****how***** to model the relationships among utterances in the context. | ||
| multi-word | 8 | |
| W18-3008 We explore representations for *****multi-word***** names in text classification tasks, on Reuters (RCV1) topic and sector classification. | ||
| L14-1720 Compounding is extremely productive in Icelandic and *****multi-word***** compounds are common. | ||
| 2020.udw-1.11 HDT-UD, the largest German UD treebank by a large margin, as well as the German-LIT treebank, currently do not analyze preposition-determiner contractions such as zum (= zu dem, to the) as *****multi-word***** tokens, which is inconsistent both with UD guidelines as well as other German UD corpora (GSD and PUD). | ||
| N18-4002 This paper presents two novel datasets and a random-forest classifier to automatically predict literal vs. non-literal language usage for a highly frequent type of *****multi-word***** expression in a low-resource language, i.e., Estonian. | ||
| 2021.semeval-1.84 In this paper we describe our participation in the Lexical Complexity Prediction (LCP) shared task of SemEval 2021, which involved predicting subjective ratings of complexity for English single words and *****multi-word***** expressions, presented in context. | ||
| pro-drop | 8 | |
| E17-2104 This paper presents a straightforward method to integrate co-reference information into phrase-based machine translation to address the problems of i) elided subjects and ii) morphological underspecification of pronouns when translating from *****pro-drop***** languages. | ||
| 2021.emnlp-main.197 Natural language generation (NLG) tasks on *****pro-drop***** languages are known to suffer from zero pronoun (ZP) problems, and the problems remain challenging due to the scarcity of ZP-annotated NLG corpora. | ||
| L12-1480 Thanks to their rich morphology, Italian and Spanish allow *****pro-drop***** pronouns, i.e., non lexically-realized subject pronouns. | ||
| D19-1085 Zero pronouns (ZPs) are frequently omitted in *****pro-drop***** languages, but should be recalled in non-pro-drop languages. | ||
| W18-6519 We extend the classic Referring Expressions Generation task by considering zero pronouns in *****pro-drop***** languages such as Chinese, modelling their use by means of the Bayesian Rational Speech Acts model (Frank and Goodman, 2012). | ||
| Aspect-based sentiment | 8 | |
| 2021.acl-long.494 *****Aspect-based sentiment***** analysis is a fine-grained sentiment classification task. | ||
| 2020.emnlp-main.568 *****Aspect-based sentiment***** analysis of review texts is of great value for understanding user feedback in a fine-grained manner. | ||
| 2020.acl-main.588 *****Aspect-based sentiment***** classification is a popular task aimed at identifying the corresponding emotion of a specific aspect. | ||
| 2020.acl-main.295 *****Aspect-based sentiment***** analysis aims to determine the sentiment polarity towards a specific aspect in online reviews. | ||
| N19-1257 *****Aspect-based sentiment***** analysis involves the recognition of so called opinion target expressions (OTEs). | ||
| noisy parallel | 8 | |
| W18-6475 The WMT 2018 Parallel Corpus Filtering Task aims to test various methods of filtering a *****noisy parallel***** corpus, to make it useful for training machine translation systems. | ||
| 1998.amta-papers.1 We present two problems for statistically extracting bilingual lexicon: (1) How can *****noisy parallel***** corpora be used? | ||
| 2020.coling-main.418 We propose a novel method of automatic sentence alignment from *****noisy parallel***** documents. | ||
| W18-6478 In this work we introduce dual conditional cross-entropy filtering for *****noisy parallel***** data. | ||
| W19-5436 The WMT19 Parallel Corpus Filtering For Low-Resource Conditions Task aims to test various methods of filtering *****noisy parallel***** corpora, to make them useful for training machine translation systems. | ||
| goal-oriented dialogue | 8 | |
| N19-1336 Recent research has demonstrated that *****goal-oriented dialogue***** agents trained on large datasets can achieve striking performance when interacting with human users. | ||
| 2021.naacl-main.239 Existing *****goal-oriented dialogue***** datasets focus mainly on identifying slots and values. | ||
| 2020.nlp4convai-1.12 Intent classification (IC) and slot filling (SF) are core components in most *****goal-oriented dialogue***** systems. | ||
| P19-1540 Multimodal dialogue systems have opened new frontiers in the traditional *****goal-oriented dialogue***** systems. | ||
| 2021.naacl-main.266 In *****goal-oriented dialogue***** systems, users provide information through slot values to achieve specific goals. | ||
| African | 8 | |
| D18-1008 To understand a sentence like whereas only 10% of White Americans live at or below the poverty line, 28% of *****African***** Americans do it is important not only to identify individual facts, e.g., poverty rates of distinct demographic groups, but also the higher-order relations between them, e.g., the disparity between them. | ||
| 2016.gwc-1.34 This paper presents a linguistic account of the lexical semantics of body parts in *****African***** WordNet, with special reference to Northern Sotho. | ||
| 2020.iwltp-1.11 Nowadays the scarcity and dispersion of open-source NLP resources and tools in and for *****African***** languages make it difficult for researchers to truly fit these languages into current algorithms of artificial intelligence, resulting in the stagnation of these numerous languages, as far as technological progress is concerned. | ||
| 2021.wmt-1.48 In this paper, we focus on the task of multilingual machine translation for *****African***** languages and describe our contribution in the 2021 WMT Shared Task: Large-Scale Multilingual Machine Translation. | ||
| 2020.rail-1.9 In a context where open-source NLP resources and tools in *****African***** languages are scarce and dispersed, it is difficult for researchers to truly fit African languages into current algorithms of artificial intelligence. | ||
| Machine learning | 8 | |
| 2021.naacl-main.143 *****Machine learning***** solutions are often criticized for the lack of explanation of their successes and failures. | ||
| 2020.emnlp-main.155 *****Machine learning***** techniques have been widely used in natural language processing (NLP). | ||
| 2021.rocling-1.28 *****Machine learning***** methods for financial document analysis have been focusing mainly on the textual part. | ||
| 2020.inlg-1.34 *****Machine learning***** algorithms have been applied to achieve high levels of accuracy in tasks associated with the processing of natural language. | ||
| Q15-1029 *****Machine learning***** approaches to coreference resolution vary greatly in the modeling of the problem: while early approaches operated on the mention pair level, current research focuses on ranking architectures and antecedent trees. | ||
| patient | 8 | |
| W16-4712 Annotating medical text such as clinical notes with human phenotype descriptors is an important task that can, for example, assist in building *****patient***** profiles. | ||
| 2021.rocling-1.9 Automatic Speech Recognition (ASR) technology presents the possibility for medical professionals to document *****patient***** record, diagnosis, postoperative care, patrol records, and etc. | ||
| 2020.multilingualbio-1.1 Electronic Health Records are a valuable source of *****patient***** information which can be leveraged to detect Adverse Drug Events (ADEs) and aid post-mark drug-surveillance. | ||
| W19-1919 Past prescriptions constitute a central element in *****patient***** records. | ||
| L08-1282 In this paper, we investigate the use of a machine-learning based approach to the specific problem of scientific term detection in *****patient***** information. | ||
| sub-word | 8 | |
| W18-2307 Word2vec embeddings are limited to computing vectors for in-vocabulary terms and do not take into account *****sub-word***** information. | ||
| 2020.starsem-1.5 In this paper, we propose a novel method for learning cross-lingual word embeddings, that incorporates *****sub-word***** information during training, and is able to learn high-quality embeddings from modest amounts of monolingual data and a bilingual lexicon. | ||
| 2021.emnlp-main.779 Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of 10,000s of *****sub-word***** tokens. | ||
| 2021.emnlp-main.161 Neural language models typically tokenise input text into *****sub-word***** units to achieve an open vocabulary. | ||
| W18-6303 Recent neural machine translation (NMT) systems have been greatly improved by encoder-decoder models with attention mechanisms and *****sub-word***** units. | ||
| Multilingual Emoji | 8 | |
| S18-1068 This paper describes our submissions to Task 2 in SemEval 2018, i.e., *****Multilingual Emoji***** Prediction. | ||
| S18-1067 This paper describes our participation in SemEval 2018 Task 2: *****Multilingual Emoji***** Prediction, in which participants are asked to predict a tweet's most associated emoji from 20 emojis. | ||
| S18-1081 In this paper we present the system submitted to the SemEval2018 task2: *****Multilingual Emoji***** Prediction. | ||
| S18-1003 This paper describes the results of the first Shared Task on *****Multilingual Emoji***** Prediction, organized as part of SemEval 2018. | ||
| S18-1071 This paper describes our approach, called EPUTION, for the open trial of the SemEval-2018 Task 2, *****Multilingual Emoji***** Prediction. | ||
| various | 8 | |
| 2021.acl-long.72 Like word embeddings, sentence embeddings are typically learned on large text corpora and then transferred to *****various***** downstream tasks, such as clustering and retrieval. | ||
| L14-1111 The aim of the experiment was to assess the feasibility of crowdsourcing methods for a complex semantic task such as distinguishing the eventive interpretation of polysemous nominals taking into consideration *****various***** types of syntagmatic cues. | ||
| 2020.acl-main.373 We propose: (1) a new characterization of sexist content inspired by speech acts theory and discourse analysis studies, (2) the first French dataset annotated for sexism detection, and (3) a set of deep learning experiments trained on top of a combination of several tweet's vectorial representations (word embeddings, linguistic features, and *****various***** generalization strategies). | ||
| J18-4008 To address this issue, we organize microblog messages as conversation trees based on their reposting and replying relations, and propose an unsupervised model that jointly learns word distributions to represent: (1) different roles of conversational discourse, and (2) *****various***** latent topics in reflecting content information. | ||
| 2020.coling-main.553 Recently, automatic detection of personality traits from written messages has gained significant attention in computational linguistics and natural language processing communities, due to its applicability in *****various***** fields. | ||
| translation memory ( TM | 8 | |
| 2012.amta-tutorials.6 Several studies have recently reported significant productivity gains by human translators when besides *****translation memory ( TM***** ) matches they do also receive suggestions from a statistical machine translation ( SMT ) engine . | ||
| 2021.triton-1.14 The aim of this paper is to investigate the similarity measurement approach of *****translation memory ( TM***** ) in five representative computer - aided translation ( CAT ) tools when retrieving inflectional verb - variation sentences in Arabic to English translation . | ||
| 2021.acl-long.246 It is generally believed that a *****translation memory ( TM***** ) should be beneficial for machine translation tasks . | ||
| 2003.mtsummit-papers.21 The multilingual machine translation system described in the first part of this paper demonstrates that the *****translation memory ( TM***** ) can be used in a creative way for making the translation process more automatic ( in a way which in fact does not depend on the languages used ) . | ||
| L14-1321 The term advanced leveraging refers to extensions beyond the current usage of *****translation memory ( TM***** ) in computer - aided translation ( CAT ) . | ||
| Neural machine translation | 8 | |
| C18-1255 *****Neural machine translation***** systems require a number of stacked layers for deep models . | ||
| 2020.acl-main.319 *****Neural machine translation***** systems tend to fail on less decent inputs despite its significant efficacy , which may significantly harm the credibility of these systemsfathoming how and when neural - based systems fail in such cases is critical for industrial maintenance . | ||
| 2019.iwslt-1.32 *****Neural machine translation***** models have shown to achieve high quality when trained and fed with well structured and punctuated input texts . | ||
| I17-2051 *****Neural machine translation***** decoders are usually conditional language models to sequentially generate words for target sentences . | ||
| 2021.acl-short.115 *****Neural machine translation***** models are often biased toward the limited translation references seen during training . | ||
| Semantic Textual Similarity ( STS | 8 | |
| S17-2031 In this paper we report our attempt to use , on the one hand , state - of - the - art neural approaches that are proposed to measure *****Semantic Textual Similarity ( STS***** ) . | ||
| S17-2022 *****Semantic Textual Similarity ( STS***** ) devotes to measuring the degree of equivalence in the underlying semantic of the sentence pair . | ||
| S17-2001 *****Semantic Textual Similarity ( STS***** ) measures the meaning similarity of sentences . | ||
| C16-1009 *****Semantic Textual Similarity ( STS***** ) is a foundational NLP task and can be used in a wide range of tasks . | ||
| R19-1116 Calculating the *****Semantic Textual Similarity ( STS***** ) is an important research area in natural language processing which plays a significant role in many applications such as question answering , document summarisation , information retrieval and information extraction . | ||
| Swedish | 8 | |
| 2021.nodalida-main.20 We train and test five open - source taggers , which use different methods , on three *****Swedish***** corpora , which are of comparable size but use different tagsets . | ||
| 2020.semeval-1.30 We examine semantic differences between specific words in two corpora , chosen from different time periods , for English , German , Latin , and *****Swedish***** . | ||
| L14-1103 Relations between frames and constructions must be made explicit in FrameNet - style linguistic resources such as Berkeley FrameNet ( Fillmore & Baker , 2010 , Fillmore , Lee - Goldman & Rhomieux , 2012 ) , Japanese FrameNet ( Ohara , 2013 ) , and *****Swedish***** Constructicon ( Lyngfelt et al . , 2013 ) . | ||
| L06-1081 We describe the implementation of a FrameNet - based semantic role labeling system for *****Swedish***** text . | ||
| L12-1241 We present the first results on semantic role labeling using the *****Swedish***** FrameNet , which is a lexical resource currently in development . | ||
| open - domain question answering ( QA | 8 | |
| 2021.sustainlp-1.7 In simple *****open - domain question answering ( QA***** ) , dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer . | ||
| 2021.eacl-demos.2 Although *****open - domain question answering ( QA***** ) draws great attention in recent years , it requires large amounts of resources for building the full system and it is often difficult to reproduce previous results due to complex configurations . | ||
| 2021.eacl-main.234 This paper proposes a new problem of complementary evidence identification for *****open - domain question answering ( QA***** ) . | ||
| D18-1053 Recently , *****open - domain question answering ( QA***** ) has been combined with machine comprehension models to find answers in a large knowledge source . | ||
| 2021.ranlp-1.44 Large transformer models , such as BERT , achieve state - of - the - art results in machine reading comprehension ( MRC ) for *****open - domain question answering ( QA***** ) . | ||
| plot | 8 | |
| D17-1168 In this paper , we present a story comprehension model that explores three distinct semantic aspects : ( i ) the sequence of events described in the story , ( ii ) its emotional trajectory , and ( iii ) its *****plot***** consistency . | ||
| L16-1028 Characters form the focus of various studies of literary works , including social network analysis , archetype induction , and *****plot***** comparison . | ||
| C18-1244 Folksonomy of movies covers a wide range of heterogeneous information about movies , like the genre , *****plot***** structure , visual experiences , soundtracks , metadata , and emotional experiences from watching a movie . | ||
| D19-1180 According to screenwriting theory , turning points ( e.g. , change of plans , major setback , climax ) are crucial narrative moments within a screenplay : they define the *****plot***** structure , determine its progression and segment the screenplay into thematic units ( e.g. , setup , complications , aftermath ) . | ||
| 2020.nuse-1.12 We construct emotion arcs based on event affect and implied sentiments , which correspond to *****plot***** elements in the story . | ||
| game | 8 | |
| L12-1299 We argue that a bootstrapping approach comprising state - of - the - art NLP tools for parsing and semantic interpretation , in combination with a wiki - like interface for collaborative annotation of experts , and a *****game***** with a purpose for crowdsourcing , are the starting ingredients for fulfilling this enterprise . | ||
| 2020.emnlp-main.624 In contrast to previous text games with mostly synthetic texts , IF games pose language understanding challenges on the human - written textual descriptions of diverse and sophisticated *****game***** worlds and language generation challenges on the action command generation from less restricted combinatorial space . | ||
| 2020.inlg-1.28 We propose a shared task on methodologies and algorithms for evaluating the accuracy of generated texts , specifically summaries of basketball games produced from basketball box score and other *****game***** data . | ||
| 2020.inlg-1.36 In recent years , referring expression genera- tion algorithms were inspired by *****game***** theory and probability theory . | ||
| P16-5006 The development of *****game***** theory in the early 1940 's by John von Neumann was a reaction against the then dominant view that problems in economic theory can be formulated using standard methods from optimization theory . | ||
| Natural Language Understanding ( NLU | 8 | |
| 2021.emnlp-main.489 *****Natural Language Understanding ( NLU***** ) is an established component within a conversational AI or digital assistant system , and it is responsible for producing semantic understanding of a user request . | ||
| W19-3644 Commonsense can be vital in some applications like *****Natural Language Understanding ( NLU***** ) , where it is often required to resolve ambiguity arising from implicit knowledge and underspecification . | ||
| 2021.naacl-main.255 Recent progress in *****Natural Language Understanding ( NLU***** ) has seen the latest models outperform human performance on many standard tasks . | ||
| 2021.mrl-1.18 Predicting user intent and detecting the corresponding slots from text are two key problems in *****Natural Language Understanding ( NLU***** ) . | ||
| 2020.spnlp-1.12 Successful application of Knowledge Representation and Reasoning ( KR ) in *****Natural Language Understanding ( NLU***** ) is largely limited by the availability of a robust and general purpose natural language parser . | ||
| Natural language processing | 8 | |
| 2021.acl-long.111 *****Natural language processing***** techniques have demonstrated promising results in keyphrase generation . | ||
| 2020.acl-main.680 *****Natural language processing***** models often have to make predictions on text data that evolves over time as a result of changes in language use or the information described in the text . | ||
| 2020.findings-emnlp.138 *****Natural language processing***** systems often struggle with out - of - vocabulary ( OOV ) terms , which do not appear in training data . | ||
| D19-6218 *****Natural language processing***** techniques are being applied to increasingly diverse types of electronic health records , and can benefit from in - depth understanding of the distinguishing characteristics of medical document types . | ||
| L10-1120 *****Natural language processing***** technology has developed remarkably , but it is still difficult for computers to understand contextual meanings as humans do . | ||
| Aspect - level sentiment | 8 | |
| C18-1066 *****Aspect - level sentiment***** analysis aims to distinguish the sentiment polarity of each specific aspect term in a given sentence . | ||
| D19-1551 *****Aspect - level sentiment***** classification is a crucial task for sentiment analysis , which aims to identify the sentiment polarities of specific targets in their context . | ||
| 2020.emnlp-main.451 *****Aspect - level sentiment***** analysis aims to recognize the sentiment polarity of an aspect or a target in a comment . | ||
| 2020.coling-main.69 *****Aspect - level sentiment***** classification aims to distinguish the sentiment polarities over aspect terms in a sentence . | ||
| P19-1052 *****Aspect - level sentiment***** classification aims to determine the sentiment polarity of a sentence towards an aspect . | ||
| conversational AI | 8 | |
| W18-5025 Aiming to expand the current research paradigm for training *****conversational AI***** agents that can address real - world challenges , we take a step away from traditional slot - filling goal - oriented spoken dialogue systems ( SDS ) and model the dialogue in a way that allows users to be more expressive in describing their needs . | ||
| D19-6101 New conversation topics and functionalities are constantly being added to *****conversational AI***** agents like Amazon Alexa and Apple Siri . | ||
| 2021.reinact-1.1 The next generation of *****conversational AI***** systems need to : ( 1 ) process language incrementally , token - by - token to be more responsive and enable handling of conversational phenomena such as pauses , restarts and self - corrections ; ( 2 ) reason incrementally allowing meaning to be established beyond what is said ; ( 3 ) be transparent and controllable , allowing designers as well as the system itself to easily establish reasons for particular behaviour and tailor to particular user groups , or domains . | ||
| 2021.nlp4convai-1.17 Query rewrite ( QR ) is an emerging component in *****conversational AI***** systems , reducing user defect . | ||
| 2020.sigdial-1.5 We will demonstrate a deployed *****conversational AI***** system that acts as a host of a smart - building on a university campus . | ||
| Semantic Web | 8 | |
| W19-9005 The paper shows how a multilingual hierarchical thesaurus , or taxonomy , can be created and implemented in compliance with *****Semantic Web***** requirements by means of the data model SKOS ( Simple Knowledge Organization System ) . | ||
| L06-1123 In this paper we perform a preliminary evaluation on how *****Semantic Web***** technologies such as RDF and OWL can be used to perform textual encoding . | ||
| L08-1178 With the appearance of *****Semantic Web***** technologies , it becomes possible to develop novel , sophisticated question answering systems , where ontologies are usually used as the core knowledge component . | ||
| I17-1032 With the advent of the Internet , the amount of *****Semantic Web***** documents that describe real - world entities and their inter - links as a set of statements have grown considerably . | ||
| L14-1185 Detecting and classifying named entities has traditionally been taken on by the natural language processing community , whilst linking of entities to external resources , such as those in DBpedia , has been tackled by the *****Semantic Web***** community . | ||
| knowledge bases ( KBs | 8 | |
| 2020.acl-main.669 Unsupervised relation extraction ( URE ) extracts relations between named entities from raw text without manually - labelled data and existing *****knowledge bases ( KBs***** ) . | ||
| P17-1021 With the rapid growth of *****knowledge bases ( KBs***** ) on the web , how to take full advantage of them becomes increasingly important . | ||
| D19-1263 Formal query generation aims to generate correct executable queries for question answering over *****knowledge bases ( KBs***** ) , given entity and relation linking results . | ||
| W19-5903 Dialogue systems are increasingly using *****knowledge bases ( KBs***** ) storing real - world facts to help generate quality responses . | ||
| N19-1299 When answering natural language questions over *****knowledge bases ( KBs***** ) , different question components and KB aspects play different roles . | ||
| Pre - trained word | 8 | |
| N19-1098 *****Pre - trained word***** vectors are ubiquitous in Natural Language Processing applications . | ||
| 2020.lrec-1.587 *****Pre - trained word***** embeddings are widely used in various fields . | ||
| D19-1276 *****Pre - trained word***** embeddings like ELMo and BERT contain rich syntactic and semantic information , resulting in state - of - the - art performance on various tasks . | ||
| D18-1311 *****Pre - trained word***** embeddings and language model have been shown useful in a lot of tasks . | ||
| W17-4119 *****Pre - trained word***** embeddings improve the performance of a neural model at the cost of increasing the model size . | ||
| End - to - end | 8 | |
| 2021.naacl-main.151 *****End - to - end***** approaches for sequence tasks are becoming increasingly popular . | ||
| 2020.findings-emnlp.302 *****End - to - end***** models in NLP rarely encode external world knowledge about length of time . | ||
| P17-1062 *****End - to - end***** learning of recurrent neural networks ( RNNs ) is an attractive solution for dialog systems ; however , current techniques are data - intensive and require thousands of dialogs to learn simple behaviors . | ||
| I17-1015 *****End - to - end***** training makes the neural machine translation ( NMT ) architecture simpler , yet elegant compared to traditional statistical machine translation ( SMT ) . | ||
| 2020.repl4nlp-1.11 *****End - to - end***** models trained on natural language inference ( NLI ) datasets show low generalization on out - of - distribution evaluation sets . | ||
| large - scale pre - trained language | 8 | |
| 2021.acl-long.11 Nowadays , open - domain dialogue models can generate acceptable responses according to the historical context based on the *****large - scale pre - trained language***** models . | ||
| 2020.findings-emnlp.165 Recently , *****large - scale pre - trained language***** models have demonstrated impressive performance on several commonsense - reasoning benchmark datasets . | ||
| 2021.wnut-1.49 We introduce BERTweetFR , the first *****large - scale pre - trained language***** model for French tweets . | ||
| 2021.naacl-main.162 Early exit mechanism aims to accelerate the inference speed of *****large - scale pre - trained language***** models . | ||
| 2021.emnlp-main.462 Despite their recent successes in tackling many NLP tasks , *****large - scale pre - trained language***** models do not perform as well in few - shot settings where only a handful of training examples are available . | ||
| Machine Reading Comprehension ( MRC | 8 | |
| 2021.acl-short.131 Existing models on *****Machine Reading Comprehension ( MRC***** ) require complex model architecture for effectively modeling long texts with paragraph representation and classification , thereby making inference computationally inefficient for production use . | ||
| D19-5807 Despite the remarkable progress on *****Machine Reading Comprehension ( MRC***** ) with the help of open - source datasets , recent studies indicate that most of the current MRC systems unfortunately suffer from weak robustness against adversarial samples . | ||
| D19-1600 *****Machine Reading Comprehension ( MRC***** ) has become enormously popular recently and has attracted a lot of attention . | ||
| 2020.lrec-1.660 *****Machine Reading Comprehension ( MRC***** ) is the task of answering a question over a paragraph of text . | ||
| 2020.aacl-srw.21 Models developed for *****Machine Reading Comprehension ( MRC***** ) are asked to predict an answer from a question and its related context . | ||
| large - scale training | 8 | |
| P19-1135 Distant supervision is widely used in relation classification in order to create *****large - scale training***** data by aligning a knowledge base with an unlabeled corpus . | ||
| D19-1169 Though the community has made great progress on Machine Reading Comprehension ( MRC ) task , most of the previous works are solving English - based MRC problems , and there are few efforts on other languages mainly due to the lack of *****large - scale training***** data . In this paper , we propose Cross - Lingual Machine Reading Comprehension ( CLMRC ) task for the languages other than English . | ||
| P19-1204 Currently , no *****large - scale training***** data is available for the task of scientific paper summarization . | ||
| 2012.amta-papers.3 Discriminative training for MT usually involves numerous features and requires *****large - scale training***** set to reach reliable parameter estimation . | ||
| 2021.naacl-main.310 Neural machine translation ( NMT ) models are data - driven and require *****large - scale training***** corpus . | ||
| pretrained language models ( PLMs | 8 | |
| 2021.naacl-main.186 Recent research investigates factual knowledge stored in large *****pretrained language models ( PLMs***** ) . | ||
| 2021.emnlp-main.555 To obtain high - quality sentence embeddings from *****pretrained language models ( PLMs***** ) , they must either be augmented with additional pretraining objectives or finetuned on a large set of labeled text pairs . | ||
| 2021.conll-1.44 Recent work indicated that *****pretrained language models ( PLMs***** ) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self - supervised techniques . | ||
| 2021.acl-long.469 In order to deeply understand the capability of pretrained language models in text generation and conduct a diagnostic evaluation , we propose TGEA , an error - annotated dataset with multiple benchmark tasks for text generation from *****pretrained language models ( PLMs***** ) . | ||
| 2020.conll-1.45 How can *****pretrained language models ( PLMs***** ) learn factual knowledge from the training set ? | ||
| Convolutional Neural Networks ( CNNs | 8 | |
| D19-1464 Due to their inherent capability in semantic alignment of aspects and their context words , attention mechanism and *****Convolutional Neural Networks ( CNNs***** ) are widely applied for aspect - based sentiment classification . | ||
| S17-2094 In this paper we describe our attempt at producing a state - of - the - art Twitter sentiment classifier using *****Convolutional Neural Networks ( CNNs***** ) and Long Short Term Memory ( LSTMs ) networks . | ||
| W17-2341 Token sequences are often used as the input for *****Convolutional Neural Networks ( CNNs***** ) in natural language processing . | ||
| W18-5408 We present an analysis into the inner workings of *****Convolutional Neural Networks ( CNNs***** ) for processing text . | ||
| D17-1201 *****Convolutional Neural Networks ( CNNs***** ) are widely used in NLP tasks . | ||
| Czech | 8 | |
| L14-1553 This paper presents the fully automatic linking of two valency lexicons of *****Czech***** verbs : VALLEX and PDT - VALLEX . | ||
| R17-1082 In the paper , we introduce two software applications for automatic evaluation of coherence in *****Czech***** texts called EVALD Evaluator of Discourse . | ||
| 2021.ranlp-1.128 In this paper , we aim at improving *****Czech***** sentiment with transformer - based models and their multilingual versions . | ||
| L14-1416 In the present paper , we describe the development of the lexical network DeriNet , which captures core word - formation relations on the set of around 266 thousand *****Czech***** lexemes . | ||
| L06-1058 EngValLex is the name of an FGD - compliant valency lexicon of English verbs , built from the PropBank - Lexicon and following the structure of Vallex , the FGD - based lexicon of *****Czech***** verbs . | ||
| named - entity | 8 | |
| K19-1016 We propose a method called reverse mapping bytepair encoding , which maps *****named - entity***** information and other word - level linguistic features back to subwords during the encoding procedure of bytepair encoding ( BPE ) . | ||
| W17-2310 We introduce an end - to - end system capable of *****named - entity***** detection , normalization and relation extraction for extracting information about bacteria and their habitats from biomedical literature . | ||
| P19-1014 We study a variant of domain adaptation for *****named - entity***** recognition where multiple , heterogeneously tagged training sets are available . | ||
| P17-1095 Lexical resources such as dictionaries and gazetteers are often used as auxiliary data for tasks such as part - of - speech induction and *****named - entity***** recognition . | ||
| W19-3714 We report on the participation of the JRC Text Mining and Analysis Competence Centre ( TMA - CC ) in the BSNLP-2019 Shared Task , which focuses on *****named - entity***** recognition , lemmatisation and cross - lingual linking . | ||
| E - | 8 | |
| 2020.lrec-1.321 Automatic short answer grading is a significant problem in *****E -***** assessment . | ||
| L16-1106 We present *****E -***** TIPSY , a search query corpus annotated with named Entities , Term Importance , POS tags , and SYntactic parses . | ||
| S18-1053 Task 1 in the International Workshop SemEval 2018 , Affect in Tweets , introduces five subtasks ( El - reg , El - oc , V - reg , V - oc , and *****E -***** c ) to detect the intensity of emotions in English , Arabic , and Spanish tweets . | ||
| 2020.coling-demos.16 *****E -***** mail is a communication tool widely used by people of all ages on the Internet today , often in business and formal situations , especially in Japan . | ||
| 2020.emnlp-main.313 As the *****E -***** commerce thrives , high - quality online advertising copywriting has attracted more and more attention . | ||
| relative | 8 | |
| 2020.lrec-1.519 We propose a new functional definition and construction method for core vocabulary sets for multiple applications based on the *****relative***** coverage of a target concept in thousands of bilingual dictionaries . | ||
| 1963.earlymt-1.20 The paper will investigate a few major construction types in several related European languages : *****relative***** clauses , attributive phrases , and certain instances of coordinate conjunction involving these constructions . | ||
| 2020.acl-main.363 In Ordinal Classification tasks , items have to be assigned to classes that have a *****relative***** ordering , such as positive , neutral , negative in sentiment analysis . | ||
| L16-1709 In particular , we discuss the generalizability of the Syntactic Atlas of Italy , a linguistic project that builds on a long standing tradition of collecting and analyzing linguistic corpora , on a more recent project that focuses on the synchronic and diachronic analysis of the syntax of Italian and Portuguese *****relative***** clauses . | ||
| P18-3022 This paper presents a system that automatically generates multiple , natural language questions using *****relative***** pronouns and relative adverbs from complex English sentences . | ||
| chemical | 8 | |
| W19-5035 Chemical patents are an important resource for *****chemical***** information . | ||
| D19-5701 One of the biomedical entity types of relevance for medicine or biosciences are *****chemical***** compounds and drugs . | ||
| 2020.aacl-main.19 By predicting chemical compound structures from their names , we can better comprehend *****chemical***** compounds written in text and identify the same chemical compound given different notations for database creation . | ||
| 2021.triton-1.9 The domain - specialised application of Named Entity Recognition ( NER ) is known as Biomedical NER ( BioNER ) , which aims to identify and classify biomedical concepts that are of interest to researchers , such as genes , proteins , *****chemical***** compounds , drugs , mutations , diseases , and so on . | ||
| U19-1014 Extracting *****chemical***** reactions from patents is a crucial task for chemists working on chemical exploration . | ||
| Frequently | 7 | |
| W19-5049 Recognizing textual inference relations and question similarity can address the issue of answering new consumer health questions by mapping them to ***** Frequently ***** Asked Questions on reputed websites like the NIH. | ||
| W19-5910 ***** Frequently *****, dialogue state tracking systems assume independence between slot values within a frame. | ||
| 2021.sigdial-1.44 Automated ***** Frequently ***** Asked Question (FAQ) retrieval provides an effective procedure to provide prompt responses to natural language based queries, providing an efficient platform for large-scale service-providing companies for presenting readily available information pertaining to customers' questions. | ||
| 2012.amta-government.6 ***** Frequently ***** the link may be a person or place, so something as simple as a mistranslated name will cause a search to miss relevant documents. | ||
| L14-1069 ***** Frequently *****, the problem is due to machines not being able to recognise the many implicit relationships between office artefacts, and also due to them not being aware of the context surrounding them. | ||
| experimenting | 7 | |
| 2021.iwpt-1.17 This year we have focused on ***** experimenting ***** with new ideas on a limited time budget. | ||
| W17-4101 We present a case study on Czech, a morphologically-rich language, ***** experimenting ***** with different input and output representations. | ||
| D19-3006 This paper introduces a novel orchestration framework, called CFO (Computation Flow Orchestrator), for building, ***** experimenting ***** with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. | ||
| 2021.bea-1.16 In this paper, we compare the performance of previous literature with Transformer models ***** experimenting ***** on a public and a private dataset. | ||
| D19-5215 We employ Transformer architecture ***** experimenting ***** with multilingual models and methods for low-resource languages. | ||
| Authorship attribution | 7 | |
| 2021.eval4nlp-1.18 ***** Authorship attribution ***** is the task of assigning an unknown document to an author from a set of candidates. | ||
| C18-1238 ***** Authorship attribution ***** typically uses all information representing both content and style whereas attribution based only on stylistic aspects may be robust in cross-domain settings. | ||
| 2020.acl-main.203 ***** Authorship attribution ***** aims to identify the author of a text based on the stylometric analysis. | ||
| W17-2401 ***** Authorship attribution ***** is a natural language processing task that has been widely studied, often by considering small order statistics. | ||
| E17-1107 ***** Authorship attribution ***** is associated with important applications in forensics and humanities research. | ||
| interpret | 7 | |
| W18-2315 Disease phrase matching, i.e., deciding whether two given disease phrases ***** interpret ***** each other, is a basic but crucial preprocessing step for the above tasks. | ||
| 2021.acl-demo.10 Our tool also provides additional useful information including explanations, to help the regulatory staff ***** interpret ***** the prediction results, and similar past cases as well as non-compliance to regulations, to support the decision making. | ||
| L12-1131 This may suffice to explain much of the process by which speakers ***** interpret ***** the IS of utterances in discourse. | ||
| L10-1012 Contrary to other approaches, we directly ***** interpret ***** the original entities as time slices in order to (i) avoid a duplication of the original ontology and (ii) to prevent a knowledge engineer from ontology rewriting. | ||
| N18-2110 More importantly, we next ***** interpret ***** what these neural models have learned about the linguistic characteristics of AD patients, via analysis based on activation clustering and first-derivative saliency techniques. | ||
| Linear SVM | 7 | |
| 2020.semeval-1.294 We use a ***** Linear SVM ***** with document vectors computed from pre-trained word embeddings, and we explore the effectiveness of lexical, part of speech, dependency, and named entity (NE) features. | ||
| S17-2100 The first space is a bag-of-words model and has a ***** Linear SVM ***** as base classifier. | ||
| 2021.ltedi-1.15 We use TF-IDF character n-grams and pretrained MuRIL embeddings for text representation and Logistic Regression and ***** Linear SVM ***** for classification. | ||
| S19-2177 Our submitted model is a ***** Linear SVM ***** that solely relies on the negative sentiment of a document. | ||
| W19-6122 Of these, a ***** Linear SVM ***** predicts stance best, with 0.76 accuracy / 0.42 macro F1. | ||
| Particular | 7 | |
| L08-1081 ***** Particular ***** emphasis is devoted to the level of naturalness of interaction. | ||
| L04-1263 ***** Particular ***** reference is given to some of the practical implications of acquiring appropriate data in under-developed communities. | ||
| L14-1300 ***** Particular ***** attention is drawn on the use of NLP deep semantic methods to help in data processing. | ||
| L10-1574 ***** Particular ***** uses of PNs with sense extension are focussed on and inspected taking into account the presence of PNs in lexical semantic databases and electronic corpora. | ||
| L04-1268 ***** Particular ***** attention is paid to the validation of the produced lexica and the lessons learnt during pre-validation | ||
| ideally | 7 | |
| 2021.emnlp-main.844 The lexical substitution task aims at generating a list of suitable replacements for a target word in context, ***** ideally ***** keeping the meaning of the modified text unchanged. | ||
| 2021.naacl-main.248 These sub-questions pertain to lower level visual concepts in the image that models ***** ideally ***** should understand to be able to answer the reasoning question correctly. | ||
| L16-1402 Researchers in Natural Language Processing rely on availability of data and software, ***** ideally ***** under open licenses, but little is done to actively encourage it. | ||
| W19-3817 Neural machine learning systems perform far from ***** ideally ***** in this task, reaching as low as 73% F1 scores on modern benchmark datasets. | ||
| E17-1107 A crucial point in this field is to quantify the personal style of writing, ***** ideally ***** in a way that is not affected by changes in topic or genre | ||
| Approximately | 7 | |
| L10-1217 ***** Approximately ***** seven percent of the automatic transcription was manually corrected. | ||
| L10-1160 ***** Approximately ***** 30 minutes of speech material per speaker and per session was recorded. | ||
| 2020.calcs-1.7 ***** Approximately ***** 11 hours of untranscribed multilingual speech was transcribed automatically using four bilingual CS transcription systems operating in English-isiZulu, English-isiXhosa, English-Setswana and English-Sesotho. | ||
| 2020.lrec-1.877 ***** Approximately ***** 90% of such items are judged well-formed, surpassing the rate of manually-produced items. | ||
| 1999.mtsummit-1.84 ***** Approximately ***** half of these sentences are evaluated and the results are given | ||
| utilization | 7 | |
| C18-1032 The exponential increase in the usage of Wikipedia as a key source of scientific knowledge among the researchers is making it absolutely necessary to metamorphose this knowledge repository into an integral and self-contained source of information for direct ***** utilization *****. | ||
| 2020.emnlp-main.605 The framework reveals insights about differences in face act ***** utilization ***** between asymmetric roles in persuasion conversations. | ||
| L06-1427 Efficiency of pure manual extraction procedure is significantly improved by ***** utilization ***** of automatic statistical methods based lexical association measures. | ||
| D18-1355 The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature (2) through analysis of the rule ***** utilization *****, the model seeks to select more accurate simplification rules. | ||
| P19-1535 For the sake of rational knowledge ***** utilization ***** and coherent conversation flow, a dialogue strategy which controls knowledge selection is instantiated and continuously adapted via reinforcement learning | ||
| enriching | 7 | |
| W19-9006 Current and upcoming activities are directed at: 1/ ***** enriching ***** the English corpus of didactic materials on EU history and culture, 2/ translating the texts into (the) other official EU languages and aligning the translations with the English texts; 3/ developing new test modules. | ||
| 2021.acl-long.437 Conditional Variational AutoEncoder (CVAE) effectively increases the diversity and informativeness of responses in open-ended dialogue generation tasks through ***** enriching ***** the context vector with sampled latent variables. | ||
| E17-2099 A discriminative classifier can overcome these problems, in particular when ***** enriching ***** standard lexical features with features geared towards verbal inflection. | ||
| 2021.law-1.9 In this paper, we present a first attempt at ***** enriching ***** German Universal Dependencies (UD) treebanks with enhanced dependencies. | ||
| 2020.lrec-1.776 We believe that further investigations and processing, as well as the application of novel algorithms and methods, can strengthen ***** enriching ***** computerized understanding and processing of low resource languages | ||
| merge | 7 | |
| 2020.textgraphs-1.3 In this paper, we present GraphNEMR, a graph-based model that uses graph convolutional networks to jointly ***** merge ***** text segments and recognize named entities. | ||
| P19-1585 Unlike previous work, our ***** merge ***** and label approach predicts real-valued instead of discrete segmentation structures, which allow it to combine word and nested entity embeddings while maintaining differentiability. | ||
| D17-1226 Our event coreference approach alternates between WD and CD clustering and combines arguments from both event clusters after every ***** merge *****, continuing till no more ***** merge ***** can be made. | ||
| L12-1160 The tool is the result of a ***** merge ***** of automatic error detection and classification of Hjerson (Popović, 2011) and Addicter (Zeman et al., 2011) into the pipeline and web visualization of Addicter. | ||
| 2002.amta-papers.7 The auxiliary lexical file (ALF) has to be revised before a ***** merge ***** into the core lexicons | ||
| revising | 7 | |
| W19-8606 We broaden this focus to include the earlier ***** revising ***** stage, where sentences require adjustment to the information included or major rewriting and propose Sentence-level Revision (SentRev) as a new writing assistance task. | ||
| L16-1264 The strategies for upgrading TANL to the use of Universal Dependencies range from a minimalistic approach consisting of introducing pre/post-processing steps into the native pipeline to ***** revising ***** the whole pipeline. | ||
| 2021.sigdial-1.41 Based on these results, we suggest (a) using the term “small talk” instead of “open-domain” for the current chatbots which are not that “open” in terms of conversational abilities yet, and (b) ***** revising ***** the evaluation methods to test the chatbot conversations against other speech events. | ||
| I17-2074 The model is trained using logs of the revisions made by professional editors ***** revising ***** draft newspaper articles written by journalists. | ||
| C18-1277 In this work, we propose to conduct pattern extraction and entity linking first, and put forward pattern ***** revising ***** procedure to mitigate the error propagation problem | ||
| feasibility | 7 | |
| N19-1016 It employs multiple context-free grammars and incorporates many refinements to achieve ***** feasibility *****. | ||
| L12-1506 We present the first results that indicate ***** feasibility ***** and development time improvements for creating a medium to large coverage precision grammar. | ||
| L14-1312 We can confirm the general ***** feasibility ***** of the approach by reporting satisfactory values between 0.694 and 0.755 in inter-annotator agreement using Krippendorff's $\alpha$. | ||
| 1993.eamt-1.3 We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some ***** feasibility ***** studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries. | ||
| S18-1121 Still, the best observed accuracy (0.712) underlines the principle ***** feasibility ***** of identifying warrants | ||
| CF | 7 | |
| L10-1120 We present the result of comparing words that were estimated both by our proposed system (VNACD) and three baseline systems (VACD, NACD, and ***** CF *****). | ||
| N19-1212 Collaborative filtering (***** CF *****) is a core technique for recommender systems. | ||
| W89-0241 While LFG can make special use of its ***** CF ***** backbone, the algorithm employed is not restricted to grammars having a ***** CF ***** backbone and is equally suited to complex-feature-based formalisms. | ||
| 2020.emnlp-main.394 In practice, staged multi-domain pre-training presents performance deterioration in the form of catastrophic forgetting (***** CF *****) when evaluated on a generic benchmark such as GLUE. | ||
| 2021.eacl-main.304 FEs convey a communicative function (***** CF *****), i.e. `showing the aim of the paper' in the above-mentioned example | ||
| summarising | 7 | |
| 2020.lrec-1.242 Automatic extraction of the reports' intervention content, population, settings and their results etc. are essential in synthesising and ***** summarising ***** the literature. | ||
| 2020.fnp-1.1 This paper presents the results and findings of the Financial Narrative Summarisation shared task (FNS 2020) on ***** summarising ***** UK annual reports. | ||
| N18-6002 Each text production task raises a slightly different communication goal (e.g., how to take the dialogue context into account when producing a dialogue turn; how to detect and merge relevant information when ***** summarising ***** a text; or how to produce a well-formed text that correctly captures the information contained in some input data in the case of data-to-text generation). | ||
| W19-5016 Knowledge base construction is crucial for ***** summarising *****, understanding and inferring relationships between biomedical entities. | ||
| L12-1592 We conclude by ***** summarising ***** the first edition of the challenge and by giving an outlook to future work | ||
| enhancement | 7 | |
| 2020.coling-main.14 First, the aspect ***** enhancement ***** module in METNet improves the representation learning of the aspect with contextual semantic features, which gives the aspect more abundant information. | ||
| 2021.starsem-1.8 In this setting, InferBert succeeds to learn general inference patterns, from a relatively small number of training instances, while not hurting performance on the original NLI data and substantially outperforming prior knowledge ***** enhancement ***** models on the challenge data. | ||
| C18-1236 Finally, we report on applications that consider both the process perspective and its ***** enhancement ***** through NLP. | ||
| N18-4006 Variations include word embeddings trained using context windows from Stanford and Universal dependencies at several levels of ***** enhancement ***** (ranging from unlabeled, to Enhanced++ dependencies). | ||
| C16-2017 Anita's simplification module features a state-of-the-art system that adapts texts according to the needs of individual users, and its ***** enhancement ***** module allows the user to search for a word's definitions, synonyms, translations, and visual cues through related images | ||
| Sentence simplification | 7 | |
| 2020.emnlp-main.415 ***** Sentence simplification ***** aims to make sentences easier to read and understand. | ||
| C18-1039 ***** Sentence simplification ***** aims to improve readability and understandability, based on several operations such as splitting, deletion, and paraphrasing. | ||
| 2020.winlp-1.23 ***** Sentence simplification ***** aims to convert a complex sentence into its simpler form such that it is easily comprehensible. | ||
| N19-1317 ***** Sentence simplification ***** is the task of rewriting texts so they are easier to understand. | ||
| N18-2013 ***** Sentence simplification ***** aims to simplify the content and structure of complex sentences, and thus make them easier to interpret for human readers, and easier to process for downstream NLP applications | ||
| sketch | 7 | |
| L08-1385 We also ***** sketch ***** how we developed a new integrated tool that allows the session recordings of the EI data to be analyzed with a widely-used automatic speech recognition (ASR) engine. | ||
| Q19-1012 While NPI has been commonly trained with the ``gold'' program or its ***** sketch *****, for realistic KBQA applications such gold programs are expensive to obtain. | ||
| 2020.findings-emnlp.357 When a new client registers with its ***** sketch *****, it gets immediate accuracy benefits. | ||
| P18-2023 After delving into Chinese lexical knowledge, we ***** sketch ***** 68 implicit morphological relations and 28 explicit semantic relations. | ||
| L10-1069 Our next goal is to identify the great variety of toponyms (e.g. names of mountains and valleys, glaciers and rivers, trails and cabins) in this corpus, and we ***** sketch ***** how a large gazetteer of Swiss topographical names can be exploited for this purpose | ||
| verb semantics | 7 | |
| P17-1150 To enable human-robot communication and collaboration, previous works represent grounded ***** verb semantics ***** as the potential change of state to the physical world caused by these verbs. | ||
| 2020.coling-main.423 We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on ***** verb semantics *****. | ||
| 2017.lilt-15.1 More specifically, we will explore the effect that lexical factorization in ***** verb semantics ***** has on the suppression or expression of semantic features within the sentence. | ||
| L06-1354 In particular, after a brief description of the two resources, their different approaches to the ***** verb semantics ***** are described; an accurate comparison of a set of verbal entries belonging to the Speech Act semantic class is carried out, aiming at evaluating the possibilities and the advantages of a semiautomatic link. | ||
| 2000.amta-papers.15 The paper deals with the question whether representations of ***** verb semantics ***** formulated on the basis of a lexically and syntactically restricted domain (weather forecasts) can apply to other, less restricted textual domains | ||
| slots | 7 | |
| D19-1097 However, most existing models fail to fully utilize cooccurrence relations between ***** slots ***** and intents, which restricts their potential performance. | ||
| L12-1628 The relationship between ***** slots ***** is such that a chain like disambiguation process is possible. | ||
| 2021.naacl-main.63 In this paper, we propose a novel approach to model long-term slot context and to fully utilize the semantic correlation between ***** slots ***** and intents. | ||
| N18-2112 This, combined with an information sharing mechanism between ***** slots *****, increases the scalability to large domains. | ||
| 2020.acl-main.5 However, in multi-domain scenarios, ellipsis and reference are frequently adopted by users to express values that have been mentioned by ***** slots ***** from other domains | ||
| solved | 7 | |
| 2020.wac-1.7 Part of speech tagging is a fundamental NLP task often regarded as ***** solved ***** for high-resource languages such as English. | ||
| 2020.coling-main.141 Language identification is considered a ***** solved ***** task in many cases; however, in the case of very closely related languages, or in an unsupervised scenario (where the languages are not known in advance), performance is still poor. | ||
| W16-3923 The results of the participated systems shows that the task is far to be considered as a ***** solved ***** one and methods with stellar performance in normal texts need to be revised. | ||
| 2021.dash-1.8 While generally a ***** solved ***** problem for documents of sufficient length and languages with ample training data, the proliferation of microblogs and other social media has made it increasingly common to encounter use-cases that *don't* satisfy these conditions. | ||
| 2020.coling-main.579 LangID is largely treated as ***** solved ***** in the literature, with models reported that achieve over 90% average F1 on as many as 1,366 languages | ||
| profiling | 7 | |
| P19-1249 We further establish the state of the art's ***** profiling ***** performance by evaluating the winning approaches submitted to the PAN gender prediction tasks in a transfer learning experiment. | ||
| 2000.iwpt-1.19 This article presents an approach to grammar and system engineering, termed competence & performance ***** profiling *****, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. | ||
| 2020.lrec-1.883 In the second part of the paper, we demonstrate the effectiveness of these features in a number of theoretical and applicative studies in which they were successfully used for text and author ***** profiling *****. | ||
| 2020.restup-1.1 Although we already started addressing the problem of detecting hate speech when targets are immigrants or women at the HatEval shared task in SemEval-2019, and when targets are women also in the Automatic Misogyny Identification tasks at IberEval-2018, Evalita-2018 and Evalita-2020, it was not done from an author ***** profiling ***** perspective. | ||
| 2020.acl-main.308 Predicting the political bias and the factuality of reporting of entire news outlets are critical elements of media ***** profiling *****, which is an understudied but an increasingly important research direction | ||
| averaging | 7 | |
| I17-2024 Using averaged word embeddings is a simple way to leverage unlabelled corpora to build text representations but this approach can be prone to noise either coming from the embeddings themselves or the ***** averaging ***** procedure. | ||
| 2020.emnlp-main.360 We show that the pre-tokenization of MWEs as single tokens performs better than ***** averaging ***** the embeddings of the individual tokens of the MWE. | ||
| 2021.wmt-1.40 In addition, we use the Transformer, a robust translation model, as our baseline and integrate several techniques, ***** averaging ***** checkpoints, model ensemble, and re-ranking. | ||
| 2021.acl-short.53 While ***** averaging ***** is the most commonly used efficient sentence encoder, Discrete Cosine Transform (DCT) was recently proposed as an alternative that captures the underlying syntactic characteristics of a given text without compromising practical efficiency compared to ***** averaging *****. | ||
| 2020.findings-emnlp.188 Second, we demonstrate that ***** averaging ***** the background knowledge of multiple, potentially biased annotators or corpora greatly improves summaryscoring performance | ||
| masked | 7 | |
| 2020.emnlp-main.403 BERT set many state-of-the-art results over varied NLU benchmarks by pre-training over two tasks: ***** masked ***** language modelling (MLM) and next sentence prediction (NSP), the latter of which has been highly criticized. | ||
| 2021.semeval-1.83 The results indicate that information from ***** masked ***** language models and character-level encoders can be combined to improve lexical complexity prediction. | ||
| 2021.wmt-1.99 Starting from an XLM-R checkpoint, we perform continued training by modifying the learning objective, switching from ***** masked ***** language modeling to QE oriented signals, before finetuning and ensembling the models. | ||
| 2020.blackboxnlp-1.13 We explore the imprint of two specific linguistic alternations, namely passivization and negation, on the representations generated by neural models trained with two different objectives: ***** masked ***** language modeling and translation. | ||
| 2021.emnlp-main.573 Prior work has shown that inserting an intermediate pre-training stage, using heuristic masking policies for ***** masked ***** language modeling (MLM), can significantly improve final performance | ||
| transformer architectures | 7 | |
| 2021.nuse-1.8 The best human-crafted stories exhibit coherent plot, strong characters, and adherence to genres, attributes that current states-of-the-art still struggle to produce, even using ***** transformer architectures *****. | ||
| 2021.wat-1.6 The RNN-based encoder-decoder model with attention mechanism and ***** transformer architectures ***** have been carried out for our experiment. | ||
| 2021.ranlp-1.142 We evaluate the approach on a toxic dataset of the Portuguese language, outperforming several graph-based methods and achieving competitive results compared to ***** transformer architectures *****. | ||
| 2020.wmt-1.97 We implement our system with model ensemble technique on different ***** transformer architectures ***** (Deep, Hybrid, Big, Large Transformers). | ||
| 2020.findings-emnlp.239 We obtain our representations with a hierarchical encoder based on ***** transformer architectures *****, for which we extend two well-known pre-training objectives | ||
| uncommon | 7 | |
| W17-3005 A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat ***** uncommon ***** or non-blacklisted hate words. | ||
| P19-1349 Most approaches train attention models from a coarse-grained association between sentences and images, which tends to fail on small objects or ***** uncommon ***** concepts. | ||
| 2021.emnlp-main.586 Our study shows that existing models struggle with producing answers that are frequently updated or from ***** uncommon ***** locations. | ||
| 2021.woah-1.18 We use a simple word frequency divergence to identify ***** uncommon ***** words overrepresented in a given community, not as a proxy for harmful speech but as a linguistic signature of the community | ||
| 2021.blackboxnlp-1.43 An important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of ***** uncommon ***** senses. | ||
| multivariate | 7 | |
| W17-4201 We perform a ***** multivariate ***** analysis on a dataset manually annotated with news values and emotions, discovering interesting correlations among them. | ||
| L10-1625 The second pilot study goes one step in the direction of taking this complexity into account by demonstrating the potential of the enriched treebank for building a ***** multivariate ***** model of relative clause extraposition as a syntactic alternation. | ||
| L12-1152 Finally, ***** multivariate ***** linear and non-linear regression methods are applied for predicting the motion capture variables based on combinations of computer vision descriptors. | ||
| W19-3006 Machine learning techniques that harness the power of ***** multivariate ***** statistics and non-linear data analysis hold promise for modeling this heterogeneity, but many models require enormous datasets, which are unavailable for most psychiatric conditions (including ASD). | ||
| 2020.lrec-1.741 HMMs have been the one of the first models to be applied for sign recognition and have become the baseline models due to their success in modeling sequential and ***** multivariate ***** data | ||
| thresholds | 7 | |
| 2020.semeval-1.278 We have achieved an F1 score of only 56.86%, but after experimenting with various label assignment ***** thresholds ***** in the pre-processing steps, the F1 score improved to 64%. | ||
| L14-1024 Each extracted sentence pair is associated with a cross-lingual lexical similarity score based on which, several evaluations have been conducted to estimate the similarity ***** thresholds ***** which allow the extraction of the most useful data for training three-language pairs SMT systems. | ||
| 2020.acl-main.97 Then we design Co-attention Self-attention networks (CaSa) to make the selected evidence interact with claims, which is for 1) training DTE to determine the optimal decision ***** thresholds ***** and obtain more powerful evidence; and 2) utilizing the evidence to find the false parts in the claim. | ||
| L08-1156 Dynamically calculated ***** thresholds ***** are preferable over fixed similarity ***** thresholds ***** as fixed ***** thresholds ***** are inherently imprecise, that is, there is no similarity boundary beyond which any two strings always describe the same concept | ||
| 2020.lrec-1.626 Based on the realisation that hate speech is not a clear-cut category to begin with, appears to belong to a continuum of discriminatory discourse and is often realised through the use of indirect linguistic means, it is argued that annotation schemes for its detection should refrain from directly including the label `hate speech,' as different annotators might have different ***** thresholds ***** as to what constitutes hate speech and what not. | ||
| loosely | 7 | |
| 2020.acl-main.661 Over its three decade history, speech translation has experienced several shifts in its primary research themes; moving from ***** loosely ***** coupled cascades of speech recognition and machine translation, to exploring questions of tight coupling, and finally to end-to-end models that have recently attracted much attention. | ||
| L06-1369 To this end, we propose an algorithm that (a) extracts information from ***** loosely ***** labelled dependency structures that encode only basic and broadly accepted syntactic relations, namely Head/Dependent and the distinction of dependents into Argument vs. Adjunct, and (b) derives a possible set of word classes. | ||
| 2000.iwpt-1.26 Motivated by these concerns, [25] proposed a grammar partitioning and top-down parser composition mechanism for ***** loosely ***** restricted Context-Free Grammars (CFGs). | ||
| N19-1333 We employ an iterative decoding strategy that is tailored to the ***** loosely ***** supervised nature of our constructed corpora | ||
| 2020.fnp-1.23 This task focuses on summarizing annual financial reports which poses two main challenges as compared to typical news document summarization tasks: i) annual reports are lengthier (average length about 80 pages) as compared to typical news documents, and ii) annual reports are more ***** loosely ***** structured e.g. | ||
| intelligible | 7 | |
| 2010.amta-government.3 But Google's Chinese is far from ***** intelligible *****, especially at the sentence level, primarily because of serious problems with word order and sentence parsing. | ||
| L12-1181 The translation requests, collected through the popular translation portal http://reverso.net, provide a most variated sample of real-world machine translation (MT) usage, from complete sentences to units of one or two words, from well-formed to hardly ***** intelligible ***** texts, from technical documents to colloquial and slang snippets. | ||
| 2008.amta-govandcom.11 We show that although the language used in this type of legal text is complex and specialized, an SMT system can produce ***** intelligible ***** and useful translations, provided that the system can be trained on a vast amount of legal text. | ||
| L10-1559 Enabling semantic technologies and intelligent linking and search are a big step forward, but they still do not succeed in making the content of old rare books ***** intelligible ***** to the broad public or specialists in other domains or languages. | ||
| L06-1441 It turns out that the most ***** intelligible ***** system (diphone-based) is far from being the one which obtains the best mean opinion score | ||
| numeral | 7 | |
| 2021.deelio-1.14 We also propose methods to reflect not only the symbolic aspect but also the quantitative aspect of ***** numeral *****s in the training of language models, using a loss function that depends on the magnitudes of the ***** numeral *****s and a regression model for the masked ***** numeral ***** prediction task. | ||
| 2021.rocling-1.28 In light of this, the purpose of this research is to identify the linking between the target cashtag and the target ***** numeral ***** in financial tweets, which is more challenging than analyzing news and official documents. | ||
| P17-1039 Specifically, we define three main syntactic token types, namely time token, modifier, and ***** numeral *****, to group time-related regular expressions over tokens. | ||
| P19-1635 In this paper, we attempt to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a ***** numeral ***** at some specific position in a text description. | ||
| 2020.findings-emnlp.235 We then represent the embedding of a ***** numeral ***** as a weighted average of the prototype number embeddings | ||
| vanilla | 7 | |
| 2021.acl-long.474 When evaluated on the BBC extreme summarization task, two state-of-the-art models augmented with Focus Attention generate summaries that are closer to the target and more faithful to their input documents, outperforming their ***** vanilla ***** counterparts on ROUGE and multiple faithfulness measures. | ||
| D19-5314 Our experiments show improved results compared to ***** vanilla ***** word embeddings, retrofitting and concatenation techniques using the same information, on a variety of data-sets of word similarities. | ||
| 2021.emnlp-main.523 ReLA achieves translation performance comparable to several strong baselines, with training and decoding speed similar to that of the ***** vanilla ***** attention. | ||
| D18-1232 We first demonstrate that ***** vanilla ***** knowledge distillation applied to answer span prediction is effective for reading comprehension systems. | ||
| K19-1077 Without breaking the end to end architecture, DivCNN Seq2Seq achieves a higher level of comprehensiveness compared to ***** vanilla ***** models and strong baselines | ||
| sadness | 7 | |
| 2021.wassa-1.8 We introduce FEEL-IT, a novel benchmark corpus of Italian Twitter posts annotated with four basic emotions: anger, fear, joy, ***** sadness *****. | ||
| W18-3507 Quantitative evaluation with jointly trained network, augmented with linguistic features, reports best accuracies for emotion prediction; namely joy, ***** sadness *****, anger, and neutral emotion in text. | ||
| 2020.winlp-1.34 Conversely, when ***** sadness ***** is expressed with authority-vice, the tweet is more likely to be retweeted. | ||
| L14-1478 It is observed that emotions less easy to recognize are joy and disgust, whereas the most easy to detect are anger, ***** sadness ***** and the neutral state. | ||
| 2021.semeval-1.115 Therefore, we describe here an ensemble method to identify toxicity and classify the emotions expressed on a corpus of annotated posts published by Task 5 of SemEval 2021–our analysis shows that the majority of such posts express anger, ***** sadness ***** and fear | ||
| amplify | 7 | |
| 2020.acl-main.396 We find that context can both ***** amplify ***** or mitigate the perceived toxicity of posts. | ||
| 2020.argmining-1.9 Recent research has shown that machine learning models trained on respective data may not only adopt, but even ***** amplify ***** the bias. | ||
| 2021.blackboxnlp-1.16 We also study the self-reinforcing mechanism of text degeneration, explaining why the mistakes ***** amplify *****. | ||
| 2021.mrqa-1.9 Using our metrics, we find that open-domain QA models ***** amplify ***** biases more than their closed-domain counterparts and propose that biases in the retriever surface more readily due to greater freedom of choice. | ||
| 2021.eacl-main.188 Recent studies in the field of Machine Translation (MT) and Natural Language Processing (NLP) have shown that existing models ***** amplify ***** biases observed in the training data | ||
| orchestration | 7 | |
| 2020.ldl-1.8 The realization of this alignment ***** orchestration ***** effort has been performed through two main phases: we first described its API as an OpenAPI specification (a la API-first), which we then exploited to generate server stubs and compliant client libraries. | ||
| N18-3005 Crowdsourcing provides a scalable and inexpensive way of data collection but collecting high quality data efficiently requires thoughtful ***** orchestration ***** of the crowdsourcing jobs. | ||
| D19-3006 This paper introduces a novel ***** orchestration ***** framework, called CFO (Computation Flow Orchestrator), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. | ||
| 2020.lrec-1.284 The key contribution of this paper is a workflow manager that enables the flexible ***** orchestration ***** of workflows based on a portfolio of Natural Language Processing and Content Curation services as well as a Multilingual Legal Knowledge Graph that contains semantic information and meaningful references to legal documents. | ||
| J17-4005 We discuss how such ***** orchestration ***** choices affect the scope of MWE-aware systems | ||
| articulation | 7 | |
| L12-1398 The results of the analysis show that: i) the pronunciation variations found frequently in Korean children's speech are devoicing and changing of ***** articulation ***** place or/and manner; and ii) they largely correspond to those of general Korean learners' speech presented in previous studies, despite some differences. | ||
| W17-2318 Approximately 80% to 95% of patients with Amyotrophic Lateral Sclerosis (ALS) eventually develop speech impairments, such as defective ***** articulation *****, slow laborious speech and hypernasality. | ||
| 1999.mtsummit-1.96 A crucial aspect of the framework is a careful ***** articulation ***** of a software architecture, a linguistic architecture and an incremental development process of linguistic knowledge. | ||
| L14-1473 The HESITA database is the output of a project in the speech-processing field for European Portuguese held by an interdisciplinary group in intimate ***** articulation ***** between engineering tools and experience and the linguistic approach. | ||
| P19-2047 Examining sentiments in social media poses a challenge to natural language processing because of the intricacy and variability in the dialect ***** articulation *****, noisy terms in form of slang, abbreviation, acronym, emoticon, and spelling error coupled with the availability of real-time content | ||
| contrary | 7 | |
| 2020.conll-1.26 Furthermore, frequent messages tend to be longer than infrequent ones, a pattern ***** contrary ***** to the Zipf Law of Abbreviation (ZLA) observed in all natural languages. | ||
| W19-2905 These results strongly suggest that morphological processing tracks morphemes incrementally from left to right and parses them into hierarchical syntactic structures, ***** contrary ***** to “amorphous” and finite-state models of morphological processing. | ||
| 2021.naacl-main.403 We resolve this by showing, ***** contrary ***** to previous studies, that the representations do not occupy a narrow cone, but rather drift in common directions. | ||
| D19-1509 Counterfactual reasoning requires predicting how alternative events, ***** contrary ***** to what actually happened, might have resulted in different outcomes. | ||
| 2021.emnlp-main.17 In addition, ***** contrary ***** to what previous work has claimed, our auxiliary experiments suggest that relation prediction is contributory to named entity prediction in a non-negligible way | ||
| semantic verb | 7 | |
| 2020.coling-main.423 Starting from a shared sample of 825 English verbs, translated into Chinese, Japanese, Finnish, Polish, and Italian, we apply a two-phase annotation process which produces (i) ***** semantic verb ***** classes and (ii) fine-grained similarity scores for nearly 130 thousand verb pairs. | ||
| L16-1338 We propose a novel ***** semantic verb ***** relation scheme and design a multi-step annotation approach for scaling-up the annotations using crowdsourcing. | ||
| L06-1192 We describe a gold standard for ***** semantic verb ***** classes which is based on human associations to verbs. | ||
| L16-1425 We evaluate the quality of the resulting resource on a manually annotated sample of 1000 ***** semantic verb ***** relations. | ||
| L08-1222 The two corpora consist mainly of newspaper texts annotated at different levels of linguistic description: morphological (PoS and lemmas), syntactic (constituents and functions), and semantic (argument structures, thematic roles, ***** semantic verb ***** classes, named entities, and WordNet nominal senses) | ||
| recognisers | 7 | |
| L16-1533 The newly created named entity ***** recognisers ***** are evaluated, with F-scores of between 0.64 and 0.77, and error analysis is performed to identify possible avenues for improving the quality of the systems. | ||
| L14-1190 The project aims to collect, share and reuse audiovisual language resources from broadcasters and subtitling companies to develop large vocabulary continuous speech ***** recognisers ***** in specific domains and new languages, with the purpose of solving the automated subtitling needs of the media industry. | ||
| L06-1004 We present an overview of Regulus, an Open Source platform that supports corpus-based derivation of efficient domain-specific speech ***** recognisers ***** from general linguistically motivated unification grammars. | ||
| L14-1316 Currently available speech ***** recognisers ***** do not usually work well with elderly speech. | ||
| 2021.bucc-1.5 We propose a novel approach for rapid prototyping of named entity ***** recognisers ***** through the development of semi-automatically annotated datasets | ||
| IDF | 7 | |
| S17-2071 Further analysis showed that ***** IDF ***** is the most useful characteristic, whereas the count of words with which the given word has high NPMI has a negative effect on performance. | ||
| S17-2017 ***** IDF ***** weighting and Part-of-Speech tagging are applied on the examined sentences to support the identification of words that are highly descriptive in each sentence. | ||
| 2021.naacl-main.215 We represent a story block using the term frequencies (TF) of semantic frames in it, normalized by each frame's inverse document frequency (***** IDF *****). | ||
| W17-1303 ***** IDF ***** weighting and Part-of-Speech tagging are applied on the examined sentences to support the identification of words that are highly descriptive in each sentence. | ||
| 2021.ranlp-1.178 Based on the assumption that inverse document frequency (***** IDF *****) measures how important a word is, we further leverage the ***** IDF ***** weights in our embedding-level reconstructor | ||
| localized | 7 | |
| L14-1130 Analysis of the terms proves that, in general, in the normative terminology work in Latvia ***** localized ***** terms are coined according to these guidelines. | ||
| 2011.mtsummit-tutorials.4 In addition, software development cycles are short, forcing translation to start while the product is still undergoing changes, so that ***** localized ***** products can reach global markets in a timely fashion. | ||
| 2010.amta-commercial.9 This paper discusses how to measure the impact of online content ***** localized ***** by machine translation in meeting the business need of commercial users, i.e., reducing the volume of telephone calls to the Call Center (call deflection). | ||
| 2021.acl-long.145 Furthermore, we also propose an approach to use our probe to investigate ***** localized ***** linguistic information in the linguistic graphs using perturbation analysis. | ||
| 2021.emnlp-main.699 This finding emphasizes the importance of pretraining on closely related, ***** localized ***** languages to achieve more efficient learning and faster inference at very low-resource languages like Javanese and Sundanese | ||
| characteristics | 7 | |
| W16-4001 I start this talk by sketching some sample scenarios of Digital Humanities projects which involve various Humanities and Social Science disciplines, noting that the potential for a meaningful contribution to higher-level questions is highest when the employed language technological models are carefully tailored both (a) to ***** characteristics ***** of the given target corpus, and (b) to relevant analytical subtasks feeding the discipline-specific research questions. | ||
| 2020.tacl-1.14 We analyzed the acoustic-prosodic and linguistic ***** characteristics ***** of language trusted and mistrusted by raters and compared these to ***** characteristics ***** of actual truthful and deceptive language to understand how perception aligns with reality. | ||
| 1999.mtsummit-1.40 Some tools, such as the word counter, the repetition detector, the sentence length estimator and the sentence simplicity checker look at ***** characteristics ***** of the text itself. | ||
| 2003.mtsummit-semit.11 We formulate an original model for statistical machine translation (SMT) inspired by ***** characteristics ***** of the Arabic-English translation task. | ||
| 2021.eacl-main.118 These observations suggest that the poor calibration of many neural models may stem from ***** characteristics ***** of a specific subset of tasks rather than general ill-suitedness of such models for language generation | ||
| gloss | 7 | |
| 2020.coling-main.525 We also demonstrate the problem in current methods that rely on ***** gloss ***** supervision. | ||
| P19-1293 In this work we explore this intuition by breaking translation into a two step process: generating a rough ***** gloss ***** by means of a dictionary and then `translating' the resulting pseudo-translation, or `Translationese' into a fully fluent translation. | ||
| 2021.gwc-1.7 The feature set utilised lemma properties, ***** gloss ***** similarities, graph distances and polysemy patterns. | ||
| L12-1503 The corpus contains weather forecasts recorded from German public TV which are manually annotated using ***** gloss *****es distinguishing sign variants, and time boundaries have been marked on the sentence and the ***** gloss ***** level. | ||
| 2021.mtsummit-at4ssl.7 We approach ***** gloss ***** translation as a low-resource machine translation task and investigate two popular methods for improving translation quality: hyperparameter search and backtranslation | ||
| typographical | 7 | |
| 2020.eamt-1.23 We find that the post-editing effort for MT segments is only higher in two out of three language pairs, and that the number of segments with wrong terminology, omissions, and ***** typographical ***** problems is similar in HT. | ||
| L10-1136 Folksonomies are unsystematic, unsophisticated collections of keywords associated by social bookmarking users to web content and, despite their inconsistency problems (***** typographical ***** errors, spelling variations, use of space or punctuation as delimiters, same tag applied in different context, synonymy of concepts, etc.), their popularity is increasing among Web 2.0 application developers. | ||
| 2020.latechclfl-1.12 However, ***** typographical ***** conventions vary across languages, and as a result, almost all approaches to this problem have been monolingual. | ||
| 2020.lt4hala-1.15 Rule-based systems that work reasonably well for modern languages struggle with (the lack of) ***** typographical ***** conventions in 19th-century literature. | ||
| W19-8606 Studies on writing assistance, such as grammatical error correction (GEC), have mainly focused on sentence editing and proofreading, where surface-level issues such as ***** typographical ***** errors, spelling errors, or grammatical errors should be corrected | ||
| designing | 7 | |
| 2021.acl-short.64 Most existing work tackles ABSA in a discriminative manner, ***** designing ***** various task-specific classification networks for the prediction. | ||
| W18-4903 He specializes in broad-coverage semantic analysis: ***** designing ***** linguistic meaning representations, annotating them in corpora, and automating them with statistical natural language processing techniques. | ||
| 2021.emnlp-main.178 To investigate this, we carry out a study for improving multiple task-oriented dialogue downstream tasks through ***** designing ***** various tasks at the further pre-training phase. | ||
| 2021.emnlp-main.417 Paraphrase generation has benefited extensively from recent progress in the ***** designing ***** of training objectives and model architectures. | ||
| P19-1324 We investigate how an LSTM language model deals with lexical ambiguity in English, ***** designing ***** a method to probe its hidden representations for lexical and contextual information about words | ||
| measurable | 7 | |
| L10-1282 The documentation must be of a kind that it enables the user to compare different tools offering the same service, hence the descriptions must contain ***** measurable ***** values. | ||
| 2020.findings-emnlp.246 To better distinguish word pairs in a hypernym relation from other relations such as co-hypernyms, we also propose a new ***** measurable ***** function that takes into account both the difference in the generality of meaning and similarity of meaning between words. | ||
| L10-1264 These results reveal ***** measurable ***** cues contributing to word boundary location. | ||
| W19-1305 We find that the enhancement of the original lexicon led to ***** measurable ***** improvements in prediction accuracy for the selected NLP tasks. | ||
| W16-4108 The paper is concerned with syntactic complexity as ***** measurable ***** on the basis of the cognitive parser that incrementally builds up a syntactic representation to be used by the semantic component | ||
| Parallel Meaning | 7 | |
| 2020.udw-1.10 In an initial case study, we show promising results for converting English, German, Italian, and Dutch CCG derivations from the ***** Parallel Meaning ***** Bank into (unlabeled) UD-style dependency trees. | ||
| 2020.findings-emnlp.320 We perform monolingual as well as multilingual experiments on the ***** Parallel Meaning ***** Bank (Abzianidze et al., 2017). | ||
| W19-1203 We conduct experiments on the standard benchmark of the ***** Parallel Meaning ***** Bank (PMB 2.2). | ||
| 2021.cl-2.15 Experimental results on the ***** Parallel Meaning ***** Bank show that our proposal outperforms strong baselines by a wide margin and can be used to construct (silver-standard) meaning banks for 99 languages. | ||
| 2020.dmr-1.2 The resulting framework is similar to the meaning representations of Discourse Representation Theory employed in the ***** Parallel Meaning ***** Bank | ||
| verbose | 7 | |
| W19-8658 However, this mechanism has several drawbacks: (1) it assumes knowledge of LaTeX, (2) it is slow, since LaTeX is ***** verbose *****, and (3) it is error-prone, since LaTeX is a typographical language. | ||
| 2020.acl-srw.13 Most methods proposed for this task rely on labeled or paired corpora (containing pairs of ***** verbose ***** and compressed sentences), which is often expensive to collect. | ||
| 2020.acl-main.26 While online reviews of products and services become an important information source, it remains inefficient for potential consumers to exploit ***** verbose ***** reviews for fulfilling their information need. | ||
| K18-1040 In sentence compression, the task of shortening sentences while retaining the original meaning, models tend to be trained on large corpora containing pairs of ***** verbose ***** and compressed sentences. | ||
| 2020.findings-emnlp.66 Privacy policy documents are long and ***** verbose ***** | ||
| fused | 7 | |
| S18-2012 From experiments, we empirically demonstrate that the proposed ElBiS matching function outperforms the concatenation-based or heuristic-based matching functions on natural language inference and paraphrase identification, while maintaining the ***** fused ***** representation compact. | ||
| 2020.coling-main.93 For the shallow fusion part, we use crossmodal coattention mechanism to obtain bidirectional context information of each two modals to get the ***** fused ***** shallow representations. | ||
| W18-3924 The best performance was achieved by the ***** fused ***** system which combines four systems together, with F1 micro of 68.77%. | ||
| 2021.emnlp-main.329 The results on two prediction tasks show that our ***** fused ***** model with different data outperforms the state-of-the-art method without clinical notes, which illustrates the importance of our fusion method and the clinical note features. | ||
| D19-6502 We obtain consistent and statistically significant improvements in terms of BLEU and METEOR and we observe how the ***** fused ***** systems are able to handle synonyms to propose more adequate translations as well as help the system to disambiguate among several translation candidates for a word | ||
| reconstruct | 7 | |
| 2020.coling-main.391 In many domains, dialogue systems need to work collaboratively with users to successfully ***** reconstruct ***** the meaning the user had in mind. | ||
| L16-1712 In order to make the full corpora available to the public despite such restrictions, the stand-off format presented here allows anybody to locally ***** reconstruct ***** the full corpora with the least possible computational effort. | ||
| 2020.emnlp-main.534 To enable responses that are more meaningful and context-specific, we propose to improve generative dialogue systems from the scenario perspective, where both dialogue history and future conversation are taken into account to implicitly ***** reconstruct ***** the scenario knowledge. | ||
| L14-1511 We ***** reconstruct ***** the socio-semantic landscape of the domain by inferring a co-authorship and a semantic network from the analysis of the corpus. | ||
| 2020.acl-main.366 A GAE is employed to aggregate the two sets of features by learning a latent representation which can jointly ***** reconstruct ***** them | ||
| etymology | 7 | |
| 2020.lrec-1.397 We predict the ***** etymology ***** of a word across the full range of ***** etymology ***** types and languages in Wiktionary, showing improvements over a strong baseline. | ||
| 2020.lrec-1.327 Its complexity, however, is a barrier to most speakers, since it does not necessarily reflect the particular phonetic evolution in ZC, but favours ***** etymology ***** instead. | ||
| 2021.lchange-1.9 Additionally, we label the English cognates according to their ***** etymology *****, separating them into two groups: old borrowings and recent borrowings. | ||
| 2020.ldl-1.3 We present two extended examples, one taken from the Oxford English Dictionary, the other from a work on ***** etymology *****, to show how our approach can handle different kinds of temporal information often found in lexical resources | ||
| 2020.coling-main.413 We extend the Yawipa Wiktionary Parser (Wu and Yarowsky, 2020) to extract and normalize translations from ***** etymology ***** glosses, and morphological form-of relations, resulting in 300K unique translations and over 4 million instances of 168 annotated morphological relations. | ||
| Moses | 7 | |
| 2007.iwslt-1.8 First, we use two decoders namely the open source ***** Moses ***** and an in-home syntax-based decoder to generate N-best lists. | ||
| 2012.amta-tutorials.6 The tutorial will provide some high-level theoretical background in domain adaptation, it will discuss practical application cases, and finally show how the presented methods can be applied with two widely used software tools: ***** Moses ***** and IRSTLM. | ||
| 2009.iwslt-evaluation.17 Both systems are based on the ***** Moses ***** statistical machine translation toolkit, with added components to address the rich morphology of the source languages. | ||
| 2007.iwslt-1.27 Our system is built on the open-source phrase-based statistical machine translation software ***** Moses *****. | ||
| 2010.iwslt-evaluation.25 Similar to previous years, our submitted systems are based on the ***** Moses ***** statistical machine translation toolkit | ||
| subordinate | 7 | |
| 2016.lilt-14.2 In general, modal subordination is concerned with more than two modalities, where the modality in subsequent sentences is interpreted in a context `***** subordinate *****' to the one created by the first modal expression. | ||
| L08-1309 We extract hyponymy relation candidates (HRCs) from the hierarchical layouts in Wikipedia by regarding all ***** subordinate ***** items of an item x in the hierarchical layouts as x's hyponym candidates, while Sumida and Torisawa (2008) extracted only direct ***** subordinate ***** items of an item x as x's hyponym candidates. | ||
| 2020.lrec-1.650 Sentences, however, are more deeply structured even on this side of constituent and dependency structure: they can consist of a main sentence and several ***** subordinate ***** clauses as well as further segments (e.g. inserts in parentheses); they can even recursively embed whole sentences and then contain multiple sentence beginnings and ends | ||
| W89-0224 A parse in the connectionist network contains information about role assignment, prepositional attachment, relative clause structure, and ***** subordinate ***** clause structure. | ||
| L10-1456 Our paper presents the details of a pilot study in which we tagged portions of the American National Corpus (ANC) for idioms composed of verb-noun constructions, prepositional phrases, and ***** subordinate ***** clauses. | ||
| behavioural | 7 | |
| L16-1701 Annotating and predicting ***** behavioural ***** aspects in conversations is becoming critical in the conversational analytics industry. | ||
| 2020.sigdial-1.38 The differences in decision making between ***** behavioural ***** models of voice interfaces are hard to capture using existing measures for the absolute performance of such models. | ||
| L14-1015 We discuss the various criteria for deciding values for each ***** behavioural ***** attributes which define the roles. | ||
| 2020.onion-1.5 For computers, however, emotion recognition is a complex problem: Thoughts and feelings are the roots of many ***** behavioural ***** responses and they are deeply entangled with neurophysiological changes within humans | ||
| W19-0503 Distributional semantics models (DSMs) are known to produce excellent representations of word meaning, which correlate with a range of ***** behavioural ***** data. | ||
| tutoring | 7 | |
| 1993.iwpt-1.8 Efficiency is also a concern, as ***** tutoring ***** applications typically run on personal computers, with the parser sharing memory with other components of the system. | ||
| L12-1120 We developed a dialogue-based ***** tutoring ***** system for teaching English to Japanese students and plan to transfer the current software ***** tutoring ***** agent into an embodied robot in the hope that the robot will enrich conversation by allowing more natural interactions in small group learning situations. | ||
| C16-1188 The joint inference system also performs much better than the pipeline system in the context of labeling modes that highlight important pedagogical steps in ***** tutoring *****. | ||
| C16-1181 Being able to measure cooperation has applications in many areas from the analysis - manual, semi and fully automatic - of natural language interactions to human-like virtual personal assistants, ***** tutoring ***** agents, sophisticated dialogue systems, and role-playing virtual humans. | ||
| W16-4310 Detecting depression or personality traits, ***** tutoring ***** and student behaviour systems, or identifying cases of cyber-bulling are a few of the wide range of the applications, in which the automatic detection of emotion is a crucial element | ||
| enabling | 7 | |
| P18-1093 More specifically, we propose an attention-based neural model that looks in-between instead of across, ***** enabling ***** it to explicitly model contrast and incongruity. | ||
| 2020.ngt-1.3 We propose a novel procedure for training multiple Transformers with tied parameters which compresses multiple models into one ***** enabling ***** the dynamic choice of the number of encoder and decoder layers during decoding. | ||
| P18-1097 Fluency boosting learning generates fluency-boost sentence pairs during training, ***** enabling ***** the error correction model to learn how to improve a sentence's fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps until the sentence's fluency stops increasing. | ||
| L10-1479 Additionally, numerous work has been invested in order to keep the entry barrier with regards to extending the framework as low as possible, ***** enabling ***** developers to add additional functionality to the framework in as less time as possible. | ||
| P19-1606 Information need of humans is essentially multimodal in nature, ***** enabling ***** maximum exploitation of situated context | ||
| proportional | 7 | |
| 2020.acl-main.86 We show that a simple dynamic sampling strategy, selecting instances for training ***** proportional ***** to the multi-task model's current performance on a dataset relative to its single task performance, gives substantive gains over prior multi-task sampling strategies, mitigating the catastrophic forgetting that is common in multi-task learning. | ||
| Q18-1027 Our reinforcement learning model (Polite-RL) encourages politeness generation by assigning rewards ***** proportional ***** to the politeness classifier score of the sampled response. | ||
| D19-1072 LaSyn decouples direct dependence between successive latent variables, which allows its decoder to exhaustively search through the latent syntactic choices, while keeping decoding speed ***** proportional ***** to the size of the latent variable vocabulary. | ||
| 2021.starsem-1.16 We also present a comparison of task sampling methods and propose a competitive alternative to widespread ***** proportional ***** sampling strategies. | ||
| W17-2503 To compare two sentences we use a similarity matrix which has dimensions ***** proportional ***** to the size of the two sentences | ||
| nonparametric Bayesian | 7 | |
| Q17-1013 This paper presents a novel hybrid generative/discriminative model of word segmentation based on ***** nonparametric Bayesian ***** methods. | ||
| Q13-1021 This paper explores the use of Adaptor Grammars, a ***** nonparametric Bayesian ***** modelling framework, for minimally supervised morphological segmentation. | ||
| Q13-1007 We introduce a novel ***** nonparametric Bayesian ***** model for the induction of Combinatory Categorial Grammars from POS-tagged text. | ||
| D17-1192 To address the challenge, this work presents a novel ***** nonparametric Bayesian ***** formulation for the task. | ||
| P19-1645 Experiments show that the unconditional model learns predictive distributions better than character LSTM models, discovers words competitively with ***** nonparametric Bayesian ***** word segmentation models, and that modeling language conditional on visual context improves performance on both | ||
| weaknesses | 7 | |
| L12-1160 Also, the summary of errors can be displayed to provide an overall view of the MT system's ***** weaknesses *****. | ||
| L04-1244 Also, writing recommendations are used in technical contexts together with machine translation (MT) in order to circumvent the MT system's ***** weaknesses *****. | ||
| N19-1129 This includes a pilot study on paper ***** weaknesses ***** given by reviewers and on quality of author responses. | ||
| 2018.iwslt-1.9 Also, we analyze if and how the two methodologies can complement each other's ***** weaknesses *****. | ||
| L10-1076 Finally, we present the results of the evaluation in terms of the strengths, ***** weaknesses ***** and improvements identified for each of these features | ||
| pathology | 7 | |
| C18-1302 We use a broad coverage, linguistically precise English Resource Grammar (ERG) to detect negation scope in sentences taken from ***** pathology ***** reports. | ||
| 2020.lrec-1.209 Atypical speech productions, regardless of their origins (accents, learning, ***** pathology *****), need to be assessed with regard to “typical” or “expected” productions. | ||
| S17-2176 Clinical TempEval 2017 addressed the problem of temporal reasoning in the clinical domain by providing annotated clinical notes, ***** pathology ***** and radiology reports in line with Clinical TempEval challenges 2015/16, across two different evaluation phases focusing on cross domain adaptation. | ||
| L14-1221 Causality lies at the heart of biomedical knowledge, being involved in diagnosis, ***** pathology ***** or systems biology. | ||
| R17-1100 In this article we present a system that extracts information from ***** pathology ***** reports | ||
| incidental | 7 | |
| 2021.eacl-main.53 Therefore, we propose a new model, JEANS , which jointly represents multilingual KGs and text corpora in a shared embedding scheme, and seeks to improve entity alignment with ***** incidental ***** supervision signals from text. | ||
| 2021.emnlp-main.134 Real-world applications often require improved models by leveraging a range of cheap ***** incidental ***** supervision signals. | ||
| D19-1578 However, most language data reflect the public discourse at the time the data was produced, and hence NLP models are susceptible to learning ***** incidental ***** associations around named referents at a particular point in time, in addition to general linguistic meaning. | ||
| S18-1010 We present KOI (Knowledge of Incidents), a system that, given news articles as input, builds a knowledge graph (KOI-KG) of ***** incidental ***** events. | ||
| 2020.findings-emnlp.272 Many datasets have been shown to contain ***** incidental ***** correlations created by idiosyncrasies in the data collection process. | ||
| antonym | 7 | |
| W16-5007 We take a first step towards creating a negation dictionary by annotating all direct ***** antonym ***** pairs inWordNet using an existing typology of affixal negations. | ||
| L12-1390 We used synonym and ***** antonym ***** relations to expand the initial seed lexicon. | ||
| 2021.inlg-1.6 We address the task of ***** antonym ***** prediction in a context, which is a fill-in-the-blanks problem. | ||
| P17-1087 We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that ***** antonym ***** relationships between word pairs are represented by negative weights. | ||
| E17-2012 We show that both linear models and neural networks improve on this task when they have access to a vector representing the semantic domain of the input word, e.g. a centroid of temperature words when predicting the ***** antonym ***** of `cold' | ||
| dependent | 7 | |
| D19-1321 Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into ***** dependent ***** sentence generation sub-tasks. | ||
| W19-1103 While type-theoretic semantics for natural language based on ***** dependent ***** type theory has been developed by many authors, how to assign semantic representations to interrogative sentences has been a non-trivial problem. | ||
| 2020.coling-main.109 We investigate whether Bert contains information on the selectional preferences of words, by examining the probability it assigns to the ***** dependent ***** word given the presence of a head word in a sentence. | ||
| 2020.cmcl-1.1 Additionally, dependency distance were found to be longer when the ***** dependent ***** was animate, when it was case-marked and when it was semantically similar to other preverbal ***** dependent *****s. | ||
| 2021.eval4nlp-1.13 To address this challenge, we present a metric-in*****dependent***** evaluation pipeline MIPE that significantly improves the correlation between evaluation metrics and human judgments on the generated code-mixed text | ||
| measurement | 7 | |
| 2021.gebnlp-1.2 Most studies in this field aim at ***** measurement ***** and debiasing methods with English as the target language. | ||
| W18-0512 Such models could have important implications for CALL systems of the future that effectively combine dialog management with ***** measurement ***** of learner conversational ability in real-time. | ||
| 2021.emnlp-main.785 However, bias scores based on these measures can suffer from ***** measurement ***** error. | ||
| 2020.signlang-1.11 The methodology has the potential to be used also as a variation ***** measurement ***** tool to quantify the difference in signing between different signers or sign languages in general | ||
| 1999.mtsummit-1.83 In this paper, I present a method for the evaluation of the quality of translated text, namely, a translation ability index, which shows the relative position of the translation ability of a Machine Translation (MT) system on a ***** measurement ***** scale. | ||
| scarce | 7 | |
| 2020.findings-emnlp.235 Existing word embedding methods do not learn numeral embeddings well because there are an infinite number of numerals and their individual appearances in training corpora are highly ***** scarce *****. | ||
| 2020.loresmt-1.12 We describe the available ***** scarce ***** parallel data suitable for training a neural machine translation model for Sorani Kurdish-English translation. | ||
| L16-1348 Pivot-based MT makes it possible to build dictionaries for language pairs that have ***** scarce ***** parallel data. | ||
| R19-1092 Finally, we regularise the loss of these models to better adapt to ***** scarce ***** data. | ||
| 2014.amta-workshop.2 For instance, the heterogeneity and the ***** scarce ***** availability of training data might contribute to significantly raise the bar | ||
| irrelevant | 7 | |
| 2012.amta-papers.12 MT errors also caused relevant sentences to appear ***** irrelevant ***** – 5-19% of sentences were relevant in human translation, but were judged ***** irrelevant ***** in MT. | ||
| 2020.emnlp-main.189 However, they sometimes result in predicting the correct answer text but in a context ***** irrelevant ***** to the given question. | ||
| D17-1097 In this paper, we make a simple observation that questions about images often contain premises – objects and relationships implied by the question – and that reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to ***** irrelevant ***** or previously unseen questions. | ||
| D19-1301 The proposed contrastive attention mechanism accommodates two categories of attention: one is the conventional attention that attends to relevant parts of the source sentence, the other is the opponent attention that attends to ***** irrelevant ***** or less relevant parts of the source sentence. | ||
| 2021.acl-long.437 However, due to the inherent one-to-many and many-to-one phenomena in human dialogues, the sampled latent variables may not correctly reflect the contexts' semantics, leading to ***** irrelevant ***** and incoherent generated responses | ||
| BLUE | 7 | |
| 2021.ecnlp-1.21 English-Hindi, English-German, English-French, and English-Czech show the ***** BLUE ***** scores of 35.09, 28.91, 34.68 and 14.52 which are the improvements of 1.61, 1.05, 1.63 and 1.94, respectively, over the baseline. | ||
| W19-5006 Inspired by the success of the General Language Understanding Evaluation benchmark, we introduce the Biomedical Language Understanding Evaluation (***** BLUE *****) benchmark to facilitate research in the development of pre-training language representations in the biomedicine domain. | ||
| 2021.bionlp-1.16 We evaluate our model on the BLURB and ***** BLUE ***** biomedical NLP benchmarks. | ||
| L14-1201 Although Google's overall performance was better in the translation task (we have also calculated the ***** BLUE ***** and NIST scores), there are some error types that Moses was better at coping with, specially discourse level errors. | ||
| 2020.nlp4convai-1.2 Lastly, the proposed SSS framework gives an improvement of 7.95% on ***** BLUE ***** score over the baseline | ||
| aggregate | 7 | |
| D19-1245 However, the paragraph-paragraph relevance, which may ***** aggregate ***** the evidence among relevant paragraphs, can also be utilized to discover more useful paragraphs. | ||
| L08-1133 However, this approach has so far only been evaluated using automatic translation quality metrics, which are important, but ***** aggregate ***** many different factors. | ||
| 2021.emnlp-main.76 Slow emerging topic detection is a task between event detection, where we ***** aggregate ***** behaviors of different words on short period of time, and language evolution, where we monitor their long term evolution. | ||
| N18-1194 Most SLU components treat each utterance independently, and then the following components ***** aggregate ***** the multi-turn information in the separate phases. | ||
| L12-1030 In this paper we report on an extension to the keystroke logging program Inputlog in which we ***** aggregate ***** the logged process data from the keystroke (character) level to the word level | ||
| amortized variational | 7 | |
| 2020.findings-emnlp.325 Further, since we use ***** amortized variational ***** inference to train our model, we introduce two corresponding types of inference network for predicting the posterior on anchor words. | ||
| N19-1114 Since directly marginalizing over the space of latent trees is intractable, we instead apply ***** amortized variational ***** inference. | ||
| 2020.acl-main.316 Variational autoencoders (VAEs) combine latent variables with ***** amortized variational ***** inference, whose optimization usually converges into a trivial local optimum termed posterior collapse, especially in text modeling. | ||
| 2021.emnlp-main.743 For joint training, we use ***** amortized variational ***** inference and policy gradient methods. | ||
| 2020.acl-main.235 Variational Autoencoder (VAE) is widely used as a generative model to approximate a model's posterior on latent variables by combining the ***** amortized variational ***** inference and deep neural networks | ||
| chitchat | 7 | |
| 2021.sigdial-1.1 Neural generative dialogue agents have shown an increasing ability to hold short ***** chitchat ***** conversations, when evaluated by crowdworkers in controlled settings. | ||
| 2020.eval4nlp-1.5 On existing datasets for ***** chitchat ***** dialogue and open-ended sentence generation, we find that – on average – the quality estimation from a BLEU Neighbors model has a lower mean squared error and higher Spearman correlation with the ground truth than individual human annotators. | ||
| 2020.lrec-1.672 Indeed, while data-driven chatbots are typically user-friendly but not goal-oriented, QA systems tend to perform poorly at ***** chitchat *****. | ||
| 2020.sltu-1.50 The former consists of ELIZA style responses, ***** chitchat ***** expressions, and a dataset of general dialog, all of which are reusable across counselling domains. | ||
| 2020.acl-main.634 As bland and generic utterances usually dominate the frequency distribution in our daily ***** chitchat *****, avoiding them to generate more interesting responses requires complex data filtering, sampling techniques or modifying the training objective | ||
| representation models | 7 | |
| 2021.naacl-srw.3 The availability of multilingual pre-trained general ***** representation models ***** makes it possible to experiment with negation detection in languages that lack annotated data. | ||
| 2020.acl-main.422 This paper investigates contextual word ***** representation models ***** from the lens of similarity analysis. | ||
| W18-5446 We evaluate baselines that use ELMo (Peters et al., 2018), a powerful transfer learning technique, as well as state-of-the-art sentence ***** representation models *****. | ||
| W17-0811 We also present baseline scores for word ***** representation models ***** using state-of-the-art techniques for Urdu, Telugu and Marathi by evaluating them on newly created word similarity datasets. | ||
| 2020.semeval-1.53 We experimented with various state-of-the-art language ***** representation models ***** (LRMs). | ||
| Community Question | 7 | |
| S17-2060 This paper describes the systems we submitted to Task 3 (***** Community Question ***** Answering) in SemEval 2017, which contains three subtasks on English corpora, i.e., subtask A: Question-Comment Similarity, subtask B: Question-Question Similarity, and subtask C: Question-External Comment Similarity. | ||
| S17-2003 We describe SemEval-2017 Task 3 on ***** Community Question ***** Answering. | ||
| S19-2198 Since the resources of ***** Community Question ***** Answering are abundant and information sharing becomes universal, it will be increasingly difficult to find factual information for questioners in massive messages. | ||
| S17-2047 We describe our system for participating in SemEval-2017 Task 3 on ***** Community Question ***** Answering. | ||
| S17-2009 In this paper we present the system for Answer Selection and Ranking in ***** Community Question ***** Answering, which we build as part of our participation in SemEval-2017 Task 3. | ||
| business | 7 | |
| 2020.coling-main.195 We describe the collection of transcription corrections and grammatical error annotations for the CrowdED Corpus of spoken English monologues on ***** business ***** topics. | ||
| 2021.emnlp-main.509 For many ***** business ***** applications, we often seek to analyze sentiments associated with any arbitrary aspects of commercial products, despite having a very limited amount of labels or even without any labels at all. | ||
| N18-3016 We address the problem of determining entity-oriented polarity in ***** business ***** news. | ||
| L12-1659 On the Linguistic Data Consortium's (LDC) 20th anniversary, this paper describes the changes to the language resource landscape over the past two decades, how LDC has adjusted its practice to adapt to them, and how the ***** business ***** model continues to grow. | ||
| 2021.acl-long.314 We present an annotation approach to capturing emotional and cognitive empathy in student-written peer reviews on ***** business ***** models in German. | ||
| Cross-lingual word | 7 | |
| P19-1489 ***** Cross-lingual word ***** embeddings encode the meaning of words from different languages into a shared low-dimensional space. | ||
| 2020.emnlp-main.482 ***** Cross-lingual word ***** embeddings transfer knowledge between languages: models trained on high-resource languages can predict in low-resource languages. | ||
| D18-1027 ***** Cross-lingual word ***** embeddings are becoming increasingly important in multilingual NLP. | ||
| 2020.lrec-1.330 ***** Cross-lingual word ***** embeddings create a shared space for embeddings in two languages, and enable knowledge to be transferred between languages for tasks such as bilingual lexicon induction. | ||
| 2021.starsem-1.29 ***** Cross-lingual word ***** embeddings provide a way for information to be transferred between languages. | ||
| Online Reviews and | 7 | |
| S19-2221 We present a system for cross-domain suggestion mining, prepared for the SemEval-2019 Task 9: Suggestion Mining from ***** Online Reviews and ***** Forums (Subtask B). | ||
| S19-2220 This paper describes our system, Joint Encoders for Stable Suggestion Inference (JESSI), for the SemEval 2019 Task 9: Suggestion Mining from ***** Online Reviews and ***** Forums. | ||
| S19-2152 This paper presents our system to the SemEval-2019 Task 9, Suggestion Mining from ***** Online Reviews and ***** Forums. | ||
| S19-2212 In this paper, we describe a suggestion mining system that participated in SemEval 2019 Task 9, SubTask A - Suggestion Mining from ***** Online Reviews and ***** Forums. | ||
| S19-2218 This paper describes the suggestion miner system that participates in SemEval 2019 Task 9 - SubTask A - Suggestion Mining from ***** Online Reviews and ***** Forums. | ||
| neural dialogue | 7 | |
| 2021.acl-long.57 In this paper, we propose the Inverse Adversarial Training (IAT) algorithm for training ***** neural dialogue ***** systems to avoid generic responses and model dialogue history better. | ||
| W19-8609 We investigate the impact of search strategies in ***** neural dialogue ***** modeling. | ||
| D19-1463 How to incorporate external knowledge into a ***** neural dialogue ***** model is critically important for dialogue systems to behave like real humans. | ||
| 2020.findings-emnlp.368 In this paper, we propose a meta-learning based semi-supervised explicit dialogue state tracker (SEDST) for ***** neural dialogue ***** generation, denoted as MEDST. | ||
| 2021.naacl-main.121 Knowledge is now starting to power ***** neural dialogue ***** agents. | ||
| Transformer-based language | 7 | |
| 2021.acl-long.70 ***** Transformer-based language ***** models benefit from conditioning on contexts of hundreds to thousands of previous tokens. | ||
| 2021.cl-2.14 ***** Transformer-based language ***** models have taken many fields in NLP by storm. | ||
| 2021.acl-long.289 ***** Transformer-based language ***** models pre-trained on large amounts of text data have proven remarkably successful in learning generic transferable linguistic representations. | ||
| 2020.coling-main.67 ***** Transformer-based language ***** models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. | ||
| 2021.emnlp-main.112 Supervised systems have nowadays become the standard recipe for Word Sense Disambiguation (WSD), with ***** Transformer-based language ***** models as their primary ingredient. | ||
| name | 7 | |
| N19-1383 We focus on improving ***** name ***** tagging for low-resource languages using annotations from related languages. | ||
| C16-1140 We propose an approach to Named Entity Disambiguation that avoids a problem of standard work on the task (likewise affecting fully supervised, weakly supervised, or distantly supervised machine learning techniques): the treatment of ***** name ***** mentions referring to people with no (or very little) coverage in the textual training data is systematically incorrect. | ||
| 2020.lrec-1.243 Most of the current cross-lingual transfer learning methods for Information Extraction (IE) have been only applied to ***** name ***** tagging. | ||
| W18-2319 We present a comprehensive evaluation framework including both intrinsic and extrinsic evaluation that can be expanded to named entities beyond drug ***** name *****. | ||
| L14-1310 Word Segmentation is usually considered an essential step for many Chinese and Japanese Natural Language Processing tasks, such as ***** name ***** tagging. | ||
| Automatic Speech | 7 | |
| 2020.sltu-1.43 ***** Automatic Speech ***** Recognition for low-resource languages has been an active field of research for more than a decade. | ||
| L14-1200 In this paper we present 3 applications in the domain of ***** Automatic Speech ***** Recognition for Dutch, all of which are developed using our in-house speech recognition toolkit SPRAAK. | ||
| 2020.parlaclarin-1.8 Challenges of Applying ***** Automatic Speech ***** Recognition for Transcribing EU Parliament Committee Meetings: A Pilot Study (Hugo de Vos and Suzan Verberne). We tested the feasibility of automatically transcribing committee meetings of the European Union parliament with the use of Automatic Speech Recognition techniques. | ||
| 2016.jeptalnrecital-invite.2 Thirty years ago, in order to get past roadblocks in Machine Translation and ***** Automatic Speech ***** Recognition, DARPA invented a new way to organize and manage technological R&D: a common task is defined by a formal quantitative evaluation metric and a body of shared training data, and researchers join an open competition to compare approaches. | ||
| L10-1075 This paper describes the Alborada-I3A corpus of disordered speech, acquired during the recent years for the research in different speech technologies for the handicapped like ***** Automatic Speech ***** Recognition or pronunciation assessment. | ||
| Offensive Language | 7 | |
| 2020.semeval-1.261 In this paper, we describe the participation of the IITP-AINLPML team in the SemEval-2020 Shared Task 12 on ***** Offensive Language ***** Identification and Target Categorization in English Twitter data. | ||
| 2021.dravidianlangtech-1.27 This paper introduces the system description of the HUB team participating in DravidianLangTech-EACL2021: ***** Offensive Language ***** Identification in Dravidian Languages. | ||
| S19-2134 We describe our system (TKaSt) submitted for Task 6: ***** Offensive Language ***** Classification, at SemEval 2019. | ||
| 2021.dravidianlangtech-1.25 This paper demonstrates our work for the shared task on ***** Offensive Language ***** Identification in Dravidian Languages - EACL 2021. | ||
| 2021.dravidianlangtech-1.20 This article introduces the system for the shared task of ***** Offensive Language ***** Identification in Dravidian Languages - EACL 2021. | ||
| Deep Learning | 7 | |
| P19-2041 Using pre-trained word embeddings in conjunction with ***** Deep Learning ***** models has become the de facto approach in Natural Language Processing (NLP). | ||
| 2021.deelio-1.11 This paper presents a way to inject and leverage existing knowledge from external sources in a ***** Deep Learning ***** environment, extending the recently proposed Recurrent Independent Mechanisms (RIMs) architecture, which comprises a set of interacting yet independent modules. | ||
| L16-1330 Lately, with the success of ***** Deep Learning ***** techniques in some computational linguistics tasks, many researchers want to explore new models for their linguistics applications. | ||
| C18-1278 We present a system for Answer Selection that integrates fine-grained Question Classification with a ***** Deep Learning ***** model designed for Answer Selection. | ||
| D19-5707 In this work, we introduce a ***** Deep Learning ***** architecture for pharmaceutical and chemical Named Entity Recognition in Spanish clinical cases texts. | ||
| Open-domain question | 7 | |
| N19-1030 ***** Open-domain question ***** answering remains a challenging task as it requires models that are capable of understanding questions and answers, collecting useful information, and reasoning over evidence. | ||
| 2021.emnlp-main.756 ***** Open-domain question ***** answering answers a question based on evidence retrieved from a large corpus. | ||
| 2021.acl-long.518 ***** Open-domain question ***** answering can be reformulated as a phrase retrieval problem, without the need for processing documents on-demand during inference (Seo et al., 2019). | ||
| 2020.emnlp-main.550 ***** Open-domain question ***** answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. | ||
| 2021.naacl-srw.9 ***** Open-domain question ***** answering aims at locating the answers to user-generated questions in massive collections of documents. | ||
| massive | 7 | |
| D18-1336 Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from ***** massive ***** parallelization at inference time. | ||
| K17-1012 Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from ***** massive ***** corpora. | ||
| W17-4301 Recent advances in GPU hardware have enabled neural networks to achieve significant gains over the previous best models; these models still fail to leverage GPUs' capability for ***** massive ***** parallelism due to their requirement of sequential processing of the sentence. | ||
| 2021.blackboxnlp-1.7 Large scale language models encode rich commonsense knowledge acquired through exposure to ***** massive ***** data during pre-training, but their understanding of entities and their semantic properties is unclear. | ||
| W17-2620 End-to-end training of automated speech recognition (ASR) systems requires ***** massive ***** data and compute resources. | ||
| Named Entity Recognition (NER) | 7 | |
| L16-1089 In this paper we explain how we created a labelled corpus in English for a ***** Named Entity Recognition (NER) ***** task from multi-source and multi-domain data, for an industrial partner. | ||
| W16-3927 Many of the existing ***** Named Entity Recognition (NER) ***** solutions are built based on news corpus data with proper syntax. | ||
| 2021.acl-long.558 Recent years have seen the paradigm shift of ***** Named Entity Recognition (NER) ***** systems from sequence labeling to span prediction. | ||
| 2021.bsnlp-1.10 This document describes our participation at the 3rd Shared Task on SlavNER, part of the 8th Balto-Slavic Natural Language Processing Workshop, where we focused exclusively on the ***** Named Entity Recognition (NER) ***** task. | ||
| 2021.emnlp-main.18 To alleviate label scarcity in the ***** Named Entity Recognition (NER) ***** task, distantly supervised NER methods are widely applied to automatically label data and identify entities. | ||
| Greek | 7 | |
| 2021.nllp-1.6 In this work, we study the task of classifying legal texts written in the ***** Greek ***** language. | ||
| 2020.loresmt-1.13 In this paper we present a new ensemble method, Continuous Bag-of-Skip-grams (CBOS), that produces high-quality word representations putting emphasis on the ***** Greek ***** language. | ||
| 1991.iwpt-1.5 These include technical notation such as subscripts, superscripts and numeric and algebraic expressions as well as ***** Greek ***** letters, italics, small capitals, brackets and punctuation marks. | ||
| 2021.eacl-srw.4 In this work, we present a methodology that aims at bridging the gap between high and low-resource languages in the context of Open Information Extraction, showcasing it on the ***** Greek ***** language. | ||
| 2020.semeval-1.262 This paper describes our participation in OffensEval challenges for English, Arabic, Danish, Turkish, and ***** Greek ***** languages. | ||
| Danish | 7 | |
| 2020.semeval-1.285 The OffensEval 2020 had three subtasks: A) Identifying the tweets to be offensive (OFF) or non-offensive (NOT) for Arabic, ***** Danish *****, English, Greek, and Turkish languages, B) Detecting if the offensive tweet is targeted (TIN) or untargeted (UNT) for the English language, and C) Categorizing the offensive targeted tweets into three classes, namely: individual (IND), Group (GRP), or Other (OTH) for the English language. | ||
| 2020.lrec-1.403 Although Denmark is one of the most digitized countries in Europe, no coordinated efforts have been made in recent years to support the ***** Danish ***** language with regard to language technology and artificial intelligence. | ||
| L08-1349 This paper presents a feasibility study of a merge between SprogTeknologisk Ordbase (STO), which contains morphological and syntactic information, and DanNet, which is a ***** Danish ***** WordNet containing semantic information in terms of synonym sets and semantic relations. | ||
| 2019.gwc-1.16 In this paper we describe the merge of the ***** Danish ***** wordnet, DanNet, with Princeton WordNet applying a two-step approach. | ||
| L12-1097 This paper discusses how information on properties in a currently developed ***** Danish ***** thesaurus can be transferred to the Danish wordnet, DanNet, and in this way enrich the wordnet with the highly relevant links between properties and their external arguments (i.e. | ||
| Hate speech | 7 | |
| N18-2019 ***** Hate speech ***** detection is a critical, yet challenging problem in Natural Language Processing (NLP). | ||
| 2021.nlp4if-1.3 ***** Hate speech ***** detection is an actively growing field of research with a variety of recently proposed approaches that allowed to push the state-of-the-art results. | ||
| W19-3516 ***** Hate speech ***** detectors must be applicable across a multitude of services and platforms, and there is hence a need for detection approaches that do not depend on any information specific to a given platform. | ||
| 2020.trac-1.7 ***** Hate speech ***** detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities. | ||
| 2020.semeval-1.292 ***** Hate speech ***** detection on social media platforms is crucial as it helps to avoid severe situations, and severe harm to marginalized people and groups. | ||
| neural NLP | 7 | |
| 2020.acl-main.267 Pooling is an important technique for learning text representations in many ***** neural NLP ***** models. | ||
| 2021.naacl-main.223 The input vocabulary and the representations learned are crucial to the performance of ***** neural NLP ***** models. | ||
| 2021.emnlp-main.251 Recent studies have shown that deep neural network-based models are vulnerable to intentionally crafted adversarial examples, and various methods have been proposed to defend against adversarial word-substitution attacks for ***** neural NLP ***** models. | ||
| D19-1235 Discourse parsing could not yet take full advantage of the ***** neural NLP ***** revolution, mostly due to the lack of annotated datasets. | ||
| 2020.findings-emnlp.24 Gradient-based analysis methods, such as saliency map visualizations and adversarial input perturbations, have found widespread use in interpreting ***** neural NLP ***** models due to their simplicity, flexibility, and most importantly, the fact that they directly reflect the model internals. | ||
| high | 7 | |
| 2021.emnlp-main.157 Accordingly, we propose a new computational task which is tuned to the available knowledge and interests in an Indigenous community, and which supports the construction of ***** high ***** quality texts and lexicons. | ||
| L08-1438 Although the World Wide Web has lately become an important source to consult for the meaning of words, a number of technical terms related to ***** high ***** technology are not found on the Web. | ||
| 2004.amta-papers.7 Spoken Translation, Inc. (STI) of Berkeley, CA has developed a commercial system for interactive speech-to-speech machine translation designed for both ***** high ***** accuracy and broad linguistic and topical coverage. | ||
| W18-6113 In text classification, the problem of overfitting arises due to the ***** high ***** dimensionality, making regularization essential. | ||
| 2021.bea-1.23 Using data from a large-scale medical licensing exam, clustering methods identified items that were similar with respect to their relative difficulty and relative response-time intensiveness to create low response process complexity and ***** high ***** response process complexity item classes. | ||
| Pre-trained | 7 | |
| 2020.findings-emnlp.264 ***** Pre-trained ***** models like BERT (Devlin et al., 2018) have dominated NLP/IR applications such as single sentence classification, text pair classification, and question answering. | ||
| 2020.emnlp-main.57 ***** Pre-trained ***** Transformers have enabled impressive breakthroughs in generating long and fluent text, yet their outputs are often rambling without coherently arranged content. | ||
| 2021.emnlp-main.119 ***** Pre-trained ***** LMs have shown impressive performance on downstream NLP tasks, but we have yet to establish a clear understanding of their sophistication when it comes to processing, retaining, and applying information presented in their input. | ||
| P19-1127 ***** Pre-trained ***** embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks. | ||
| 2020.emnlp-main.21 ***** Pre-trained ***** Transformers are now ubiquitous in natural language processing, but despite their high end-task performance, little is known empirically about whether they are calibrated. | ||
| resource-poor | 7 | |
| 2020.wildre-1.7 This paper presents the first dependency treebank for Bhojpuri, a ***** resource-poor ***** language that belongs to the Indo-Aryan language family. | ||
| 2018.gwc-1.54 WordNet or ontology development for ***** resource-poor ***** languages like Persian requires composition of several strategies and employment of appropriate heuristics. | ||
| C16-1044 This work focuses on the development of linguistic analysis tools for ***** resource-poor ***** languages. | ||
| 2020.lrec-1.475 Language-independent tokenisation (LIT) methods that do not require labelled language resources or lexicons have recently gained popularity because of their applicability in ***** resource-poor ***** languages. | ||
| C16-1047 In this paper, we propose a novel hybrid deep learning architecture which is highly efficient for sentiment analysis in ***** resource-poor ***** languages. | ||
| Detecting and Rating | 7 | |
| 2021.semeval-1.35 This paper describes our system that participated in Task 7 of SemEval-2021: ***** Detecting and Rating ***** Humor and Offense. | ||
| 2021.semeval-1.34 This paper introduces the result of Team Grenzlinie's experiment in SemEval-2021 Task 7: HaHackathon: ***** Detecting and Rating ***** Humor and Offense. | ||
| 2021.semeval-1.165 This article introduces the submission of subtask 1 and subtask 2 that we participate in SemEval-2021 Task 7: HaHackathon: ***** Detecting and Rating ***** Humor and Offense; we use a model based on ALBERT that uses ALBERT as the module for extracting text features. | ||
| 2021.semeval-1.154 This paper describes our contribution to SemEval-2021 Task 7: ***** Detecting and Rating ***** Humor and Offense. This task contains two sub-tasks, sub-task 1 and sub-task 2. | ||
| 2021.semeval-1.158 This paper describes the winning system for SemEval-2021 Task 7: ***** Detecting and Rating ***** Humor and Offense. | ||
| Recurrent Neural Networks (RNNs | 7 | |
| P19-1233 Current state-of-the-art systems for sequence labeling are typically based on the family of ***** Recurrent Neural Networks (RNNs *****). | ||
| W19-4815 The generalized Dyck language has been used to analyze the ability of ***** Recurrent Neural Networks (RNNs *****) to learn context-free grammars (CFGs). | ||
| P18-2117 While ***** Recurrent Neural Networks (RNNs *****) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. | ||
| 2020.conll-1.13 ***** Recurrent Neural Networks (RNNs *****) have been shown to capture various aspects of syntax from raw linguistic input. | ||
| P19-1385 In this paper, we present a novel approach for incorporating external knowledge in ***** Recurrent Neural Networks (RNNs *****). | ||
| Multimodal sentiment | 7 | |
| 2021.acl-long.412 ***** Multimodal sentiment ***** analysis is the challenging research area that attends to the fusion of multiple heterogeneous modalities. | ||
| P17-1081 ***** Multimodal sentiment ***** analysis is a developing area of research, which involves the identification of sentiments in videos. | ||
| 2020.ccl-1.101 ***** Multimodal sentiment ***** analysis aims to learn a joint representation of multiple features. | ||
| W18-3305 ***** Multimodal sentiment ***** classification in practical applications may have to rely on erroneous and imperfect views, namely (a) language transcription from a speech recognizer and (b) under-performing acoustic views. | ||
| 2021.emnlp-main.21 ***** Multimodal sentiment ***** analysis is a trending area of research, and multimodal fusion is one of its most active topics. | ||
| information retrieval (IR | 7 | |
| C16-1162 Ordinal regression, which is known as learning to rank, has long been used in ***** information retrieval (IR *****). | ||
| D19-1540 A core problem of ***** information retrieval (IR *****) is relevance matching, which is to rank documents by relevance to a user's query. | ||
| L10-1242 This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in ***** information retrieval (IR *****). | ||
| 2021.sdp-1.2 One of the challenges in ***** information retrieval (IR *****) is the vocabulary mismatch problem, which happens when the terms between queries and documents are lexically different but semantically similar. | ||
| I17-2036 This paper presents an initial study on hyperspherical query likelihood models (QLMs) for ***** information retrieval (IR *****). | ||
| Political | 7 | |
| P19-1463 *****Political***** debates offer a rare opportunity for citizens to compare the candidates' positions on the most controversial topics of the campaign. | ||
| W19-2112 *****Political***** discourse on social media microblogs, specifically Twitter, has become an undeniable part of mainstream U.S. politics. | ||
| 2021.eacl-main.309 *****Political***** discussions revolve around ideological conflicts that often split the audience into two opposing parties. | ||
| 2020.winlp-1.28 *****Political***** campaigns are full of political ads posted by candidates on social media. | ||
| 2021.calcs-1.1 *****Political***** discourse is one of the most interesting data sources for studying power relations in the framework of Critical Discourse Analysis. | ||
| maximum | 7 | |
| 2020.emnlp-main.448 Despite strong performance on a variety of tasks, neural sequence models trained with *****maximum***** likelihood have been shown to exhibit issues such as length bias and degenerate repetition. | ||
| 2021.spnlp-1.5 Despite its wide use, recent studies have revealed unexpected and undesirable properties of neural autoregressive sequence models trained with *****maximum***** likelihood, such as an unreasonably high affinity to short sequences after training and to infinitely long sequences at decoding time. | ||
| 1999.mtsummit-1.46 In this paper we describe a language recognition algorithm for multilingual documents that is based on mixed-order n-grams, Markov chains, *****maximum***** likelihood, and dynamic programming. | ||
| 2020.inlg-1.18 In language generation models conditioned on structured data, classical training via *****maximum***** likelihood almost always leads models to pick up on dataset divergences (i.e., hallucinations or omissions) and to incorporate them erroneously in their own generations at inference. | ||
| D17-1231 Previous work on dialog act (DA) classification has investigated different methods, such as hidden Markov models, *****maximum***** entropy, conditional random fields, graphical models, and support vector machines. | ||
| Multi-task learning (MTL | 7 | |
| 2020.bionlp-1.22 *****Multi-task learning (MTL*****) has achieved remarkable success in natural language processing applications. | ||
| I17-2010 *****Multi-task learning (MTL*****) has recently contributed to learning better representations in service of various NLP tasks. | ||
| N19-1249 *****Multi-task learning (MTL*****) has been studied recently for sequence labeling. | ||
| 2020.acl-main.268 *****Multi-task learning (MTL*****) and transfer learning (TL) are techniques to overcome the issue of data scarcity when training state-of-the-art neural networks. | ||
| N19-1355 *****Multi-task learning (MTL*****) has achieved success over a wide range of problems, where the goal is to improve the performance of a primary task using a set of relevant auxiliary tasks. | ||
| relation extraction (RE | 7 | |
| D19-1397 In recent years there has been a surge of interest in applying distant supervision (DS) to automatically generate training data for *****relation extraction (RE*****). | ||
| D19-1039 Distant supervision (DS) has been widely used to automatically construct (noisy) labeled data for *****relation extraction (RE*****). | ||
| C16-1139 Distant supervision is an efficient approach that automatically generates labeled data for *****relation extraction (RE*****). | ||
| W19-5004 Systematic comparison of methods for *****relation extraction (RE*****) is difficult because many experiments in the field are not described precisely enough to be completely reproducible and many papers fail to report ablation studies that would highlight the relative contributions of their various combined techniques. | ||
| 2021.emnlp-main.228 Most recent studies for *****relation extraction (RE*****) leverage the dependency tree of the input sentence to incorporate syntax-driven contextual information to improve model performance, with little attention paid to the limitation that high-quality dependency parsers are in most cases unavailable, especially for in-domain scenarios. | ||
| set | 7 | |
| K18-2023 We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations, and we utilize disambiguated, embedded, morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature *****set*****. | ||
| 2000.iwpt-1.25 This implementation combines three basic approaches: a single word tagger based on decision trees, a POS tagger based on variable memory Markov models, and a feature structures *****set***** of tags. | ||
| 2021.eacl-main.2 However, a naive model trained only using the targeted ('positive') document set may generate too generic questions that cover a larger scope than delineated by the document *****set*****. | ||
| S17-2090 In the parsing subtask, participants were asked to produce Abstract Meaning Representation (AMR) (Banarescu et al., 2013) graphs for a *****set***** of English sentences in the biomedical domain. | ||
| 2020.coling-main.45 Given a finite set of candidate authors and corresponding labeled texts, the objective is to determine which of the authors has written another *****set***** of anonymous or disputed texts. | ||
| Spoken Language | 7 | |
| 2021.eacl-main.248 Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for *****Spoken Language***** Translation. | ||
| 2020.emnlp-main.588 *****Spoken Language***** Understanding infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. | ||
| N19-2002 In this paper, we introduce an approach for leveraging available data across multiple locales sharing the same language to 1) improve domain classification model accuracy in *****Spoken Language***** Understanding and user experience even if new locales do not have sufficient data and 2) reduce the cost of scaling the domain classifier to a large number of locales. | ||
| 2010.iwslt-evaluation.10 This paper presents the submissions of the PRHLT group for the evaluation campaign of the International Workshop on *****Spoken Language***** Translation. | ||
| 2011.iwslt-papers.7 Punctuation prediction is an important task in *****Spoken Language***** Translation. | ||
| single | 7 | |
| W18-6301 Sequence to sequence learning models still require several days to reach state of the art performance on large benchmark datasets using a *****single***** machine. | ||
| W17-2711 By developing this dataset, we also introduce a new task, the StoryLine Extraction from news data, which aims at extracting and classifying events relevant for stories from across news documents spread in time and clustered around a *****single***** seminal event or topic. | ||
| 2021.naacl-main.417 Existing works in multimodal affective computing tasks, such as emotion recognition and personality recognition, generally adopt a two-phase pipeline by first extracting feature representations for each *****single***** modality with hand-crafted algorithms, and then performing end-to-end learning with the extracted features. | ||
| S18-1066 This paper presents our *****single***** model for Subtask 1 of SemEval 2018 Task 2: Emoji Prediction in English. | ||
| 2021.emnlp-main.171 Recently, the focus of dialogue state tracking has expanded from *****single***** domain to multiple domains. | ||
| SemEval 2019 Task | 7 | |
| S19-2206 The *****SemEval 2019 Task***** 8 on Fact-Checking in community question answering forums aimed to classify questions into categories and verify the correctness of answers given on the QatarLiving public forum. | ||
| S19-2208 This paper describes the participation of the DBMS-KU team in the *****SemEval 2019 Task***** 9, that is, suggestion mining from online reviews and forums. | ||
| S19-2068 This article describes the strategy submitted by the CiTIUS-COLE team to *****SemEval 2019 Task***** 5, a task which consists of binary classification where the system predicts whether a tweet in English or in Spanish is hateful against women or immigrants or not. | ||
| S19-2004 We describe our solutions for the semantic frame and role induction subtasks of *****SemEval 2019 Task***** 2. | ||
| S19-2183 This paper describes our system for the *****SemEval 2019 Task***** 4 on hyperpartisan news detection. | ||
| Product | 7 | |
| 2021.mtsummit-research.20 *****Product***** reviews provide valuable feedback from customers; however, they are available today only in English on most e-commerce platforms. | ||
| 2021.acl-long.29 *****Product***** reviews contain a large number of implicit aspects and implicit opinions. | ||
| 2020.ecomnlp-1.2 *****Product***** descriptions in e-commerce platforms contain detailed and valuable information about retailers' assortment. | ||
| 2020.ecnlp-1.9 *****Product***** reviews are a huge source of natural language data in e-commerce applications. | ||
| 2020.ecomnlp-1.7 *****Product***** matching, i.e., being able to infer the product being sold for a merchant-created offer, is crucial for any e-commerce marketplace, enabling product-based navigation, price comparisons, product reviews, etc. | ||
| web-based annotation | 7 | |
| L12-1385 In recent months, LDC has developed a *****web-based annotation***** infrastructure centered around a tree model of annotations and a Ruby on Rails application called the LDC User Interface (LUI). | ||
| L14-1636 We introduce GraPAT, a *****web-based annotation***** tool for building graph structures over text. | ||
| D19-3033 We introduce Redcoat, a *****web-based annotation***** tool that supports collaborative hierarchical entity typing. | ||
| W16-4011 We introduce the third major release of WebAnno, a generic *****web-based annotation***** tool for distributed teams. | ||
| 2020.lrec-1.854 This paper introduces TIARA, a new publicly available *****web-based annotation***** tool for discourse relations and sentence reordering. | ||
| healthcare | 7 | |
| 2020.wnut-1.55 The COVID-19 pandemic has become a trending topic on Twitter, and people are interested in sharing diverse information ranging from new cases and *****healthcare***** guidelines to medicine and vaccine news. | ||
| 2021.nlp4posimpact-1.16 Technologies for enhancing well-being, *****healthcare***** vigilance and monitoring are on the rise. | ||
| W18-4916 This paper presents a treebank for the *****healthcare***** domain developed at ezDI. | ||
| W19-3643 A common challenge in the healthcare industry today is that physicians have access to massive amounts of *****healthcare***** data but have little time and no appropriate tools. | ||
| 2020.lrec-1.578 Natural Language Processing (NLP) can help unlock the vast troves of unstructured data in clinical text and thus improve *****healthcare***** research. | ||
| E-commerce | 7 | |
| 2020.ecomnlp-1.9 *****E-commerce***** sites include advertising slogans along with information regarding an item. | ||
| W18-6530 *****E-commerce***** platforms present products using titles that summarize product information. | ||
| 2020.emnlp-main.188 Product-related question answering platforms nowadays are widely employed in many *****E-commerce***** sites, providing a convenient way for potential customers to address their concerns during online shopping. | ||
| 2021.ecnlp-1.19 *****E-commerce***** stores collect customer feedback to let sellers learn about customer concerns and enhance the customer order experience. | ||
| D19-1019 Customers ask questions and customer service staff answer their questions, which is the basic service model via multi-turn customer service (CS) dialogues on *****E-commerce***** platforms. | ||
| Grammatical error | 7 | |
| D18-1541 *****Grammatical error***** correction, like other machine learning tasks, greatly benefits from large quantities of high-quality training data, which is typically expensive to produce. | ||
| W19-4423 *****Grammatical error***** correction can be viewed as a low-resource sequence-to-sequence task, because publicly available parallel corpora are limited. To tackle this challenge, we first generate erroneous versions of large unannotated corpora using a realistic noising function. | ||
| I17-4012 *****Grammatical error***** diagnosis is an important task in natural language processing. | ||
| W16-4917 *****Grammatical error***** diagnosis is an essential part in a language-learning tutoring system. | ||
| W16-4907 *****Grammatical error***** diagnosis is an important task in natural language processing. | ||
| Distributed representations of | 7 | |
| N19-1331 *****Distributed representations of***** sentences have become ubiquitous in natural language processing tasks. | ||
| W19-5015 *****Distributed representations of***** text can be used as features when training a statistical classifier. | ||
| N18-1043 *****Distributed representations of***** words learned from text have proved to be successful in various natural language processing tasks in recent times. | ||
| C18-1216 *****Distributed representations of***** words play a major role in the field of natural language processing by encoding semantic and syntactic information of words. | ||
| D19-1450 *****Distributed representations of***** words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages. | ||
| Natural Language Generation (NLG) | 7 | |
| L16-1575 As data-driven approaches started to make their way into the *****Natural Language Generation (NLG)***** domain, the need for automation of corpus building and extension became apparent. | ||
| 2020.findings-emnlp.17 As a crucial component in task-oriented dialog systems, the *****Natural Language Generation (NLG)***** module converts a dialog act represented in a semantic form into a response in natural language. | ||
| W19-8648 Rating and Likert scales are widely used in evaluation experiments to measure the quality of *****Natural Language Generation (NLG)***** systems. | ||
| 2020.evalnlgeval-1.1 The evaluation of *****Natural Language Generation (NLG)***** systems has recently aroused much interest in the research community, since it should address several challenging aspects, such as readability of the generated texts, adequacy to the user within a particular context and moment, and linguistic quality-related issues (e.g., correctness, coherence, understandability), among others. | ||
| W19-8643 Currently, there is little agreement as to how *****Natural Language Generation (NLG)***** systems should be evaluated. | ||
| Fine-Grained Sentiment | 7 | |
| S17-2145 This paper describes the approach we used for SemEval-2017 Task 5: *****Fine-Grained Sentiment***** Analysis on Financial Microblogs. | ||
| S17-2147 We present the system developed by the team DUTH for its participation in SemEval-2017 Task 5, *****Fine-Grained Sentiment***** Analysis on Financial Microblogs and News, in subtasks A and B. | ||
| S17-2142 This paper describes a supervised solution for detecting the polarity scores of tweets or headline news in the financial domain, submitted to the SemEval 2017 *****Fine-Grained Sentiment***** Analysis on Financial Microblogs and News Task. | ||
| S17-2152 This paper describes our systems submitted to the *****Fine-Grained Sentiment***** Analysis on Financial Microblogs and News task (i.e., Task 5) in SemEval-2017. | ||
| S17-2151 This paper describes our submission to Task 5 of SemEval 2017, *****Fine-Grained Sentiment***** Analysis on Financial Microblogs and News, where we limit ourselves to performing sentiment analysis on news headlines only (track 2). | ||
| Statistical Machine | 7 | |
| 2014.iwslt-papers.9 *****Statistical Machine***** Translation produces results that make it a competitive option in most machine-assisted translation scenarios. | ||
| 2005.mtsummit-papers.37 In *****Statistical Machine***** Translation, the use of reordering for certain language pairs can produce a significant improvement on translation accuracy. | ||
| L12-1173 The task of *****Statistical Machine***** Translation depends on large amounts of training corpora. | ||
| C16-1299 We present a novel fusion model for domain adaptation in *****Statistical Machine***** Translation. | ||
| L12-1585 In *****Statistical Machine***** Translation, words that were not seen during training are unknown words, that is, words that the system will not know how to translate. | ||
| suicidal | 7 | |
| 2021.acl-short.133 Social media has become a valuable resource for the study of *****suicidal***** ideation and the assessment of suicide risk. | ||
| 2021.eacl-main.205 Recent psychological studies indicate that individuals exhibiting *****suicidal***** ideation increasingly turn to social media rather than mental health practitioners. | ||
| N19-3019 Suicide is a leading cause of death among youth, and the use of social media to detect *****suicidal***** ideation is an active line of research. | ||
| 2021.naacl-main.176 Recent psychological studies indicate that individuals exhibiting *****suicidal***** ideation increasingly turn to social media rather than mental health practitioners. | ||
| 2020.findings-emnlp.200 Using a suicide dictionary created by mental health experts is one of the effective ways to detect *****suicidal***** ideation. | ||
| aspect-based sentiment analysis (ABSA | 7 | |
| 2021.acl-short.63 Existing works for *****aspect-based sentiment analysis (ABSA*****) have adopted a unified approach, which allows the interactive relations among subtasks. | ||
| 2020.lrec-1.840 We show how the general fine-grained opinion mining concepts of opinion target and opinion expression are related to *****aspect-based sentiment analysis (ABSA*****) and discuss their benefits for resource creation over popular ABSA annotation schemes. | ||
| 2021.eacl-main.170 Target-oriented opinion words extraction (TOWE) is a subtask of *****aspect-based sentiment analysis (ABSA*****). | ||
| 2020.acl-main.293 The *****aspect-based sentiment analysis (ABSA*****) consists of two conceptual tasks, namely aspect extraction and aspect sentiment classification. | ||
| 2020.emnlp-main.572 The supervised models for *****aspect-based sentiment analysis (ABSA*****) rely heavily on labeled data. | ||
| Indonesian | 7 | |
| 2016.gwc-1.33 This paper describes our attempts to add *****Indonesian***** definitions to synsets in the Wordnet Bahasa (Nurril Hirfana Mohamed Noor et al., 2011; Bond et al., 2014), to extract semantic relations between lemmas and definitions for nouns and verbs, such as synonym, hyponym, hypernym and instance hypernym, and to generally improve Wordnet. | ||
| 2021.emnlp-main.833 We present IndoBERTweet, the first large-scale pretrained model for *****Indonesian***** Twitter that is trained by extending a monolingually-trained Indonesian BERT model with additive domain-specific vocabulary. | ||
| 2020.coling-main.66 Although the *****Indonesian***** language is spoken by almost 200 million people and is the 10th most spoken language in the world, it is under-represented in NLP research. | ||
| L16-1129 In this paper we report our effort to construct the first ever *****Indonesian***** corpora for chat summarization. | ||
| W16-5415 This paper describes our attempt to build a sentiment analysis system for *****Indonesian***** tweets. | ||
| Arabic dialect | 7 | |
| W19-4634 Our submission to the MADAR shared task on *****Arabic dialect***** identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings and auxiliary language models. | ||
| 2020.wanlp-1.10 *****Arabic dialect***** identification is a complex problem due to a number of inherent properties of the language itself. | ||
| D18-1135 Recently, string kernels have obtained state-of-the-art results in various text classification tasks such as *****Arabic dialect***** identification or native language identification. | ||
| W19-4629 *****Arabic dialect***** identification is an inherently complex problem, as Arabic dialect taxonomy is convoluted and aims to dissect a continuous space rather than a discrete one. | ||
| L14-1086 Recent computational work on *****Arabic dialect***** identification has focused primarily on building and annotating corpora written in Arabic script. | ||
| mixed | 7 | |
| D19-1574 While this linguistic information has shown great promise in pre-neural parsing, results for neural architectures have been *****mixed*****. | ||
| 2016.gwc-1.60 Writing intended to inform frequently contains references to document entities (DEs), a *****mixed***** class that includes orthographically structured items (e.g., illustrations, sections, lists) and discourse entities (arguments, suggestions, points). | ||
| 2021.ranlp-srw.3 Code-mixed language plays a very important role in communication in multilingual societies, and with the recent increase in internet users, especially in multilingual societies, the usage of such *****mixed***** language has also increased. | ||
| 2021.sigdial-1.11 Many existing chatbots do not effectively support *****mixed***** initiative, forcing their users to either respond passively or lead constantly. | ||
| W18-3204 Most NLP applications today are still designed with the assumption of a single interaction language and are most likely to break given a CM utterance with multiple languages *****mixed***** at a morphological, phrase or sentence level. | ||
| Part-of-Speech | 7 | |
| L06-1243 In this work we investigate new possibilities for improving the quality of statistical machine translation (SMT) by applying word reorderings of the source language sentences based on *****Part-of-Speech***** tags. | ||
| L12-1350 This paper evaluates the impact of external lexical resources on a CRF-based joint Multiword Segmenter and *****Part-of-Speech***** Tagger. | ||
| L08-1454 *****Part-of-Speech***** tagging is generally performed by Markov models, based on bigram or trigram models. | ||
| L08-1010 In our paper we present a methodology used for low-cost validation of the quality of *****Part-of-Speech***** annotation of the Prague Dependency Treebank, based on multiple re-annotation of data samples carefully selected with the help of several different Part-of-Speech taggers. | ||
| L06-1201 We used four *****Part-of-Speech***** taggers, which are available for research purposes and were originally trained on text, to tag a corpus of transcribed multiparty spoken dialogues. | ||
| Native Language Identification (NLI | 7 | |
| W18-0534 In this paper we present NLI-PT, the first Portuguese dataset compiled for *****Native Language Identification (NLI*****), the task of identifying an author's first language based on their second language writing. | ||
| W17-5026 We report on our experiments with N-gram and embedding based feature representations for *****Native Language Identification (NLI*****) as a part of the NLI Shared Task 2017 (team name: NLI-ISU). | ||
| L14-1051 *****Native Language Identification (NLI*****) is a task aimed at determining the native language (L1) of learners of a second language (L2) on the basis of their written texts. | ||
| J18-3003 Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of *****Native Language Identification (NLI*****), achieving the current state of the art. | ||
| W17-5023 Our team Uvic-NLP explored and evaluated a variety of lexical features for *****Native Language Identification (NLI*****) within the framework of ensemble methods. | ||
| Figurative Language | 7 | |
| 2020.figlang-1.12 In this paper, we present the results obtained by BERT, BiLSTM and SVM classifiers on the shared task on Sarcasm Detection held as part of The Second Workshop on *****Figurative Language***** Processing. | ||
| 2020.figlang-1.13 Sarcasm Detection with Context, a shared task of the Second Workshop on *****Figurative Language***** Processing (co-located with ACL 2020), is a study of the effect of context on sarcasm detection in conversations on social media. | ||
| 2020.figlang-1.36 We present an ensemble approach for the detection of sarcasm in Reddit and Twitter responses in the context of The Second Workshop on *****Figurative Language***** Processing held in conjunction with ACL 2020. | ||
| W18-0915 We present an algorithm for detecting metaphor in sentences which was used in the Shared Task on Metaphor Detection by the First Workshop on *****Figurative Language***** Processing. | ||
| 2020.figlang-1.35 The LSTM BiRNN system participated in the shared task of metaphor identification that was part of the Second Workshop on *****Figurative Language***** Processing (FigLang2020), held at the Annual Conference of the Association for Computational Linguistics (ACL2020). | ||
| reinforcement learning (RL | 7 | |
| P18-1203 Training a task-completion dialogue agent via *****reinforcement learning (RL*****) is costly because it requires many interactions with real users. | ||
| W18-6021 We present a general approach with *****reinforcement learning (RL*****) to approximate dynamic oracles for transition systems where exact dynamic oracles are difficult to derive. | ||
| P18-1165 We present a study on *****reinforcement learning (RL*****) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT). | ||
| 2021.emnlp-main.245 Text-Based Games (TBGs) have emerged as important testbeds for *****reinforcement learning (RL*****) in the natural language domain. | ||
| D17-1260 Hand-crafted rules and *****reinforcement learning (RL*****) are two popular choices to obtain dialogue policy. | ||
| co- | 7 | |
| P19-1575 The method is used to factor a complex neural model into its functional components, which are comprised of sets of *****co-*****firing neurons that cut across layers of the network architecture, and which we call neural pathways. | ||
| L14-1216 According to psychological learning theory, an important principle governing language acquisition is *****co-*****occurrence. | ||
| 2020.acl-main.739 This paper introduces two tasks: determining (a) the duration of possession relations and (b) *****co-*****possessions, i.e., whether multiple possessors possess a possessee at the same time. | ||
| K17-1016 Conventional word embeddings are trained with specific criteria (e.g., based on language modeling or *****co-*****occurrence) inside a single information source, disregarding the opportunity for further calibration using external knowledge. | ||
| L16-1722 ROOT9 is a supervised system for the classification of hypernyms, *****co-*****hyponyms and random words that is derived from the already introduced ROOT13 (Santus et al., 2016). | ||
| human-robot | 7 | |
| L12-1029 In this study, the use of alternative acoustic sensors in *****human-robot***** communication is investigated. | ||
| L04-1172 This paper deals with databases that combine different aspects: children's speech, emotional speech, *****human-robot***** communication, cross-linguistics, and read vs. spontaneous speech: in a Wizard-of-Oz scenario, German and English children had to instruct Sony's AIBO robot to fulfil specific tasks. | ||
| 2019.gwc-1.22 Within a larger frame of facilitating *****human-robot***** interaction, we present here the creation of a core vocabulary to be learned by a robot. | ||
| 2020.emnlp-main.266 Physical common sense plays an essential role in the cognition abilities of robots for *****human-robot***** interaction. | ||
| 2021.metanlp-1.1 Text-based games can be used to develop task-oriented text agents for accomplishing tasks with high-level language instructions, which has potential applications in domains such as *****human-robot***** interaction. | ||
| binarization | 6 | |
| 2020.sustainlp-1.4 In order to fully utilize the potential of end to end ***** binarization *****, both the input representations (vector embeddings of token statistics) and the classifier are binarized. | ||
| L10-1429 Conversion from a context free grammar treebank to a CCGbank is a four stage process: head finding, argument classification, ***** binarization *****, and category conversion. | ||
| 2021.acl-long.205 In this paper, we study the task of graph-based constituent parsing in the setting that ***** binarization ***** is not conducted as a pre-processing step, where a constituent tree may consist of nodes with more than two children. | ||
| 2020.lrec-1.128 We conclude that (1) the pre-trained context embedding provides effective solutions to deal with implicit semantics in Chinese texts, and (2) using multiway ground truth is helpful since different ***** binarization ***** approaches lead to significant differences in performance. | ||
| 2021.acl-long.334 In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight ***** binarization *****. | ||
| enhancing | 6 | |
| 2020.lt4gov-1.3 In this paper, we show the ***** enhancing ***** of the Demanded Skills Diagnosis (DiCoDe: Diagnóstico de Competencias Demandadas), a system developed by Mexico City's Ministry of Labor and Employment Promotion (STyFE: Secretaría de Trabajo y Fomento del Empleo de la Ciudad de México) that seeks to reduce information asymmetries between job seekers and employers. | ||
| P17-1030 It yields a principled Bayesian learning algorithm, adding gradient noise during training (***** enhancing ***** exploration of the model-parameter space) and model averaging when testing. | ||
| L12-1373 The WES base was used in a Question Answering system, ***** enhancing ***** significantly its performance. | ||
| P19-2023 Second, we advocate the use of Inverted Softmax (IS) and Cross-modal Local Scaling (CSLS) during inference to mitigate the so-called hubness problem in high-dimensional embedding space, ***** enhancing ***** scores of all metrics by a large margin. | ||
| W17-5006 High quality classroom discussion is important to student development, ***** enhancing ***** abilities to express claims, reason about other students' claims, and retain information for longer periods of time. | ||
| localizing | 6 | |
| 2020.findings-emnlp.242 Existing systems rely on global visual features that represent the entire image, but ***** localizing ***** the relevant regions of the image will make it possible to recover a larger set of words, such as adjectives and verbs. | ||
| 2021.blackboxnlp-1.6 Specifically, given a piece of adversarial text, we hope to accomplish tasks such as ***** localizing ***** perturbed tokens, identifying the attacker's access level to the target model, determining the evasion mechanism imposed, and specifying the perturbation type employed by the attacking algorithm. | ||
| D18-1168 We propose a new model that explicitly reasons about different temporal segments in a video, and shows that temporal context is important for ***** localizing ***** phrases which include temporal language. | ||
| 2020.acl-main.585 We propose a video span ***** localizing ***** network (VSLNet), on top of the standard span-based QA framework, to address NLVL. | ||
| 2000.amta-workshop.6 We will examine the difficulties in ***** localizing ***** a dynamic website and discuss the challenges we have overcome to create a dynamic translation platform | ||
| extensions | 6 | |
| L14-1321 The term advanced leveraging refers to ***** extensions ***** beyond the current usage of translation memory (TM) in computer-aided translation (CAT). | ||
| L10-1542 We give an overview of the project workflow in automating the markup process, and consider what ***** extensions ***** to existing markup schema will be required to best support working taxonomists. | ||
| 2016.gwc-1.58 I present initial work to establish such a metric, and propose ways to move forward by looking at ***** extensions ***** to WordNet. | ||
| W18-1408 We ground our insights in, and present our ***** extensions ***** to, an existing lexico-semantic resource, covering 500 semantic classes of verbs, of which 219 fall within a spatial subset. | ||
| W19-2601 Experimental results not only show the feasibility of the framework on the biomedical dataset, but also indicate the effectiveness of our ***** extensions *****, because our extended model achieves significant and consistent improvements on distant supervised RE as compared with baselines | ||
| guessing | 6 | |
| W18-5015 We evaluate the proposed framework and the state adaptation technique in an image ***** guessing ***** game and achieve promising results. | ||
| D19-1014 This paper proposes a novel framework that alternatively trains a RL policy for image ***** guessing ***** and a supervised seq2seq model to improve dialog generation quality. | ||
| 2021.eacl-main.178 When training a model on referential dialogue ***** guessing ***** games, the best model is usually chosen based on its task success. | ||
| N18-5011 After a brief introduction to structural ambiguity, users are challenged to complete a sentence in a way that tricks the computer into ***** guessing ***** an incorrect interpretation. | ||
| 2020.alvr-1.4 With respect to the second metric, humans make questions at the end of the dialogue that are referring, confirming their guess before ***** guessing ***** | ||
| reconstructing | 6 | |
| 2020.textgraphs-1.13 Workshop organized a shared task on `Explanation Regeneration' that required ***** reconstructing ***** gold explanations for elementary science questions. | ||
| 2020.acl-main.510 In particular, the cross-lingual encoder of our model learns a shared representation, which is effective for both ***** reconstructing ***** input sentences of two languages and generating more representative views from the input for classification. | ||
| 2021.naacl-main.134 Given a bag of words from a disordered sentence, humans may still be able to understand what those words mean by reordering or ***** reconstructing ***** them. | ||
| D19-1612 This theoretically efficient approach achieves an 11x empirical speedup over baseline ILP methods, while better ***** reconstructing ***** gold constrained shortenings. | ||
| D18-1011 Then the textual and predicted perceptual representations are fused through ***** reconstructing ***** their original and associated embeddings | ||
| Modern | 6 | |
| L16-1679 Most Arabic natural language processing tools and resources are developed to serve ***** Modern ***** Standard Arabic (MSA), which is the official written language in the Arab World. | ||
| L16-1463 Forty five percent of the terms and expressions in the lexicon are Egyptian or colloquial while fifty five percent are ***** Modern ***** Standard Arabic. | ||
| D19-3044 This paper describes the design and use of the ONLP suite, a joint morpho-syntactic infrastructure for processing ***** Modern ***** Hebrew texts. | ||
| L12-1328 DA lives side-by-side with the official language, ***** Modern ***** Standard Arabic (MSA). | ||
| 2020.alta-1.6 We invited and collected acoustic data from one ***** Modern ***** Standard Arabic (MSA) lecture and four MSA students | ||
| demographics | 6 | |
| 2020.findings-emnlp.291 We then analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between ***** demographics *****. | ||
| E17-3014 This paper presents an approach that detects various audience attributes, including author location, ***** demographics *****, behavior and interests. | ||
| 2021.rocling-1.37 However, the present analysis functions of YouTube only provide a few performance indicators such as average view duration, browsing history, variance in audience's ***** demographics *****, etc., and lack of sentiment analysis on the audience's comments. | ||
| 2021.naacl-main.357 In human-level NLP tasks, such as predicting mental health, personality, or ***** demographics *****, the number of observations is often smaller than the standard 768+ hidden state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. | ||
| 2021.wassa-1.28 The solution is based on combining the frequency of words, lexicon-based information, ***** demographics ***** of the annotators and personality of the annotators into a linear model | ||
| introducing | 6 | |
| R19-1139 In this work, we aim to mitigate these issues by (a) releasing a new labelled dataset of more than 47K word vectors trained on the UK Web Archive over a short time-frame (2000-2013); (b) proposing a variant of Procrustes alignment to detect words that have undergone semantic shift; and (c) ***** introducing ***** a rank-based approach for evaluation purposes. | ||
| W19-4016 The vast amount of research ***** introducing ***** new corpora and techniques for semi-automatically annotating corpora shows the important role that datasets play in today's research, especially in the machine learning community. | ||
| 2021.emnlp-main.301 In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) ***** introducing ***** sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open domain QA datasets to improve few-shot performance. | ||
| D19-3044 Its accompanying demo further serves educational activities, ***** introducing ***** Hebrew NLP intricacies to researchers and non-researchers alike. | ||
| L16-1536 In this paper we investigate the usefulness of neural word embeddings in the process of translating Named Entities (NEs) from a resource-rich language to a language low on resources relevant to the task at hand, ***** introducing ***** a novel, yet simple way of obtaining bilingual word vectors | ||
| lexical ontology | 6 | |
| L16-1603 The aim of our project is to have both the largest possible coverage of causal phenomena in French, across all parts of speech, and have it linked to a general semantic framework such as FN, to benefit in particular from the relations between other semantic frames, e.g., temporal ones or intentional ones, and the underlying upper ***** lexical ontology ***** that enable some forms of reasoning. | ||
| L08-1272 In this paper we introduce an ongoing project on developing a ***** lexical ontology ***** for Persian called FarsNet. | ||
| L06-1511 In this paper LexikoNet, a large ***** lexical ontology ***** of German nouns is presented. | ||
| L16-1455 Considering valency as a distinctive feature of synsets was an essential step to transform the initial PolNet (first intended as a ***** lexical ontology *****) into a lexicon-grammar. | ||
| L10-1540 This paper describes some methods used in semi-automatic development of FarsNet; a ***** lexical ontology ***** for the Persian language | ||
| accessibility | 6 | |
| 2020.coling-demos.8 It incorporates several existing multilingual models that can be used interchangeably in the demo such as M-BERT and XLM-R. The M-GAAMA demo also improves language ***** accessibility ***** by incorporating the IBM Watson machine translation widget to provide additional capabilities to the user to see an answer in their desired language. | ||
| I17-2068 Complex word identification (CWI) is an important task in text ***** accessibility *****. | ||
| L10-1061 Additionally, first steps towards evaluating the proposed tool, by analyzing utterance annotations taken from two expressive speech corpora, are undertaken and some future goals including the open source ***** accessibility ***** of the tool are given. | ||
| 2020.cmcl-1.1 We investigate the influence of Head–Dependent Mutual Information (HDMI), similarity-based interference, ***** accessibility ***** and case-marking. | ||
| R17-1104 Complex Word Identification (CWI) is an important task in lexical simplification and text ***** accessibility ***** | ||
| Annotated corpora | 6 | |
| W16-5208 ***** Annotated corpora ***** are crucial language resources, and pre-annotation is a usual way to reduce the cost of corpus construction. | ||
| L12-1242 ***** Annotated corpora ***** such as treebanks are important for the development of parsers, language applications as well as understanding of the language itself. | ||
| L14-1453 ***** Annotated corpora ***** are essential resources for many applications in Natural Language Processing. | ||
| W19-4012 ***** Annotated corpora ***** of argument schemes, however, are scarce, small, and unrepresentative. | ||
| C18-1144 ***** Annotated corpora ***** enable supervised machine learning and data analysis | ||
| analysing | 6 | |
| 2021.wmt-1.29 The neural machine translation approach has gained popularity in machine translation because of its context ***** analysing ***** ability and its handling of long-term dependency issues. | ||
| W18-3513 Using this taxonomy, we have validated our approach ***** analysing ***** the Twitter activity of the main Spanish political parties during 2015 and 2016 Spanish general election and providing a study of their discourse. | ||
| W16-4119 We compare different Machine Learning classifiers for the task of readability assessment focusing on Portuguese and English texts, ***** analysing ***** the impact of variables like the feature inventory used in the resulting corpus. | ||
| E17-3003 The paper presents the Etymological DICtionary ediTOR (EDICTOR), a free, interactive, web-based tool designed to aid historical linguists in creating, editing, ***** analysing *****, and publishing etymological datasets. | ||
| 2021.eacl-main.15 Specifically, the framework ranks a set of atomic facts by integrating lexical relevance with the notion of unification power, estimated ***** analysing ***** explanations for similar questions in the corpus | ||
| advancements | 6 | |
| D19-1292 Automated fact verification has been progressing owing to ***** advancements ***** in modeling and availability of large datasets. | ||
| 2021.acl-long.330 Technology for language generation has advanced rapidly, spurred by ***** advancements ***** in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner. | ||
| 2021.dravidianlangtech-1.53 Internet ***** advancements ***** have made a huge impact on the communication pattern of people and their life style. | ||
| L16-1386 We present and summarize five years of progress on the development of the cloud and of ***** advancements ***** in open data in linguistics, and we describe recent community activities. | ||
| 2020.emnlp-main.673 In recent years, the task of generating realistic short and long texts has made tremendous ***** advancements ***** | ||
| advantages | 6 | |
| D18-1508 In addition to ***** advantages ***** in terms of interpretability, we show that our proposed architecture improves over standard baselines in emoji prediction, and does particularly well when predicting infrequent emojis. | ||
| C16-1044 Our approach is based on Recurrent Neural Networks (RNN) and has the following ***** advantages *****: (a) it does not use word alignment information, (b) it does not assume any knowledge about foreign languages, which makes it applicable to a wide range of resource-poor languages, (c) it provides truly multilingual taggers. | ||
| C16-3007 This tutorial examines the characteristics, ***** advantages ***** and limitations of Wikipedia relative to other existing, human-curated resources of knowledge; derivative resources, created by converting semi-structured content in Wikipedia into structured data; the role of Wikipedia and its derivatives in text analysis; and the role of Wikipedia and its derivatives in enhancing information retrieval. | ||
| 2021.acl-long.135 Seq2Seq-DU has the following ***** advantages *****. | ||
| W17-2630 However, under typical training procedures, ***** advantages ***** over classical methods emerge only with large datasets | ||
| academia | 6 | |
| N18-3005 Prior studies of this topic have focused on tasks only in the ***** academia ***** settings with limited scope or only provide intrinsic dataset analysis, lacking indication on how it affects the trained model performance. | ||
| 2020.coling-main.232 However, how to engage commonsense effectively in question answering systems is still under exploration in both research ***** academia ***** and industry. | ||
| 2020.acl-main.42 Simultaneous translation has many important application scenarios and attracts much attention from both ***** academia ***** and industry recently. | ||
| 2020.alta-1.8 Around 60% of doctoral graduates worldwide ended up working in industry rather than ***** academia *****. | ||
| 2021.emnlp-main.758 By better understanding the epistemic heritage of QA, researchers, ***** academia *****, and industry can more effectively accelerate QA research | ||
| Multi30K | 6 | |
| 2020.acl-main.273 We evaluate our proposed encoder on the ***** Multi30K ***** datasets. | ||
| P19-1180 Finally, we also apply VG-NSL to multiple languages in the ***** Multi30K ***** data set, showing that our model consistently outperforms prior unsupervised approaches. | ||
| 2021.naacl-main.195 Furthermore, when multilingual annotations are available, our method outperforms recent baselines by a large margin in multilingual text-to-video search on VTT and VATEX; as well as in multilingual text-to-image search on ***** Multi30K *****. Our model and Multi-HowTo100M are available at http://github.com/berniebear/Multi-HT100M. | ||
| 2021.emnlp-main.673 We hypothesize that this might be caused by the nature of the commonly used evaluation benchmark, also known as ***** Multi30K *****, where the translations of image captions were prepared without actually showing the images to human translators. | ||
| D18-1400 Our approach achieves competitive state-of-the-art results on the ***** Multi30K ***** and the Ambiguous COCO datasets | ||
| filling | 6 | |
| 2020.coling-main.310 Furthermore, by leveraging BERT as an additional encoder, we establish new state-of-the-art results on SNIPS and ATIS datasets, where we get 99.33% and 98.28% in terms of accuracy on intent detection task as well as 97.20% and 96.41% in terms of F1 score on slot ***** filling ***** task, respectively. | ||
| 2020.sustainlp-1.10 The first step is designed to remove non-slot tokens (i.e., O labeled tokens), as they introduce noise in the input of slot ***** filling ***** models. | ||
| 2021.mrl-1.18 Furthermore, we present an application of our method for crisis informatics using a new human-annotated tweet dataset of slot ***** filling ***** in English and Haitian Creole, collected during the Haiti earthquake. | ||
| P19-1550 Recent state-of-the-art neural models have obtained F1-scores near 98% on the task of slot ***** filling *****. | ||
| 2021.emnlp-main.746 Experimental results show that our model achieves significant improvement on the unseen slots, while also set new state-of-the-arts on slot ***** filling ***** task | ||
| constrained decoding | 6 | |
| P19-1294 Comparative experiments show that our method is not only more effective than a state-of-the-art implementation of ***** constrained decoding *****, but is also as fast as constraint-free decoding. | ||
| 2021.mtsummit-research.23 In particular, we introduce a method, based on ***** constrained decoding *****, which handles the inflected forms of lexical entries and does not require any modification to the training data or model architecture. | ||
| 2021.acl-long.9 These results are achieved with a much lower run-time than ***** constrained decoding ***** algorithms. | ||
| 2020.eamt-1.29 (2019), which uses inline annotation of the target terms in the source segment plus source factor embeddings during training and inference, and compare them to ***** constrained decoding *****. | ||
| I17-1006 For the test phase, ***** constrained decoding ***** is also used for completing partial trees | ||
| modules | 6 | |
| 2019.iwslt-1.3 Overall, our ***** modules ***** in the pipe-line are based on the transformer architecture which has recently achieved great results in various fields. | ||
| L06-1208 In addition to TermDB, a database used for terminology management and storage, we present the following ***** modules ***** that are used to populate the database: TerMine (recognition, extraction and normalisation of terms from literature), AcroTerMine (extraction and clustering of acronyms and their long forms), AnnoTerm (annotation and classification of terms), and ClusTerm (extraction of term associations and clustering of terms). | ||
| D19-1636 Our framework introduces incongruity into the literal input version through ***** modules ***** that: (a) filter factual content from the input opinion, (b) retrieve incongruous phrases related to the filtered facts and (c) synthesize sarcastic text from the incongruous filtered and incongruous phrases. | ||
| W19-1802 We show that motion ***** modules ***** help to ground motion-related words and also help to learn in appearance ***** modules ***** because modular neural networks resolve task interference between ***** modules *****. | ||
| E17-1061 In addition, information available across ***** modules ***** cannot be leveraged by all ***** modules ***** | ||
| conversely | 6 | |
| L10-1161 Affixes can take part in several word-formation rules and, ***** conversely *****, rules can be realised by means of a variety of affixes. | ||
| 2021.acl-short.5 A translation that fails to be predicted by most MT systems will be treated as a difficult one and assigned a large weight in the final score function, and ***** conversely *****. | ||
| 2020.acl-demos.4 Each card corresponds to a Wikipedia article, and ***** conversely *****, any article could be turned into a card. | ||
| 2021.econlp-1.5 Domain-specific pretraining from scratch, ***** conversely *****, seems to be less effective. | ||
| W19-4717 Our models reveal some interesting yet contrastive patterns of long-term change in multiple languages: Indo-European languages put more weight on subword units in newer words, while ***** conversely ***** Chinese puts less weights on the subwords, but more weight on the word as a whole | ||
| highlighting | 6 | |
| L10-1094 The three specialized resources are described ***** highlighting ***** the various kinds of lexical semantic relations linking each term to the others within the single terminological database and to the generic resources WordNet and ItalWordNet. | ||
| D19-5309 Detailed extended analyses of all submitted systems showed large relative improvements in accessing the most challenging multi-hop inference problems, while absolute performance remains low, ***** highlighting ***** the difficulty of generating detailed explanations through multi-hop reasoning. | ||
| 2020.coling-main.30 For the task of mention identification and coreference resolution, a best performance of 54.1 F1 is reported, ***** highlighting ***** the room for improvement. | ||
| L12-1620 We thus discuss several previous attempts, ***** highlighting ***** what we believe to be their weakest point: a lack of attention to context. | ||
| 2021.insights-1.15 Deep learning approaches are expensive, and we hope our insights ***** highlighting ***** the lack of benefits from introducing a resource-intensive component will aid future research to distill the effective elements from long and complex pipelines, thereby providing a boost to the wider research community | ||
| simulations | 6 | |
| 2020.lincr-1.4 Using a language model to create cloze task ***** simulations ***** would require significantly less time and make it possible to conduct studies related to linguistic predictability. | ||
| 2021.eacl-main.185 Our approach outperforms state-of-the-art methods by over 8% in terms of cumulative profit and risk-adjusted returns in trading ***** simulations ***** on two benchmarks: English tweets and Chinese financial news spanning two major stock indexes and four global markets. | ||
| P19-1376 We find that (1) the model exhibits instability across multiple ***** simulations ***** in terms of its correlation with human data, and (2) even when results are aggregated across ***** simulations ***** (treating each simulation as an individual human participant), the fit to the human data is not strong—worse than an older rule-based model. | ||
| 2021.naacl-main.316 Our method outperforms state-of-the-art in terms of risk-adjusted returns in trading ***** simulations ***** on two benchmarks: Tweets (English) and financial news (Chinese) pertaining to two major indexes and four global stock markets. | ||
| 2021.emnlp-main.794 This trade-off, however, has not appeared in recent ***** simulations ***** of iterated language learning with neural network agents (Chaabouni et al., 2019b) | ||
| refine | 6 | |
| P17-2053 In order to consider answer information into question modeling, we first introduce novel group sparse autoencoders which ***** refine ***** question representation by utilizing group information in the answer set. | ||
| 2021.acl-short.125 To address these data biases, we first ***** refine ***** each test set by excluding seen entities from it, so as to better evaluate a model's generalization ability. | ||
| P19-1292 Automatic post-editing (APE) seeks to automatically ***** refine ***** the output of a black-box machine translation (MT) system through human post-edits. | ||
| 2021.ranlp-1.57 Our two unsupervised methods ***** refine ***** sense annotations produced by a knowledge-based WSD system via lexical translations in a parallel corpus. | ||
| 2021.bionlp-1.2 The transformer networks ***** refine ***** existing pre-trained models, and the online triplet mining makes training efficient even with hundreds of thousands of concepts by sampling training triples within each mini-batch | ||
| protocols | 6 | |
| 2021.acl-long.525 Our model achieves an F1 score of 54.53% for temporal and causal relations in ***** protocols ***** from our corpus, which is a significant improvement over previous models - DyGIE++:28.17%; spERT:27.81%. | ||
| 2021.gem-1.6 However, in style transfer papers, we find that ***** protocols ***** for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods. | ||
| E17-1016 Recent work on evaluating representation learning architectures in NLP has established a need for evaluation ***** protocols ***** based on subconscious cognitive measures rather than manually tailored intrinsic similarity and relatedness tasks. | ||
| 2020.clinicalnlp-1.26 In drug development, ***** protocols ***** define how clinical trials are conducted, and are therefore of paramount importance. | ||
| 2020.emnlp-main.708 To do this, it is critical to ensure that our evaluation ***** protocols ***** are correct, and benchmarks are reliable | ||
| SimpleQuestions | 6 | |
| W18-5504 ***** SimpleQuestions ***** is a commonly used benchmark for single-factoid question answering (QA) over Knowledge Graphs (KG). | ||
| C18-1178 To address this problem, we present SimpleDBpediaQA, a new benchmark dataset for simple question answering over knowledge graphs that was created by mapping ***** SimpleQuestions ***** entities and predicates from Freebase to DBpedia. | ||
| 2020.coling-main.465 As in other natural language understanding tasks, a common practice for this task is to train and evaluate a model on a single dataset, and recent studies suggest that ***** SimpleQuestions *****, the most popular and largest dataset, is nearly solved under this setting. | ||
| C16-1236 WebQuestions and ***** SimpleQuestions ***** are two benchmark data-sets commonly used in recent knowledge-based question answering (KBQA) work | ||
| D18-1051 The *****SimpleQuestions***** dataset is one of the most commonly used benchmarks for studying single-relation factoid questions. | ||
| CA | 6 | |
| S18-1046 We propose an en-semble system including four different deep learning methods which are CNN, Bidirectional LSTM (BLSTM), LSTM-CNN and a CNN-based Attention model (***** CA *****). | ||
| L14-1253 In this paper, we focus particularly on data collected during a dialogue to discuss the application of conversation analysis (***** CA *****) to signed dialogues and signed conversations. | ||
| 2010.amta-commercial.11 This paper describes how ***** CA ***** Technologies tries to accomplish this long term goal with the deployment of MT systems to increase productivity with less cost, in a relatively short time. | ||
| 2004.amta-papers.7 Spoken Translation, Inc. (STI) of Berkeley, ***** CA ***** has developed a commercial system for interactive speech-to-speech machine translation designed for both high accuracy and broad linguistic and topical coverage. | ||
| 2012.amta-commercial.10 This document introduces the strategy implemented at *****CA***** Technologies to exploit Machine Translation (MT) at the corporate-wide level. | ||
| continuum | 6 | |
| 2021.acl-long.39 We also provide insights on why BERT fails to model words in the middle of the functionality ***** continuum *****. | ||
| C18-1164 In our further analyses, we validate the model's decision-making process, the philologically hypothesized ***** continuum ***** of fluency and investigate the relative importance of various features. | ||
| W16-4808 In my work, I am using a continuous vector representation of languages that allows modeling and exploring the language ***** continuum ***** in a very direct way. | ||
| 2021.emnlp-main.658 To this end, we propose a novel ***** continuum ***** model by extending the idea of neural ordinary differential equations (ODEs) to multi-relational graph convolutional networks | ||
| D19-1384 We demonstrate that complex linguistic behavior observed in natural language can be reproduced in this simple setting: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic *****continuum***** emerges where neighboring languages are more mutually intelligible than farther removed languages. | ||
| Bert | 6 | |
| 2021.semeval-1.85 We have been ranked first place in the competition using the pre-trained language models ***** Bert ***** and RoBERTa, with a Pearson correlation score of 0.788. | ||
| D19-6011 We find out that: (a) for task 1, first fine-tuning on larger datasets like RACE (Lai et al., 2017) and SWAG (Zellers et al., 2018), and then fine-tuning on the target task improves the performance significantly; (b) for task 2, we find that incorporating a KG of commonsense knowledge, WordNet (Miller, 1995), into the ***** Bert ***** model (Devlin et al., 2018) is helpful; however, it hurts the performance of XLNET (Yang et al., 2019), a more powerful pre-trained model. | ||
| 2020.wnut-1.39 This paper presents our teamwork on WNUT 2020 shared task-1: wet lab entity extract, that we conducted studies in several models, including a BiLSTM CRF model and a ***** Bert ***** case model which can be used to complete wet lab entity extraction. | ||
| 2020.semeval-1.82 We choose the ***** Bert ***** model | ||
| 2021.eacl-main.269 We evaluate the ability of *****Bert***** embeddings to represent tense information, taking French and Chinese as a case study. | ||
| unexplored | 6 | |
| 2020.findings-emnlp.442 More broadly, we seek to inspire more computational work around the topic of linguistic creativity, which we believe offers numerous ***** unexplored ***** opportunities. | ||
| 2021.spnlp-1.1 However, the reward function to be used within the reinforcement learning approach can play a key role for performance and is still partially ***** unexplored *****. | ||
| P19-1382 Uncovering such implications typically amounts to time-consuming manual processing by trained and experienced linguists, which potentially leaves key linguistic universals ***** unexplored *****. | ||
| 2021.emnlp-main.21 In this work, we investigate ***** unexplored ***** penalties and propose a set of new objectives that measure the dependency between modalities. | ||
| W17-3503 An ***** unexplored ***** question is how different these datasets are from English and, if there are any differences, what causes them to differ | ||
| naive | 6 | |
| 2020.acl-main.720 However, a straightforward implementation of this simple idea does not always work in practice: ***** naive ***** training of NER models using annotated data drawn from multiple languages consistently underperforms models trained on monolingual data alone, despite having access to more training data. | ||
| P18-1213 To facilitate research addressing this challenge, we introduce a new annotation framework to explain ***** naive ***** psychology of story characters as fully-specified chains of mental states with respect to motivations and emotional reactions. | ||
| 2021.nllp-1.23 In particular, we focus on an investigation of models to generate efficiencies in the triage process, but also the risks associated with ***** naive ***** use of model predictions, including fairness across different user demographics. | ||
| P18-1086 Towards this goal, this paper introduces a new task on ***** naive ***** physical action-effect prediction, which addresses the relations between concrete actions (expressed in the form of verb-noun pairs) and their effects on the state of the physical world as depicted by images. | ||
| D18-1460 The experiments show that, at the same or even better translation quality, our method can translate faster compared with ***** naive ***** beam search by 3.3x on GPUs and 3.5x on CPUs | ||
| untagged | 6 | |
| L04-1190 The LRs available for on-line queries include: a) several subcorpora (written and spoken, tagged and ***** untagged *****) compiled and extracted from CRPC for specific CLUL's projects and now available for on-line queries; b) a published sample of “Português Fundamental”, a spoken CRPC subcorpus, available for texts download; c) a frequency lexicon extracted from a CRPC subcorpus available for both on-line queries and download. | ||
| L08-1455 The algorithm is applied on ***** untagged ***** data, on manually assigned tags and on tags produced by an unsupervised part of speech tagger. | ||
| R19-1060 We show that a bigram HMM tagger benefits from re-training on a larger ***** untagged ***** text using Baum-Welch estimation. | ||
| W18-4417 Our voting combination system resulted second place in predicting Aggression levels on a test set of ***** untagged ***** social network posts. | ||
| R19-1006 Sifting through the revision history of the articles that at some point had been considered biased and later corrected, we retrieve the last tagged and the first ***** untagged ***** revisions as the before/after snapshots of what was deemed a violation of Wikipedia's neutral point of view policy | ||
| serialized | 6 | |
| N19-2005 In VRDs, visual and layout information is critical for document understanding, and texts in such documents cannot be ***** serialized ***** into the one-dimensional sequence without losing information. | ||
| 2020.lrec-1.885 CoNLL-RDF is a technology that provides such a bridge for popular one-word-per-line formats as widely used in NLP (e.g., the CoNLL Shared Tasks), annotation (Universal Dependencies, Unimorph), corpus linguistics (Corpus WorkBench, CWB) and digital lexicography (SketchEngine): Every empty-line separated table (usually a sentence) is parsed into a graph, can be freely manipulated and enriched using W3C-standardized RDF technology, and then be ***** serialized ***** back into a TSV format, RDF or other formats. | ||
| N19-2013 While previous approaches have addressed the disparate schema issue by learning candidate transformations of the meaning representation, in this paper, we instead model the reference resolution as a dialogue context-aware user query reformulation task – the dialog state is ***** serialized ***** to a sequence of natural language tokens representing the conversation. | ||
| 2020.acl-main.224 Hence, popular sequence-to-sequence models, which require ***** serialized ***** input, are not a natural fit for this task. | ||
| D19-6309 We describe our exploratory system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on ***** serialized ***** trees | ||
| standalone | 6 | |
| L14-1708 However, the resulting pipelines are ***** standalone ***** applications, i.e., software tools that are accessible only via local machine and that can only be run with the processing pipeline platforms. | ||
| D19-1605 We construct, CANARD, a dataset of 40,527 questions based on QuAC (Choi et al., 2018) and train Seq2Seq models for incorporating context into ***** standalone ***** questions. | ||
| 2021.ranlp-1.94 Experiments of the IP approach on combining state-of-the-art ***** standalone ***** GEC systems show that the combined system outperforms all ***** standalone ***** systems. | ||
| 2020.emnlp-main.388 We report state of the art results in morphological parsing, and in dependency parsing, both in ***** standalone ***** (with gold morphological tags) and joint morphosyntactic parsing setting. | ||
| W18-3715 It is aimed for easy integration into the traditional classroom setting and syllabus, which makes it distinct from other language learning tools that provide ***** standalone ***** learning experience | ||
| verb subcategorization | 6 | |
| 2000.iwpt-1.5 The system has been applied to the acquisition of verbal subcategorization information, obtaining 66% recall and 87% precision in the determination of ***** verb subcategorization ***** instances. | ||
| C16-1071 We explore the method using features based on ***** verb subcategorization ***** information and evaluate the approach in the context of the Native Language Identification (NLI) task. | ||
| 1998.amta-papers.27 In conventional approaches to Korean analysis, ***** verb subcategorization ***** has generally been used as lexical knowledge. | ||
| L12-1198 This paper describes a web-service system for automatic acquisition of ***** verb subcategorization ***** frames (SCFs) from parsed data in Italian. | ||
| 1993.iwpt-1.24 We describe a mechanism for automatically estimating frequencies of ***** verb subcategorization ***** frames in a large corpus | ||
| AMI | 6 | |
| 2021.acl-long.117 We apply DialoGPT to label three types of features on two dialogue summarization datasets, SAMSum and ***** AMI *****, and employ pre-trained and non pre-trained models as our summarizers. | ||
| W17-4507 We evaluate our approach on the ***** AMI ***** and ICSI meeting speech corpora, and on the DUC2001 news corpus. | ||
| P18-1062 Experiments on the ***** AMI ***** and ICSI corpus show that our system improves on the state-of-the-art. | ||
| R19-1149 The Adjusted Mutual Information (***** AMI *****) metric was used to verify the quality of clustering results. | ||
| E17-2074 We outperform multiple baselines in a real-time scenario emulated from the ***** AMI ***** and ICSI meeting corpora | ||
| stereotyping | 6 | |
| 2021.acl-long.81 We apply a measurement modeling lens—originating from the social sciences—to inventory a range of pitfalls that threaten these benchmarks' validity as measurement models for ***** stereotyping *****. | ||
| W17-1602 More fundamentally, such classification efforts risk invoking ***** stereotyping ***** and essentialism. | ||
| 2020.findings-emnlp.311 While language embeddings have been shown to have ***** stereotyping ***** biases, how these biases affect downstream question answering (QA) models remains unexplored. | ||
| 2021.cl-3.19 Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, ***** stereotyping *****, and over- or under-representation, especially for binary and non-binary trans users. | ||
| 2020.emnlp-main.154 In CrowS-Pairs a model is presented with two sentences: one that is more ***** stereotyping ***** and another that is less ***** stereotyping ***** | ||
| Charniak | 6 | |
| D17-1178 Applied to the model of Choe and ***** Charniak ***** (2016), our inference procedure obtains 92.56 F1 on section 23 of the Penn Treebank, surpassing prior state-of-the-art results for single-model systems. | ||
| J77-4005 Computational Semantics edited by Eugene ***** Charniak ***** and Yorick Wilks (Stuart Shapiro); Introduction to Contemporary Linguistics Semantics, by George L. Dillon (James D. McCawley); Current Bibliography; BBN Publications on Intelligent CAI | ||
| 2021.conll-1.50 Speakers are thought to use rational information transmission strategies for efficient communication (Genzel and ***** Charniak *****, 2002; Aylett and Turk, 2004; Jaeger and Levy, 2007). | ||
| J76-2001 Technical Help for Proposers, Minority Programs (Richard Lopez); NSF: Foreign Currency Program–Egypt, India, Pakistan; Catastrophe Theory: Thom at SIAM Meeting (Rene Thom); NATO: Advanced Study Institutes (Joseph M. Scandura), Structural-Process Theories of Behavior, Man-Computer Interaction, Computer-Based Science Instruction; C. S. Pierce International Congress (Max H. Fisch); New Journal: Cognitive Science (Eugene ***** Charniak *****; Allan Collins; Roger C. Schank); NSF: Rejected Proposals and Reconsideration; Conference Chronicle; 1976 Linguistics Institute, Oswego, New York; BAAL: | ||
| 1997.iwpt-1.13 PFGs combine most of the best properties of several other formalisms, including those of Collins, Magerman, and ***** Charniak *****, and in experiments have comparable or better performance | ||
| customized | 6 | |
| 2008.amta-govandcom.3 Further, we confirm that ***** customized ***** user dictionaries are effective across systems, although with a slight loss in quality: on average, user dictionaries improved the translations for 44.8% of translations with the systems they were built for and 37.3% of translations for different systems. | ||
| L14-1622 This part-of-speech tagger has been built on already available resources, in particular a Norwegian dictionary and gold standard corpus, which were partly ***** customized ***** for the purposes of this paper. | ||
| 2018.gwc-1.12 If a tool existed that could create some semantic abstractions, it would free the lexicographer from the need to resort to ***** customized ***** development of analysis software. | ||
| W19-5209 In this work, we ***** customized ***** a neural machine translation system for translation of subtitles in the domain of entertainment. | ||
| 2020.acl-demos.16 We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train ***** customized ***** deep learning models | ||
| receptive | 6 | |
| 2021.semeval-1.23 Usually, contexts are very lengthy and require a large ***** receptive ***** field from the model. | ||
| L16-1032 The resource is based on a corpus of coursebook texts, and thus describes ***** receptive ***** vocabulary learners are exposed to during reading activities, as opposed to productive vocabulary they use when speaking or writing. | ||
| D18-1485 Our designed dilated convolution effectively reduces dimension and supports an exponential expansion of ***** receptive ***** fields without loss of local information, and the attention-over-attention mechanism is able to capture more summary relevant information from the source context. | ||
| C18-1294 The literature frequently addresses the differences in ***** receptive ***** and productive vocabulary, but grammar is often left unacknowledged in second language acquisition studies. | ||
| 2020.findings-emnlp.420 However, few models consider the fusion of linguistic features with multiple visual features with different sizes of ***** receptive ***** fields, though the proper size of the ***** receptive ***** field of visual features intuitively varies depending on expressions | ||
| deviate | 6 | |
| 2020.lrec-1.743 These challenges ***** deviate ***** from the classic task of identifying a limited number of lexical signs in a video stream. | ||
| C16-1181 The key idea is that linguistic non-cooperation can be measured in terms of the extent to which dialogue participants ***** deviate ***** from conventions regarding the proper introduction and discharging of conversational obligations (e.g., the obligation to respond to a question). | ||
| L10-1443 We gathered evidence that decision times are non-uniformly distributed over the annotation units, while they do not substantially ***** deviate ***** among annotators. | ||
| 2020.lrec-1.780 While recent automatic speech recognition systems achieve remarkable performance when large amounts of adequate, high quality annotated speech data is used for training, the same systems often only achieve an unsatisfactory result for tasks in domains that greatly ***** deviate ***** from the conditions represented by the training data. | ||
| 2021.emnlp-main.135 We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account, showing that realistic datasets will increasingly ***** deviate ***** from competency problems as dataset size increases | ||
| render | 6 | |
| 2020.signlang-1.17 The new approach, when applied to our avatar “Paula”, results in much quicker ***** render ***** times than more sophisticated, computationally intensive techniques. | ||
| L12-1533 While the proposed modifications were driven by the desire to introduce greater conceptual clarity in the PDTB scheme and to facilitate better annotation quality, our findings indicate that overall, some of the changes ***** render ***** the annotation task much more difficult for the annotators, as also reflected in lower inter-annotator agreement for the relevant sub-tasks. | ||
| L12-1647 The current prevailing methodologies, the sheer number of languages and the vast volumes of digital content together with the wide palette of useful content processing applications, ***** render ***** new models for managing the underlying language resources indispensable. | ||
| P18-3008 While growing code-mixed content on Online Social Networks(OSN) provides a fertile ground for studying various aspects of code-mixing, the lack of automated text analysis tools ***** render ***** such studies challenging. | ||
| W18-3807 In addition, the features that the adjective shares with other grammatical categories ***** render ***** it extremely productive and provide elements that enrich the learners' proficiency | ||
| routing | 6 | |
| P19-1052 To this end, we first develop an aspect ***** routing ***** approach to encapsulate the sentence-level semantic representations into semantic capsules from both the aspect-level and document-level data. | ||
| 2020.acl-main.283 To efficiently handle large-scale MLC datasets, we additionally present a new ***** routing ***** method to adaptively adjust the capsule number during ***** routing *****. | ||
| N19-1150 Experiments on a complex biomedical information extraction task using expert and lay annotators show that: (i) simply excluding from the training data instances predicted to be difficult yields a small boost in performance; (ii) using difficulty scores to weight instances during training provides further, consistent gains; (iii) assigning instances predicted to be difficult to domain experts is an effective strategy for task ***** routing *****. | ||
| D18-1486 With the advantages of capsules for feature clustering, proposed task ***** routing ***** algorithm can cluster the features for each task in the network, which helps reduce the interference among tasks. | ||
| P19-1150 Obstacles hindering the development of capsule networks for challenging NLP applications include poor scalability to large output spaces and less reliable ***** routing ***** processes. | ||
| lexicalized reordering | 6 | |
| L10-1008 This proposed reordering technique offers a better and more efficient translation when compared to both the distance-based and the ***** lexicalized reordering *****. | ||
| 2016.amta-researchers.12 Our models achieve improvements of up to 0.40 BLEU points in Chinese-English translation compared to a baseline which uses a regular ***** lexicalized reordering ***** model and a hierarchical reordering model. | ||
| E17-2097 We introduce a novel shift-reduce algorithm to LR-Hiero to decode with our ***** lexicalized reordering ***** model (LRM) and show that it improves translation quality for Czech-English, Chinese-English and German-English. | ||
| W16-4607 This paper presents an improved ***** lexicalized reordering ***** model for phrase-based statistical machine translation using a deep neural network. | ||
| 2017.iwslt-1.18 More concretely we train neuralized versions of ***** lexicalized reordering ***** [1] and the operation sequence models [2] using feed-forward neural network | ||
| semantic parse | 6 | |
| C18-1280 Previous work largely focused on selecting the correct semantic relations for a question and disregarded the structure of the ***** semantic parse *****: the connections between entities and the directions of the relations. | ||
| 2021.naacl-main.236 In this work, we propose a non-autoregressive approach to predict ***** semantic parse ***** trees with an efficient seq2seq model architecture. | ||
| D19-1547 The agent maintains its own state as the current predicted ***** semantic parse *****, decides whether and where human intervention is needed, and generates a clarification question in natural language. | ||
| 2021.acl-short.34 However, the explicit ***** semantic parse ***** of the question is a rich source of relation information that is not taken advantage of | ||
| 2020.acl-main.187 We study the task of ***** semantic parse ***** correction with natural language feedback. | ||
| adjacent | 6 | |
| 1963.earlymt-1.26 Our translation system (1) converts input text to syntactic and semantic codes with a dictionary scan, (2) clears syntactic ambiguities where resolution by ***** adjacent ***** words is effective, (3) resolves residual syntactic ambiguities by determining the longest meaningful semantic unit, (4) reorders word sequence according to the rules of the target language and (5) produces the final target language translation. | ||
| D19-1537 Based on the observation that ***** adjacent ***** natural language questions are often linguistically dependent and their corresponding SQL queries tend to overlap, we utilize the interaction history by editing the previous predicted query to improve the generation quality. | ||
| D18-1464 We encode the perceived coherence of a text by a vector, which represents patterns of changes in salient information that relates ***** adjacent ***** sentences. | ||
| 2020.acl-main.178 Our analyses show that imagined stories have a substantially more linear narrative flow, compared to recalled stories in which ***** adjacent ***** sentences are more disconnected. | ||
| L06-1210 We analyze this by computing all possible skip-grams in a training corpus and measure how many ***** adjacent ***** (standard) n-grams these cover in test documents | ||
| Extractive summarization | 6 | |
| 2021.newsum-1.10 ***** Extractive summarization ***** systems, though interpretable, suffer from redundancy and possible lack of coherence. | ||
| D18-1088 ***** Extractive summarization ***** models need sentence level labels, which are usually created with rule-based methods since most summarization datasets only have document summary pairs. | ||
| 2020.nlpcovid19-acl.7 ***** Extractive summarization ***** using BERT and PageRank methods is used to provide responses to the query. | ||
| D19-5415 ***** Extractive summarization ***** selects and concatenates the most essential text spans in a document. | ||
| W18-5307 ***** Extractive summarization ***** techniques, which concatenate the most relevant text units drawn from multiple documents, perform well on automatic evaluation metrics like ROUGE, but score poorly on human readability, due to the presence of redundant text and grammatical errors in the answer | ||
| guideline | 6 | |
| W18-5613 We make use of incrementally developed synthetic clinical text describing patients' family history relating to cases of cardiac disease and present a general methodology which integrates the synthetically produced clinical statements and ***** guideline ***** development. | ||
| 2021.bppf-1.2 This paper focuses on a different kind of bias that has received very little attention: ***** guideline ***** bias, i.e., the bias introduced by how our annotator ***** guideline *****s are formulated. | ||
| 2020.lrec-1.561 Results suggest that our ***** guideline ***** is applicable to large-scale clinical NLP projects. | ||
| L16-1272 As data was drawn from many disparate authors, we define a unified scheme of importance labels, and provide a mapping for each ***** guideline *****. | ||
| L12-1441 The latter reflects the increasing understanding of the sloppy entity class both from the perspective of ***** guideline ***** writers and users (annotators) | ||
| configurational | 6 | |
| D18-1276 The ***** configurational ***** information in sentences of a free word order language such as Sanskrit is of limited use. | ||
| Q15-1013 We propose a language model for dependency structures that is relational rather than ***** configurational ***** and thus particularly suited for languages with a (relatively) free word order. | ||
| L08-1011 The paper presents a sketch grammar for German, a language which is not strictly ***** configurational ***** and which shows a considerable amount of case syncretism, and evaluates its accuracy, which has not been done for other sketch grammars. | ||
| J19-1003 Chinese, as an analytic language, encodes grammatical information in a highly ***** configurational ***** rather than morphological way. | ||
| 2020.acl-main.379 However, all neural rerankers so far have been evaluated on English and Chinese only, both languages with a ***** configurational ***** word order and poor morphology | ||
| recall | 6 | |
| I17-3013 Our method combines and improves channel and language models, resulting in high ***** recall ***** of detecting and correcting verb misuse. | ||
| R19-1043 We form two sets of question-answer pairs for FAQ and community QA search domains and use them for evaluation of the proposed indexing methodology, which delivers up to 16 percent improvement in search ***** recall *****. | ||
| L10-1115 Precision for miscellaneous names, subjects, persons and locations for the alignment with Wikipedia ranges from 0.63 to 0.94, while ***** recall ***** for subject terms is 0.62. | ||
| 2021.naacl-industry.35 Our extensive experiments on multiple languages show that these techniques detect adversarial ad categories with a substantial gain in precision at high ***** recall ***** threshold over the baseline. | ||
| L08-1136 We have measured the increase in ***** recall ***** obtained by morphological query expansion and the increase in precision and loss in ***** recall ***** produced by language-filtering-words, but not only by searching the web directly and looking at the hit counts which are not considered to be very reliable at best, but also using both a Basque web corpus and a classical lemmatised corpus, thus providing more exact quantitative results | ||
| whereas | 6 | |
| L14-1316 The Portuguese, French and Polish corpora contain read speech only, ***** whereas ***** the Hungarian corpus also contains spontaneous command and control type of speech. | ||
| D19-5726 Our first system is a BiLSTM network with two separate outputs for NER and NEN trained from scratch, ***** whereas ***** the second system is an instance of BioBERT fine-tuned on the concept-recognition task. | ||
| 2021.emnlp-main.46 Existing text classification methods mainly focus on a fixed label set, ***** whereas ***** many real-world applications require extending to new fine-grained classes as the number of samples per label increases. | ||
| C18-1264 A wide range of clustering and classification algorithms has been explored for the purpose, ***** whereas ***** possible improvements on the level of pairwise form similarity measures have not been the main focus of research. | ||
| W19-4712 In relation to themes such as sexuality and leisure, we see the bias moving toward women, ***** whereas *****, generally, the bias shifts in the direction of men, despite growing female employment number and feminist movements | ||
| anaphoricity | 6 | |
| 2020.lrec-1.137 The lexicon is one of the outcomes of the research on ***** anaphoricity ***** and long-distance relations in discourse, it contains at present anaphoric connectives (ACs) for Czech and German connectives, and further their possible translations documented in bilingual parallel corpora (not necessarily anaphoric). | ||
| P17-1009 To our knowledge, this is the first attempt to train a mention-ranking model and employ event ***** anaphoricity ***** for event coreference. | ||
| L10-1295 The corpus includes manual annotated information about morphosyntactic agreement, ***** anaphoricity *****, and semantic class of the NPs. | ||
| 2021.crac-1.16 However, we point out the difficulty in building a precise detector due to its inability to make important ***** anaphoricity ***** decisions | ||
| 2021.naacl-main.356 We propose a neural event coreference model in which event coreference is jointly trained with five tasks: trigger detection, entity coreference, ***** anaphoricity ***** determination, realis detection, and argument extraction. | ||
| incongruent | 6 | |
| 2021.econlp-1.12 Our initial findings for the US point towards the fact that more dispersed or ***** incongruent ***** monetary policy stance communication in the build up to Federal Open Market Committee (FOMC) meetings might be associated with stronger subsequent market surprises at FOMC policy announcement time. | ||
| 2020.lrec-1.85 In this corpus, the patient is mainly a listener and produces different feedbacks, some of them being (voluntary) ***** incongruent *****. | ||
| D18-1329 Our evaluation tests whether systems perform better when paired with congruent images or ***** incongruent ***** images. | ||
| 2020.wmt-1.70 Furthermore, we verified the importance of visual information during decoding by performing an adversarial evaluation of MSNMT, where we studied how models behaved with ***** incongruent ***** input modality and analyzed the effect of different word order between source and target languages. | ||
| W17-4210 This paper discusses the problem of ***** incongruent ***** headlines: those which do not accurately represent the information contained in the article with which they occur | ||
| accented | 6 | |
| L06-1193 The orthography of Gikuyu includes a number of ***** accented ***** characters to represent the entire vowel system. | ||
| 2021.emnlp-main.541 We demonstrate this on two speech adaptation tasks (atypical and ***** accented ***** speech) and for two state-of-the-art ASR architectures. | ||
| L06-1124 The first has been designed for Polish-English Literacy Tutor (PELT), a multimodal system for foreign language learning, as training input to speech recognition system for highly ***** accented *****, strongly variable second language speech. | ||
| L14-1033 In the data analysis, the acoustic-phonetic properties of words spoken with two different levels of accentuation (de-***** accented ***** and nuclear ***** accented ***** in non-contrastive narrow-focus) are examined in question-answer elicited sentences and iterative imitations (on the syllable da) produced by Bulgarian, Russian, French, German and Norwegian speakers (3 male and 3 female per language). | ||
| L10-1603 We show a statistically significant correspondence (1) between the overall market trend on the Zagreb Stock Exchange and the number of positively and negatively ***** accented ***** articles within periods of trend and (2) between the general sentiment of articles and the number of polarity phrases within those articles | ||
| gradual | 6 | |
| Q16-1003 Unlike previous work, we explicitly model language change as a smooth, ***** gradual ***** process. | ||
| 2021.emnlp-main.737 For model adaptation, we use a novel ***** gradual ***** pruning method to adapt to target speakers without changing the model architecture, which to the best of our knowledge, has never been explored in ASR. | ||
| 2021.sigdial-1.34 We aim to examine the performance of current discourse parsing models via ***** gradual ***** domain shift: within the corpus, on in-domain texts, and on out-of-domain texts, and discuss the differences between the transformer-based models and the previous models in predicting different types of implicit relations both inter- and intra-sentential. | ||
| L12-1479 Scores were computed using metrics that take into account ***** gradual ***** relevance. | ||
| R17-1050 We focus on two types of such orderings: (1) ensuring that each minibatch contains sentences similar in some aspect and (2) ***** gradual ***** inclusion of some sentence types as the training progresses (so called “curriculum learning”) | ||
| anomaly | 6 | |
| 2021.blackboxnlp-1.18 While sentence anomalies have been applied periodically for testing in NLP, we have yet to establish a picture of the precise status of ***** anomaly ***** information in representations from NLP models. | ||
| 2020.lrec-1.64 However, a more in-depth look shows some potential for using ***** anomaly ***** detection for evaluating dialogues. | ||
| P19-1398 In this paper we introduce a new ***** anomaly ***** detection method—Context Vector Data Description (CVDD)—which builds upon word embedding models to learn multiple sentence representations that capture multiple semantic contexts via the self-attention mechanism. | ||
| 2020.bionlp-1.14 In this paper, we show that machine learning-based unsupervised clustering of and ***** anomaly ***** detection with linguistic biomarkers are promising approaches for intuitive visualization and personalized early stage detection of Alzheimer's disease. | ||
| 2020.emnlp-main.385 Our findings show that output distributions that incorporate discrete latent variables and allow for multiple modes outperform simple flow-based counterparts on all datasets, yielding more accurate numerical pre-diction and ***** anomaly ***** detection | ||
| logarithmic | 6 | |
| D18-1188 In addition, we observe that there is a ***** logarithmic ***** relationship between the accuracy of a semantic parser and the amount of training data. | ||
| 2019.iwslt-1.14 On the model side, we deployed our recently proposed S-Transformer with ***** logarithmic ***** distance penalty, an ST-oriented adaptation of the Transformer architecture widely used in machine translation (MT). | ||
| D17-1156 We also investigate the amounts of in-domain training data needed for domain adaptation in NMT, and find a ***** logarithmic ***** relationship between the amount of training data and gain in BLEU score. | ||
| P17-1079 The method is based on predicting a binary code for each word and can reduce computation time/memory requirements of the output layer to be ***** logarithmic ***** in vocabulary size in the best case. | ||
| 1991.iwpt-1.14 A connectionist network is defined that parses a grammar in Chomsky Normal Form in ***** logarithmic ***** time, based on a modification of Rytter's recognition algorithm | ||
| externally | 6 | |
| 2021.isa-1.7 It is diagnosed that these changes are bound up with multi-media and multi-perspective facilities of annotation tools, in particular when considering virtual reality (VR) and augmented reality (AR) applications, their potential ubiquitous use, and the exploitation of ***** externally ***** trained natural language pre-processing methods. | ||
| 2020.coling-main.158 This enables leveraging the intrinsic knowledge existing within BERT together with ***** externally ***** introduced syntactic information, to bridge the gap across domains. | ||
| P18-1021 We propose an encoder-decoder style neural network-based argument generation model enriched with ***** externally ***** retrieved evidence from Wikipedia. | ||
| N19-1281 We propose a compact alternative to these cumbersome approaches which do not rely on any ***** externally ***** provided n-gram or word representations. | ||
| 2021.naacl-main.476 (1) Decoder state adjustment instantly modifies decoder final states with ***** externally ***** trained style scorers, to iteratively refine the output against a target style | ||
| imitation | 6 | |
| 2020.emnlp-main.446 Using simulated experiments, we demonstrate that MT model stealing is possible even when ***** imitation ***** models have different input data or architectures than their target models. | ||
| S17-1029 Humans as well as animals are good at ***** imitation *****. | ||
| D19-1619 Experiments on the benchmark datasets show that (1) ***** imitation ***** learning is constantly better than reinforcement learning; and (2) the pointer-generator models with ***** imitation ***** learning outperform the state-of-the-art methods with a large margin. | ||
| W19-3620 Learning is framed as ***** imitation ***** learning, including a coaching method which moves from imitating an oracle to reinforcing the policy's own preferences | ||
| D18-1314 We employ ***** imitation ***** learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. | ||
| semisupervised | 6 | |
| 2020.emnlp-main.238 However, previous ***** semisupervised ***** methods do not fully utilize the knowledge hidden in annotated and nonannotated data, which hinders further improvement of their performance. | ||
| 2020.wanlp-1.26 In this paper we describe our effort and simple approach on the NADI Shared Task 1 that requires us to build a system to differentiate between different 21 Arabic dialects, we introduce a deep learning ***** semisupervised ***** fashion approach along with pre-processing that was reported on NADI shared Task 1 Corpus. | ||
| 2021.emnlp-main.408 Unsupervised Data Augmentation (UDA) is a ***** semisupervised ***** technique that applies a consistency loss to penalize differences between a model's predictions on (a) observed (unlabeled) examples; and (b) corresponding `noised' examples produced via data augmentation. | ||
| R19-1035 We demonstrate, however, that a carefully developed, ***** semisupervised ***** method of optimising and extending existing tools for Classical Tibetan, as well as creating specific ones for Old Tibetan can address these issues. | ||
| 2021.acl-long.537 We perform a comprehensive empirical study to explore different summarization techniques (including extractive and abstractive methods, single-document and hierarchical models, as well as transfer and ***** semisupervised ***** learning) and conduct human evaluations on both short and long summary generation tasks | ||
| replaced | 6 | |
| 2021.wanlp-1.20 Our model is pretrained using the ***** replaced ***** token detection objective on large Arabic text corpora. | ||
| 2020.emnlp-main.208 CSP adopts the encoder-decoder framework: its encoder takes the code-mixed sentence as input, and its decoder predicts the ***** replaced ***** fragment of the input sentence. | ||
| 2021.emnlp-main.430 For Named Entity Recognition (NER), existing approaches augment the input sequence with token replacement, assuming annotations on the ***** replaced ***** positions unchanged. | ||
| 2020.findings-emnlp.139 We develop CodeBERT with Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of ***** replaced ***** token detection, which is to detect plausible alternatives sampled from generators. | ||
| W16-4605 We further show results of manual analysis on the ***** replaced ***** unknown words | ||
| conjunction | 6 | |
| R19-1111 All machine translation systems almost perfectly recognise one variant of the target ***** conjunction *****, especially for the source ***** conjunction ***** “but”. | ||
| 2021.semspace-1.7 We provide an account of ***** conjunction ***** and an interpretation for the word `and' that solves this, and moreover ensures certain intuitively similar sentences can be given the same interpretations. | ||
| W19-5353 Qualitative manual inspection of translation hypotheses shown that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems are not only capable of capturing the right ***** conjunction ***** whereas the rest of the translation hypothesis is poor. | ||
| 2021.eacl-main.67 In this paper, we address the representation of coordinate constructions in Enhanced Universal Dependencies (UD), where relevant dependency links are propagated from ***** conjunction ***** heads to other conjuncts | ||
| 2020.lr4sshoc-1.3 The ELAN files form a very heterogeneous set, but the hierarchical configuration of their tiers allow, in *****conjunction***** with the tier content, to identify transcriptions, translations, and glosses. | ||
| trustworthiness | 6 | |
| 2020.pam-1.9 Judgements about communicative agents evolve over the course of interactions both in how individuals are judged for testimonial reliability and for (ideological) ***** trustworthiness *****. | ||
| W18-5209 It develops a typology of kinds of relevant evidence (argument premises) employed in cases, and it identifies factors that the tribunal considers when assessing the credibility or ***** trustworthiness ***** of individual items of evidence. | ||
| 2021.acl-long.94 Interpretability is an important aspect of the ***** trustworthiness ***** of a model's predictions. | ||
| 2021.mrqa-1.3 One likely reason for this is that clinicians may not readily trust QA system outputs, in part because transparency, ***** trustworthiness *****, and provenance have not been key considerations in the design of such models. | ||
| D18-1003 It presents a neural network model that judiciously aggregates signals from external evidence articles, the language of these articles and the ***** trustworthiness ***** of their sources | ||
| explorative | 6 | |
| 2020.lrec-1.70 This paper describes data collection and the first ***** explorative ***** research on the AICO Multimodal Corpus. | ||
| L10-1131 The general conclusion of our ***** explorative ***** study is that a multidimensional dialogue act taxonomy is usable for this purpose when some adjustments are made. | ||
| W89-0215 On the basis of our ***** explorative ***** research we have planned a number of small-scale implementations in the near future. | ||
| L08-1574 We present the machine learning framework that we are developing, in order to support ***** explorative ***** search for non-trivial linguistic configurations in low-density languages (languages with no or few NLP tools). | ||
| W16-4013 This enables ***** explorative ***** queries of the quantitative aspects of a corpus with geographical features | ||
| contextual semantic | 6 | |
| D19-1408 Specifically, we propose to learn low-rank sentence embeddings by tensor decomposition to capture their ***** contextual semantic ***** similarity, and use K-nearest neighbors (KNNs) of each sentence in the embedding space to generate sample clusters. | ||
| 2020.coling-main.8 We further investigate effects of four attention variants in generating ***** contextual semantic ***** representations. | ||
| 2020.emnlp-main.582 Language representation models such as BERT could effectively capture ***** contextual semantic ***** information from plain text, and have been proved to achieve promising results in lots of downstream NLP tasks with appropriate fine-tuning. | ||
| 2021.emnlp-main.206 As far as we know, existing neural-based ED models make decisions relying entirely on the ***** contextual semantic ***** features of each word in the inputted text, which we find is easy to be confused by the varied contexts in the test stage. | ||
| 2020.coling-main.14 First, the aspect enhancement module in METNet improves the representation learning of the aspect with ***** contextual semantic ***** features, which gives the aspect more abundant information | ||
| scanned | 6 | |
| 2020.wildre-1.3 OdiEnCorp 2.0 includes existing English-Odia corpora and we extended the collection by several other methods of data acquisition: parallel data scraping from many websites, including Odia Wikipedia, but also optical character recognition (OCR) to extract parallel data from ***** scanned ***** images. | ||
| 2021.acl-long.493 In this paper, we propose a new pre-training approach, StructuralLM, to jointly leverage cell and layout information from ***** scanned ***** documents. | ||
| W18-4502 The dictionary encompasses comprehensive cross-referencing mechanisms, including linking entries to an online ***** scanned ***** edition of Crum's Coptic Dictionary, internal cross-references and etymological information, translated searchable definitions in English, French and German, and linked corpus data which provides frequencies and corpus look-up for headwords and multiword expressions. | ||
| 2020.lrec-1.351 Here we demonstrate fundamental viability for a technology that can assist in making a large number of linguistic data sources machine readable: the automated identification and parsing of interlinear glossed text from ***** scanned ***** page images. | ||
| 2021.nllp-1.18 Older legal texts are often ***** scanned ***** and digitized via Optical Character Recognition (OCR), which results in numerous errors | ||
| compatible | 6 | |
| N19-1085 In addition, we adopt an interactive inference network based model to better capture the ***** compatible ***** and in*****compatible***** relations between the context words of event mentions. | ||
| L12-1478 Sometimes it requires programming skill in addition to the expert knowledge to make the resources ***** compatible ***** and interoperable when the resources are not created so. | ||
| L16-1565 The treebank is dynamic: by global reparsing at certain intervals it is kept ***** compatible ***** with the latest versions of the grammar and the lexicon, which are continually further developed in interaction with the annotators. | ||
| 2021.emnlp-main.807 For this approach to be effective, the model should form ***** compatible ***** conditional distributions when making predictions on incomplete subsets of the context | ||
| L10-1236 Each ECA entry is mapped to its MSA synonym, Part-of-Speech (POS) tag and top-ranked contexts based on Web queries; and thus each entry is provided with basic syntactic and semantic information for a generic lexicon *****compatible***** with multiple NLP applications. | ||
| solves | 6 | |
| 2021.smm4h-1.11 The winning system is based on a transformer-based pretrained language model and ***** solves ***** the two sub-tasks simultaneously. | ||
| 2021.emnlp-main.239 We propose a reference-less GEC evaluation system that is strongly correlated with human judgement, ***** solves ***** the issues related to the use of a reference, and does not need another annotated dataset for fine-tuning. | ||
| L08-1535 Experimental results show that the use of goshu information considerably improves the performance of heteronym disambiguation and lemma identification, suggesting that goshu information ***** solves ***** the lemma identification task very effectively. | ||
| P18-1092 Moreover, a simple ensemble of two of our models ***** solves ***** all 20 tasks in the joint version of the benchmark. | ||
| 2020.findings-emnlp.97 To achieve this, our model ***** solves ***** the task based on each rationale individually and learns to assign high scores to those which solved the task best | ||
| disjunctive | 6 | |
| 1993.iwpt-1.17 Parsing with a large systemic grammar brings one face-to-face with the problem of unification with ***** disjunctive ***** descriptions. | ||
| L06-1442 The authors show that since reduplication in the Northern Sotho language does not affect the pre-processing tokeniser, the ***** disjunctive ***** standard verbal segment as a construct in Northern Sotho is deterministic, finite-state and a regular Type 0 language in the Chomsky hierarchy and that the copulative verbal segment, due to its semi-disjunctivism, is ambiguously non-deterministic. | ||
| 2020.rail-1.4 The ***** disjunctive ***** style of writing poses a challenge when a sentence is tokenized or when tagging. | ||
| 1995.iwpt-1.9 In order to allow for applications with parallel search, incremental backtracking can be localized to ***** disjunctive ***** choice points within the description of a single structure, thus supporting the kind of conditional mutual consistency checks used in modern grammatical theories such as HPSG, GB, and LFG. | ||
| 2020.sdp-1.8 Here, we introduce a new way of learning blocking schemes by using a conjunctive normal form (CNF) in contrast to the ***** disjunctive ***** normal form (DNF) | ||
| desirable | 6 | |
| 2020.sdp-1.12 However, due to their large model size and resulting increased computational need, practical application of models such as BERT is challenging making smaller models with comparable performance ***** desirable ***** for real word applications. | ||
| 2021.emnlp-main.507 These parsers are simple and avoid explicit modeling of structure but lack ***** desirable ***** properties such as graph well-formedness guarantees or built-in graph-sentence alignments. | ||
| 2020.acl-main.64 USR additionally produces interpretable measures for several ***** desirable ***** properties of dialog. | ||
| 2021.repl4nlp-1.24 Specially, neural semantic parsers (NSPs) effectively translate natural questions to logical forms, which execute on KB and give ***** desirable ***** answers. | ||
| D19-1542 Furthermore, it achieves larger performance gains on tasks with limited training datasets for fine-tuning, which is a property ***** desirable ***** for transfer learning | ||
| discussed | 6 | |
| L10-1491 Outcomes of the STEVIN scientific midterm review are shortly ***** discussed ***** as the overall final evaluation is currently still on-going. | ||
| 2020.nl4xai-1.9 We then illustrate the ***** discussed ***** issues and potential ways of addressing them using a simple demo system's output generated from a propositional logic formula. | ||
| 2021.argmining-1.17 In this paper, we introduce a promising model, named Matching the Statements (MTS) that incorporates the ***** discussed ***** topic information into arguments/key points comprehension to fully understand their meanings, thus accurately performing ranking and retrieving best-match key points for an input argument. | ||
| 2020.nlpmc-1.9 Conversation is a complex cognitive task that engages multiple aspects of cognitive functions to remember the ***** discussed ***** topics, monitor the semantic and linguistic elements, and recognize others' emotions. | ||
| 2021.eacl-main.147 In this paper, we study claim quality assessment irrespective of ***** discussed ***** aspects by comparing different revisions of the same claim | ||
| multilingual embedding | 6 | |
| 2021.nodalida-main.7 While statistical word aligners can work well, especially when parallel training data are plentiful, ***** multilingual embedding ***** models have recently been shown to give good results in unsupervised scenarios. | ||
| 2021.emnlp-main.126 Especially, learning alignments in the ***** multilingual embedding ***** space usually requires sentence-level or word-level parallel corpora, which are expensive to be obtained for low-resource languages. | ||
| 2021.emnlp-main.612 Experimental results on both quality estimation of machine translation and cross-lingual semantic textual similarity tasks reveal that our method consistently outperforms the strong baselines using the original ***** multilingual embedding *****. | ||
| N19-1188 We learn a shared ***** multilingual embedding ***** space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space. | ||
| 2021.emnlp-main.470 We first evaluate the LIR on a cross-lingual question answer retrieval task (LAReQA), which requires the strong alignment for the ***** multilingual embedding ***** space | ||
| NLPContributionGraph | 6 | |
| 2021.semeval-1.44 The SemEval-2021 Shared Task ***** NLPContributionGraph ***** (a.k.a. | ||
| 2021.semeval-1.58 This paper describes the system we built as the YNU-HPCC team in the SemEval-2021 Task 11: ***** NLPContributionGraph *****. | ||
| 2021.semeval-1.59 This paper describes the winning system in the End-to-end Pipeline phase for the ***** NLPContributionGraph ***** task. | ||
| 2021.semeval-1.61 With the ***** NLPContributionGraph ***** Shared Task, the organizers formalized the building of a scholarly contributions-focused graph over NLP scholarly articles as an automated task. | ||
| 2021.semeval-1.57 In this paper, we address this challenge via the SemEval 2021 Task 11: ***** NLPContributionGraph *****, by developing a system for a research paper contributions-focused knowledge graph over Natural Language Processing literature | ||
| executable SQL | 6 | |
| 2020.findings-emnlp.79 Specifically, the neural agent first learns to ask and confirm the customer's intent during the multi-turn interactions, then dynamically determining when to ground the user constraints into ***** executable SQL ***** queries so as to fetch relevant information from KBs. | ||
| 2020.lrec-1.714 In this paper, we introduce a new dataset that includes the first-of-its-kind eligibility-criteria corpus and the corresponding queries for criteria-to-sql (Criteria2SQL), a task translating the eligibility criteria to ***** executable SQL ***** queries. | ||
| 2020.emnlp-main.563 To solve these problems, this paper proposes a novel extraction-linking approach, where a unified extractor recognizes all types of slot mentions appearing in the question sentence before a linker maps the recognized columns to the table schema to generate ***** executable SQL ***** queries. | ||
| 2020.findings-emnlp.201 This paper is aimed to develop a service-oriented Text-to-SQL parser that translates natural language utterance to structural and ***** executable SQL ***** query. | ||
| 2020.acl-main.742 We study the task of cross-database semantic parsing (XSP), where a system that maps natural language utterances to ***** executable SQL ***** queries is evaluated on databases unseen during training | ||
| Dialogue state tracking (DST | 6 | |
| 2020.findings-emnlp.142 *****Dialogue state tracking (DST*****) is an important part of a spoken dialogue system. | ||
| 2020.findings-emnlp.68 *****Dialogue state tracking (DST*****) aims at estimating the current dialogue state given all the preceding conversation. | ||
| W18-5022 *****Dialogue state tracking (DST*****), when formulated as a supervised learning problem, relies on labelled data. | ||
| 2020.nlp4convai-1.10 *****Dialogue state tracking (DST*****) is at the heart of task-oriented dialogue systems. | ||
| 2021.emnlp-main.176 *****Dialogue state tracking (DST*****), which estimates user goals given a dialogue context, is an essential component of task-oriented dialogue systems. | ||
| Digital | 6 | |
| L14-1256 Computational Narratology is an emerging field within the *****Digital***** Humanities. | ||
| L14-1164 *****Digital***** libraries are frequently treated just as a new method of storage of digitized artifacts, with all consequences of transferring long-established ways of dealing with physical objects into the digital world. | ||
| D19-5006 *****Digital***** media enables not only fast sharing of information, but also disinformation. | ||
| N19-1231 Scholars in inter-disciplinary fields like the *****Digital***** Humanities are increasingly interested in semantic annotation of specialized corpora. | ||
| 2020.latechclfl-1.3 Entity recognition provides semantic access to ancient materials in the *****Digital***** Humanities: it exposes people and places of interest in texts that cannot be read exhaustively, facilitates linking resources and can provide a window into text contents, even for texts with no translations. | ||
| natural language understanding (NLU | 6 | |
| 2020.acl-main.63 In modular dialogue systems, *****natural language understanding (NLU*****) and natural language generation (NLG) are two critical components, where NLU extracts the semantics from the given texts and NLG is to construct corresponding natural language sentences based on the input semantic representations. | ||
| 2021.mrl-1.13 An exciting frontier in *****natural language understanding (NLU*****) and generation (NLG) calls for (vision-and-)language models that can efficiently access external structured knowledge repositories. | ||
| 2020.findings-emnlp.39 Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in *****natural language understanding (NLU*****). | ||
| 2021.naacl-main.212 Domain classification is the fundamental task in *****natural language understanding (NLU*****), which often requires fast accommodation to new emerging domains. | ||
| 2020.acl-main.559 While *****natural language understanding (NLU*****) is advancing rapidly, today's technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency, interpretability, and generalization. | ||
| Abstract Meaning Representations (AMRs | 6 | |
| W19-3303 In this paper, we propose an extension to *****Abstract Meaning Representations (AMRs*****) to encode scope information of quantifiers and negation, in a way that overcomes the semantic gaps of the schema while maintaining its cognitive simplicity. | ||
| L14-1332 *****Abstract Meaning Representations (AMRs*****) are rooted, directional and labeled graphs that abstract away from morpho-syntactic idiosyncrasies such as word category (verbs and nouns), word order, and function words (determiners, some prepositions). | ||
| 2020.acl-main.167 *****Abstract Meaning Representations (AMRs*****) are broad-coverage sentence-level semantic graphs. | ||
| 2020.aacl-main.27 Structured semantic sentence representations such as *****Abstract Meaning Representations (AMRs*****) are potentially useful in various NLP tasks. | ||
| 2020.acl-main.397 *****Abstract Meaning Representations (AMRs*****) capture sentence-level semantics structural representations to broad-coverage natural sentences. | ||
| Information Extraction | 6 | |
| D19-1030 The identification of complex semantic structures such as events and entity relations, already a challenging *****Information Extraction***** task, is doubly difficult from sources written in under-resourced and under-annotated languages. | ||
| 2020.lrec-1.528 The task of Entity linking, which aims at associating an entity mention with a unique entity in a knowledge base (KB), is useful for advanced *****Information Extraction***** tasks such as relation extraction or event detection. | ||
| R17-1019 Shallow text analysis (Text Mining) uses mainly *****Information Extraction***** techniques. | ||
| 2020.louhi-1.9 Detecting negation and speculation in language has been a task of considerable interest to the biomedical community, as it is a key component of *****Information Extraction***** systems from Biomedical documents. | ||
| 2021.eacl-demos.33 It is standard procedure these days to solve *****Information Extraction***** task by fine-tuning large pre-trained language models. | ||
| pre-trained language models (LMs | 6 | |
| 2021.emnlp-main.269 Rumor detection on social media puts *****pre-trained language models (LMs*****), such as BERT, and auxiliary features, such as comments, into use. | ||
| 2021.naacl-main.45 The problem of answering questions using knowledge from *****pre-trained language models (LMs*****) and knowledge graphs (KGs) presents two challenges: given a QA context (question and answer choice), methods need to (i) identify relevant knowledge from large KGs, and (ii) perform joint reasoning over the QA context and KG. | ||
| D18-1153 Many efforts have been made to facilitate natural language processing tasks with *****pre-trained language models (LMs*****), and brought significant improvements to various applications. | ||
| 2021.conll-1.29 As *****pre-trained language models (LMs*****) continue to dominate NLP, it is increasingly important that we understand the depth of language capabilities in these models. | ||
| 2021.emnlp-main.407 Recently, *****pre-trained language models (LMs*****) have achieved strong performance when fine-tuned on difficult benchmarks like SuperGLUE. | ||
| META- | 6 | |
| L14-1350 This article provides an overview of the dissemination work carried out in *****META-*****NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics. | ||
| 2020.ngt-1.5 We present *****META-*****MT, a meta-learning approach to adapt Neural Machine Translation (NMT) systems in a few-shot setting. | ||
| L16-1251 *****META-*****NET is a European network of excellence, founded in 2010, that consists of 60 research centres in 34 European countries. | ||
| L12-1435 The *****META-*****NORD project has contributed to an open infrastructure for language resources (data and tools) under the META-NET umbrella. | ||
| L12-1483 We describe *****META-*****SHARE which aims at providing an open, distributed, secure, and interoperable infrastructure for the exchange of language resources, including both data and tools. | ||
| out-of-vocabulary | 6 | |
| 2010.amta-government.5 Enabling improved translations by MSA-trained MT systems through decreases in *****out-of-vocabulary***** terms achieved by means of colloquial term conversion to MSA. | ||
| N19-1280 Character-level models of tokens have been shown to be effective at dealing with within-token noise and *****out-of-vocabulary***** words. | ||
| W18-1205 Subword-level information is crucial for capturing the meaning and morphology of words, especially for *****out-of-vocabulary***** entries. | ||
| R19-1090 Building representative linguistic resources and NLP tools for non-standardized languages is challenging: when spelling is not determined by a norm, multiple written forms can be encountered for a given word, inducing a large proportion of *****out-of-vocabulary***** words. | ||
| 2021.wnut-1.14 Large-scale language models such as ELMo and BERT have pushed the horizon of what is possible in semantic role labeling (SRL), solving the *****out-of-vocabulary***** problem and enabling end-to-end systems, but they have also introduced significant biases. | ||
| information extraction (IE | 6 | |
| 2021.naacl-main.3 Existing works on *****information extraction (IE*****) have mainly solved the four main tasks separately (entity mention recognition, relation extraction, event trigger detection, and argument extraction), thus failing to benefit from inter-dependencies between tasks. | ||
| R19-1033 In this paper, we report on the extrinsic evaluation of an automatic sentence simplification method with respect to two NLP tasks: semantic role labelling (SRL) and *****information extraction (IE*****). | ||
| P19-1333 We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or *****information extraction (IE*****). | ||
| W19-1505 Generating a large amount of training data for *****information extraction (IE*****) is either costly (if annotations are created manually), or runs the risk of introducing noisy instances (if distant supervision is used). | ||
| 2021.acl-long.488 Compared to the general news domain, *****information extraction (IE*****) from biomedical text requires much broader domain knowledge. | ||
| hand-crafted | 6 | |
| C18-1161 Neural network approaches to Named-Entity Recognition reduce the need for carefully *****hand-crafted***** features. | ||
| D18-1310 Conventional wisdom is that *****hand-crafted***** features are redundant for deep learning models, as they already learn adequate representations of text automatically from corpora. | ||
| P19-1524 Most of the recently proposed neural models for named entity recognition have been purely data-driven, with a strong emphasis on getting rid of the efforts for collecting external resources or designing *****hand-crafted***** features. | ||
| P18-2065 Conventional Open Information Extraction (Open IE) systems are usually built on *****hand-crafted***** patterns from other NLP tools such as syntactic parsing, yet they face problems of error propagation. | ||
| L08-1246 Motivated by the expense in time and other resources to produce *****hand-crafted***** grammars, there has been increased interest in automatically obtained wide-coverage grammars from treebanks for natural language processing. | ||
| machine-readable | 6 | |
| L06-1362 Lexical information for South African Bantu languages is not readily available in the form of *****machine-readable***** lexicons. | ||
| L10-1502 Automatically translating natural language into *****machine-readable***** instructions is one of major interesting and challenging tasks in Natural Language (NL) Processing. | ||
| N18-2016 We describe an effort to annotate a corpus of natural language instructions consisting of 622 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a *****machine-readable***** format and benefit biological research. | ||
| 2020.lrec-1.401 In this paper, we report the release of the ACoLi Dictionary Graph, a large-scale collection of multilingual open source dictionaries available in two *****machine-readable***** formats, a graph representation in RDF, using the OntoLex-Lemon vocabulary, and a simple tabular data format to facilitate their use in NLP tasks, such as translation inference across dictionaries. | ||
| W16-5324 Although quantifiers/classifiers expressions occur frequently in everyday communications or written documents, there is no description for them in classical bilingual paper dictionaries, nor in *****machine-readable***** dictionaries. | ||
| instruction | 6 | |
| D18-1287 We propose to decompose *****instruction***** execution to goal prediction and action generation. | ||
| D17-1106 We propose to directly map raw visual observations and text input to actions for *****instruction***** execution. | ||
| P18-5006 Semantic parsing, the study of translating natural language utterances into machine-executable programs, is a well-established research area and has applications in question answering, *****instruction***** following, voice assistants, and code generation. | ||
| 2021.acl-long.382 Sequence-to-sequence transduction is the core problem in language processing applications as diverse as semantic parsing, machine translation, and *****instruction***** following. | ||
| 2021.naacl-main.81 Standard architectures used in *****instruction***** following often struggle on novel compositions of subgoals (e.g. | ||
| convolutional neural networks (CNNs | 6 | |
| D18-1109 We introduce a class of *****convolutional neural networks (CNNs*****) that utilize recurrent neural networks (RNNs) as convolution filters. | ||
| W16-4824 In this paper, we describe a system (CGLI) for discriminating similar languages, varieties and dialects using *****convolutional neural networks (CNNs*****) and long short-term memory (LSTM) neural networks. | ||
| I17-2001 This paper proposes a new attention mechanism for neural machine translation (NMT) based on *****convolutional neural networks (CNNs*****), which is inspired by the CKY algorithm. | ||
| Q18-1047 In NLP, *****convolutional neural networks (CNNs*****) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. | ||
| 2020.sdp-1.10 We present DeepPaperComposer, a simple solution for preparing highly accurate (100%) training data without manual labeling to extract content from scholarly articles using *****convolutional neural networks (CNNs*****). | ||
| Semantic Role Labeling (SRL | 6 | |
| 2021.naacl-main.31 While cross-lingual techniques are finding increasing success in a wide range of Natural Language Processing tasks, their application to *****Semantic Role Labeling (SRL*****) has been strongly limited by the fact that each language adopts its own linguistic formalism, from PropBank for English to AnCora for Spanish and PDT-Vallex for Czech, inter alia. | ||
| 2020.emnlp-demos.11 *****Semantic Role Labeling (SRL*****) is deeply dependent on complex linguistic resources and sophisticated neural models, which makes the task difficult to approach for non-experts. | ||
| W18-3027 We explore a novel approach for *****Semantic Role Labeling (SRL*****) by casting it as a sequence-to-sequence process. | ||
| 2020.findings-emnlp.38 Resources for *****Semantic Role Labeling (SRL*****) are typically annotated by experts at great expense. | ||
| D18-1538 Neural models have shown several state-of-the-art performances on *****Semantic Role Labeling (SRL*****). | ||
| our | 6 | |
| W16-5207 We collected a speech corpus over fifteen hours from about fifty Vietnamese native speakers and using it to test the feasibility of *****our***** setup. | ||
| 2021.bionlp-1.25 Monitoring the safe use of medication drugs is an important task of pharmacovigilance, and first-hand experience of effects about consumers' medication intake can be valuable to gain insight into how *****our***** human body reacts to medications. | ||
| D18-1201 On the local level, individual relations between synsets (semantic building blocks) such as hypernymy and meronymy enhance *****our***** understanding of the words used to express their meanings. | ||
| 2020.lrec-1.268 We describe the creation of such a multidisciplinary corpus and highlight the obtained findings in terms of the following features: 1) a generic conceptual formalism for scientific entities in a multidisciplinary scientific context; 2) the feasibility of the domain-independent human annotation of scientific entities under such a generic formalism; 3) a performance benchmark obtainable for automatic extraction of multidisciplinary scientific entities using BERT-based neural models; 4) a delineated 3-step entity resolution procedure for human annotation of the scientific entities via encyclopedic entity linking and lexicographic word sense disambiguation; and 5) human evaluations of Babelfy returned encyclopedic links and lexicographic senses for *****our***** entities. | ||
| 2020.lrec-1.500 Aiming at a high-precision, fine-grained, configurable, and non-biased system for practical use cases, we have designed a pipeline method that makes the most of syntactic structures based on Universal Dependencies, avoiding machine-learning approaches that may cause obstacles to *****our***** purposes. | ||
| Video | 6 | |
| 2021.emnlp-main.773 *****Video***** grounding aims to localize the temporal segment corresponding to a sentence query from an untrimmed video. | ||
| D18-1117 *****Video***** content on social media platforms constitutes a major part of the communication between people, as it allows everyone to share their stories. | ||
| Q18-1013 *****Video***** captioning has attracted an increasing amount of interest, due in part to its potential for improved accessibility and information retrieval. | ||
| P17-1117 *****Video***** captioning, the task of describing the content of a video, has seen some promising improvements in recent years with sequence-to-sequence models, but accurately learning the temporal and logical dynamics involved in the task still remains a challenge, especially given the lack of sufficient annotated data. | ||
| D19-1217 *****Video***** dialog is a new and challenging task, which requires the agent to answer questions combining video information with dialog history. | ||
| human-written | 6 | |
| 2021.emnlp-main.50 The impressive capabilities of recent generative models to create texts that are challenging to distinguish from the *****human-written***** ones can be misused for generating fake news, product reviews, and even abusive content. | ||
| 2020.lrec-1.827 Large state-of-the-art corpora for training neural networks to create abstractive summaries are mostly limited to the news genre, as it is expensive to acquire *****human-written***** summaries for other types of text at a large scale. | ||
| 2021.ranlp-1.18 This paper presents a global summarization method for live sport commentaries for which we have a *****human-written***** summary available. | ||
| P19-1191 We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of *****human-written***** papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper. | ||
| 2021.newsum-1.12 Dialogue summarization is a long-standing task in the field of NLP, and several data sets with dialogues and associated *****human-written***** summaries of different styles exist. | ||
| bridging anaphora | 6 | |
| 2020.coling-main.537 Previous work on *****bridging anaphora***** recognition (Hou et al., 2013) casts the problem as a subtask of learning fine-grained information status (IS). | ||
| 2020.acl-main.132 Most previous studies on *****bridging anaphora***** resolution (Poesio et al., 2004; Hou et al., 2013b; Hou, 2018a) use the pairwise model to tackle the problem and assume that the gold mention information is given. | ||
| 2020.coling-main.114 The meaning of natural language text is supported by cohesion among various kinds of entities, including coreference relations, predicate-argument structures, and *****bridging anaphora***** relations. | ||
| W18-0703 We present two systems for bridging resolution, which we submitted to the CRAC shared task on *****bridging anaphora***** resolution in the ARRAU corpus (track 2): a rule-based approach following Hou et al. | ||
| 2021.naacl-main.131 While Yu and Poesio (2020) have recently demonstrated the superiority of their neural multi-task learning (MTL) model to rule-based approaches for *****bridging anaphora***** resolution, there is little understanding of (1) how it is better than the rule-based approaches (e.g., are the two approaches making similar or complementary mistakes?) | ||
| Machine Learning | 6 | |
| L12-1592 We describe the Shared Task on Applying *****Machine Learning***** Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT) which aims to foster research on improved system combination approaches for machine translation (MT). | ||
| S19-2045 Existing *****Machine Learning***** techniques yield close to human performance on text-based classification tasks. | ||
| 2020.codi-1.12 First, we discuss the most common linguistic perspectives on the concept of recency and propose a taxonomy of recency metrics employed in *****Machine Learning***** studies for choosing the form of referring expressions in discourse context. | ||
| L14-1228 In this paper, we describe how a Constraint Grammar with linguist-written rules can be optimized and ported to another language using a *****Machine Learning***** technique. | ||
| 2020.lrec-1.164 Film age appropriateness classification is an important problem with a significant societal impact that has so far been out of the interest of Natural Language Processing and *****Machine Learning***** researchers. | ||
| question- | 6 | |
| C16-1260 Several tasks in argumentation mining and debating, *****question-*****answering, and natural language inference involve classifying a sequence in the context of another sequence (referred as bi-sequence classification). | ||
| D19-6609 Recent Deep Learning (DL) models have succeeded in achieving human-level accuracy on various natural language tasks such as *****question-*****answering, natural language inference (NLI), and textual entailment. | ||
| 2021.naacl-tutorials.2 Deep neural networks have constantly pushed the state-of-the-art performance in natural language processing and are considered as the de-facto modeling approach in solving complex NLP tasks such as machine translation, summarization and *****question-*****answering. | ||
| 2021.eacl-main.177 Conversational systems enable numerous valuable applications, and *****question-*****answering is an important component underlying many of these. | ||
| D18-1424 Question generation, the task of automatically creating questions that can be answered by a certain span of text within a given passage, is important for *****question-*****answering and conversational systems in digital assistants such as Alexa, Cortana, Google Assistant and Siri. | ||
| spoken language understanding (SLU | 6 | |
| N19-1372 To capture salient contextual information for *****spoken language understanding (SLU*****) of a dialogue, we propose time-aware models that automatically learn the latent time-decay function of the history without a manual time-decay function. | ||
| 2020.coling-main.310 Recently, *****spoken language understanding (SLU*****) has attracted extensive research interests, and various SLU datasets have been proposed to promote the development. | ||
| 2021.emnlp-main.259 Lack of training data presents a grand challenge to scaling out *****spoken language understanding (SLU*****) to low-resource languages. | ||
| D19-1126 Semantic slot filling is one of the major tasks in *****spoken language understanding (SLU*****). | ||
| 2021.blackboxnlp-1.25 Language Models (LMs) have been ubiquitously leveraged in various tasks including *****spoken language understanding (SLU*****). | ||
| spoken language understanding (SLU) | 6 | |
| P19-1541 Dialogue contexts are proven helpful in the *****spoken language understanding (SLU)***** system and they are typically encoded with explicit memory representations. | ||
| D19-1214 Intent detection and slot filling are two main tasks for building a *****spoken language understanding (SLU)***** system. | ||
| 2020.coling-industry.11 Currently, in *****spoken language understanding (SLU)***** systems, the automatic speech recognition (ASR) module produces multiple interpretations (or hypotheses) for the input audio signal and the natural language understanding (NLU) module takes the one with the highest confidence score for domain or intent classification. | ||
| 2020.emnlp-main.152 Slot filling and intent detection are two main tasks in *****spoken language understanding (SLU)***** system. | ||
| C18-1305 To deploy a *****spoken language understanding (SLU)***** model to a new language, language transferring is desired to avoid the trouble of acquiring and labeling a new big SLU corpus. | ||
| parallel sentence | 6 | |
| W17-2508 This article presents the STACCw system for the BUCC 2017 shared task on *****parallel sentence***** extraction from comparable corpora. | ||
| N18-1136 Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in *****parallel sentence***** pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. | ||
| 2021.mtsummit-research.21 Word alignments identify translational correspondences between words in a *****parallel sentence***** pair and are used, for example, to train statistical machine translation, learn bilingual dictionaries, or to perform quality estimation. | ||
| W17-2512 This paper presents the BUCC 2017 shared task on *****parallel sentence***** extraction from comparable corpora. | ||
| 2019.iwslt-1.19 Word alignments identify translational correspondences between words in a *****parallel sentence***** pair and is used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. | ||
| Distantly supervised relation | 6 | |
| D18-1247 *****Distantly supervised relation***** extraction employs existing knowledge graphs to automatically collect training data. | ||
| P19-1134 *****Distantly supervised relation***** extraction is widely used to extract relational facts from text, but suffers from noisy labels. | ||
| 2020.coling-main.562 *****Distantly supervised relation***** extraction has been widely applied in knowledge base construction due to its less requirement of human efforts. | ||
| 2021.emnlp-main.15 *****Distantly supervised relation***** extraction is widely used in the construction of knowledge bases due to its high efficiency. | ||
| D17-1186 *****Distantly supervised relation***** extraction has been widely used to find novel relational facts from plain text. | ||
| Abstract Meaning | 6 | |
| P17-1043 We present a system which parses sentences into *****Abstract Meaning***** Representations, improving state-of-the-art results for this task by more than 5%. | ||
| 2021.starsem-1.19 AMR (*****Abstract Meaning***** Representation) and EDS (Elementary Dependency Structures) are two popular meaning representations in NLP/NLU. | ||
| P18-1170 We present a semantic parser for *****Abstract Meaning***** Representations which learns to parse strings into tree representations of the compositional structure of an AMR graph. | ||
| S18-2006 This paper describes CCG/AMR, a novel grammar for semantic parsing of *****Abstract Meaning***** Representations. | ||
| 2020.dmr-1.3 To explore the potential sembanking in Korean and ways to represent the meaning of Korean sentences, this paper reports on the process of applying *****Abstract Meaning***** Representation to Korean, a semantic representation framework that has been studied in wide range of languages, and its output: the Korean AMR corpus. | ||
| Machine translation (MT | 6 | |
| 2021.naacl-main.252 *****Machine translation (MT*****) is currently evaluated in one of two ways: in a monolingual fashion, by comparison with the system output to one or more human reference translations, or in a trained crosslingual fashion, by building a supervised model to predict quality scores from human-labeled data. | ||
| 2020.wat-1.11 *****Machine translation (MT*****) focuses on the automatic translation of text from one natural language to another natural language. | ||
| 2020.acl-main.359 *****Machine translation (MT*****) has benefited from using synthetic training data originating from translating monolingual corpora, a technique known as backtranslation. | ||
| Q13-1014 *****Machine translation (MT*****) draws from several different disciplines, making it a complex subject to teach. | ||
| 2020.eamt-1.23 *****Machine translation (MT*****) has been shown to produce a number of errors that require human post-editing, but the extent to which professional human translation (HT) contains such errors has not yet been compared to MT. | ||
| Persian | 6 | |
| 2016.gwc-1.53 This paper discusses the semantic augmentation of FarsNet - the *****Persian***** WordNet - with new relations and structures for verbs. | ||
| L06-1014 In this paper building statistical language models for *****Persian***** language using a corpus and incorporating them in Persian continuous speech recognition (CSR) system are described. | ||
| L10-1486 We introduce PerLex, a large-coverage and freely-available morphological lexicon for the *****Persian***** language. | ||
| 2021.ranlp-1.106 This paper evaluates normalization procedures of *****Persian***** text for a downstream NLP task - multiword expressions (MWEs) discovery. | ||
| 2021.ranlp-1.105 This paper presents an attempt at multiword expressions (MWEs) discovery in the *****Persian***** language. | ||
| type | 6 | |
| N19-1084 Existing entity typing systems usually exploit the *****type***** hierarchy provided by knowledge base (KB) schema to model label correlations and thus improve the overall performance. | ||
| P19-1196 A *****type***** description is a succinct noun compound which helps human and machines to quickly grasp the informative and distinctive information of an entity. | ||
| 2020.coling-main.519 As an audio format, podcasts are more varied in style and production *****type***** than broadcast news, contain more genres than typically studied in video data, and are more varied in style and format than previous corpora of conversations. | ||
| 2020.findings-emnlp.105 Some traditional KGE models leveraging additional type information can improve the representation of entities which however totally rely on the explicit types or neglect the diverse *****type***** representations specific to various relations. | ||
| L06-1261 This paper describes FreP, a new electronic tool that provides frequency counts of phonological units at the word-level and below from Portuguese written text: namely, major classes of segments, syllables and syllable types, phonological clitics, clitic types and size, prosodic words and their shape, word stress location, and syllable *****type***** by position within the word and/or status relative to word stress. | ||
| Recurrent neural networks (RNNs | 6 | |
| N18-1108 *****Recurrent neural networks (RNNs*****) achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. | ||
| W18-5418 *****Recurrent neural networks (RNNs*****) are temporal networks and cumulative in nature that have shown promising results in various natural language processing tasks. | ||
| P17-1030 *****Recurrent neural networks (RNNs*****) have shown promising performance for language modeling. | ||
| 2021.cmcl-1.2 *****Recurrent neural networks (RNNs*****) have long been an architecture of interest for computational models of human sentence processing. | ||
| E17-1002 *****Recurrent neural networks (RNNs*****) process input text sequentially and model the conditional transition between word tokens. | ||
| Automatic Post-Editing (APE | 6 | |
| W19-5412 This paper describes POSTECH's submission to the WMT 2019 shared task on *****Automatic Post-Editing (APE*****). | ||
| W18-6470 This paper describes the POSTECH's submission to the WMT 2018 shared task on *****Automatic Post-Editing (APE*****). | ||
| 2020.wmt-1.81 In this paper, we describe the Bering Lab's submission to the WMT 2020 Shared Task on *****Automatic Post-Editing (APE*****). | ||
| 2020.wmt-1.83 This paper describes POSTECH's submission to WMT20 for the shared task on *****Automatic Post-Editing (APE*****). | ||
| 2020.wmt-1.84 The goal of *****Automatic Post-Editing (APE*****) is basically to examine the automatic methods for correcting translation errors generated by an unknown machine translation (MT) system. | ||
| geographic | 6 | |
| 2020.findings-emnlp.195 Research in NLP lacks *****geographic***** diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. | ||
| P18-1119 The purpose of text geolocation is to associate *****geographic***** information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics. | ||
| 2021.splurobonlp-1.9 We present a multi-level geocoding model (MLG) that learns to associate texts to *****geographic***** coordinates. | ||
| 2020.lrec-1.252 Analyzing the *****geographic***** movement of humans, animals, and other phenomena is a growing field of research. | ||
| S19-1015 Language use varies across different demographic factors, such as gender, age, and *****geographic***** location. | ||
| Machine reading | 6 | |
| P19-1415 *****Machine reading***** comprehension with unanswerable questions is a challenging task. | ||
| 2021.naacl-main.367 *****Machine reading***** comprehension is a challenging task especially for querying documents with deep and interconnected contexts. | ||
| D19-5803 *****Machine reading***** comprehension is a task related to Question-Answering where questions are not generic in scope but are related to a particular document. | ||
| D19-5815 *****Machine reading***** comprehension, the task of evaluating a machine's ability to comprehend a passage of text, has seen a surge in popularity in recent years. | ||
| D18-1237 *****Machine reading***** comprehension helps machines learn to utilize most of the human knowledge written in the form of text. | ||
| multiword expressions (MWEs | 6 | |
| N18-2068 One of the most outstanding properties of *****multiword expressions (MWEs*****), especially verbal ones (VMWEs), important both in theoretical models and applications, is their idiosyncratic variability. | ||
| W19-5110 Because most *****multiword expressions (MWEs*****), especially verbal ones, are semantically non-compositional, their automatic identification in running text is a prerequisite for semantically-oriented downstream applications. | ||
| L16-1194 This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates semantic compositionality scores for *****multiword expressions (MWEs*****) based on word embeddings. | ||
| C16-1046 Much previous research on *****multiword expressions (MWEs*****) has focused on the token- and type-level tasks of MWE identification and extraction, respectively. | ||
| L12-1517 Light verb constructions (LVCs), such as take a walk and make a decision, are a common subclass of *****multiword expressions (MWEs*****), whose distinct syntactic and semantic properties call for a special treatment within a computational system. | ||
| Event Detection (ED | 6 | |
| 2021.emnlp-main.439 The task of *****Event Detection (ED*****) in Information Extraction aims to recognize and classify trigger words of events in text. | ||
| 2021.eacl-main.237 Most of the previous work on *****Event Detection (ED*****) has only considered the datasets with a small number of event types (i.e., up to 38 types). | ||
| 2021.acl-long.220 *****Event Detection (ED*****) aims to identify event trigger words from a given text and classify it into an event type. | ||
| K19-1057 *****Event Detection (ED*****) is one of the most important tasks in the field of information extraction. | ||
| 2021.acl-long.490 *****Event Detection (ED*****) aims to recognize mentions of events (i.e., event triggers) and their types in text. | ||
| neural network-based | 6 | |
| P19-1330 There has been substantial progress in summarization research enabled by the availability of novel, often large-scale, datasets and recent advances on *****neural network-based***** approaches. | ||
| W18-6558 This paper presents the two systems we entered into the 2017 E2E NLG Challenge: TemplGen, a templated-based system and SeqGen, a *****neural network-based***** system. | ||
| I17-3010 This paper demonstrates *****neural network-based***** toolkit namely NNVLP for essential Vietnamese language processing tasks including part-of-speech (POS) tagging, chunking, Named Entity Recognition (NER). | ||
| P19-1045 In this paper, we propose a *****neural network-based***** approach, namely Adversarial Attention Network, to the task of multi-dimensional emotion regression, which automatically rates multiple emotion dimension scores for an input text. | ||
| D17-1016 We propose a method for embedding two-dimensional locations in a continuous vector space using a *****neural network-based***** model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. | ||
| pretrained language models (LMs | 6 | |
| 2020.emnlp-main.553 While behaviors of *****pretrained language models (LMs*****) have been thoroughly examined, what happened during pretraining is rarely studied. | ||
| 2020.emnlp-main.586 The success of large *****pretrained language models (LMs*****) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. | ||
| 2020.findings-emnlp.278 In Natural Language Processing (NLP), *****pretrained language models (LMs*****) that are transferred to downstream tasks have been recently shown to achieve state-of-the-art results. | ||
| 2021.naacl-main.301 Existing work on probing of *****pretrained language models (LMs*****) has predominantly focused on sentence-level syntactic tasks. | ||
| 2021.acl-short.72 Injecting external domain-specific knowledge (e.g., UMLS) into *****pretrained language models (LMs*****) advances their capability to handle specialised in-domain tasks such as biomedical entity linking (BEL). | ||
| Active learning (AL | 6 | |
| D19-6110 *****Active learning (AL*****) for machine translation (MT) has been well-studied for the phrase-based MT paradigm. | ||
| 2021.tacl-1.1 *****Active learning (AL*****) uses a data selection algorithm to select useful training samples to minimize annotation cost. | ||
| L16-1697 *****Active learning (AL*****) is often used in corpus construction (CC) for selecting informative documents for annotation. | ||
| L08-1208 *****Active learning (AL*****) is getting more and more popular as a methodology to considerably reduce the annotation effort when building training material for statistical learning methods for various NLP tasks. | ||
| D19-1003 *****Active learning (AL*****) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. | ||
| English language | 6 | |
| L14-1351 We present the Weltmodell, a commonsense knowledge base that was automatically generated from aggregated dependency parse fragments gathered from over 3.5 million *****English language***** books. | ||
| 2021.emnlp-main.454 Prior work has shown that structural supervision helps *****English language***** models learn generalizations about syntactic phenomena such as subject-verb agreement. | ||
| 2021.wanlp-1.20 Advances in *****English language***** representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). | ||
| 2021.nllp-1.17 This paper presents a technique for the identification of participant slots in *****English language***** contracts. | ||
| W16-4005 We present an approach to detect differences in lexical semantics across *****English language***** registers, using word embedding models from distributional semantics paradigm. | ||
| neural encoder-decoder | 6 | |
| D19-1186 The *****neural encoder-decoder***** models have shown great promise in neural conversation generation. | ||
| P17-1061 While recent *****neural encoder-decoder***** models have shown great promise in modeling open-domain conversations, they often generate dull and generic responses. | ||
| D18-1086 Recent work on abstractive summarization has made progress with *****neural encoder-decoder***** architectures. | ||
| I17-2062 We propose a *****neural encoder-decoder***** model with reinforcement learning (NRL) for grammatical error correction (GEC). | ||
| P17-1182 We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a *****neural encoder-decoder***** model, the state of the art for the monolingual task. | ||
| programming | 6 | |
| L12-1572 Challenges in creating comprehensive text-processing workflows include a lack of the interoperability of individual components coming from different providers and/or a requirement imposed on the end users to know *****programming***** techniques to compose such workflows. | ||
| 2003.mtsummit-papers.15 We describe an experiment in rapid development of a statistical machine translation (SMT) system from scratch, using limited resources: under this heading we include not only training data, but also computing power, linguistic knowledge, *****programming***** effort, and absolute time. | ||
| L12-1582 This paper describes an open-source Latvian resource grammar implemented in Grammatical Framework (GF), a *****programming***** language for multilingual grammar applications. | ||
| 2021.teachingnlp-1.17 We present a series of *****programming***** assignments, adaptable to a range of experience levels from advanced undergraduate to PhD, to teach students design and implementation of modern NLP systems. | ||
| W17-5522 One of the reasons for the current hype is the fact that chatbots (one particularly popular form of conversational interfaces) nowadays can be created without any *****programming***** knowledge, thanks to different toolkits and so-called Natural Language Understanding (NLU) services. | ||
| multilingual sentence | 6 | |
| W19-4305 In this paper, we propose an architecture for machine translation (MT) capable of obtaining *****multilingual sentence***** representations by incorporating an intermediate attention bridge that is shared across all languages. | ||
| 2021.eacl-main.115 We present an approach based on *****multilingual sentence***** embeddings to automatically extract parallel sentences from the content of Wikipedia articles in 96 languages, including several dialects or low-resource languages. | ||
| 2020.acl-srw.34 Existing models of *****multilingual sentence***** embeddings require large parallel data resources which are not available for low-resource languages. | ||
| 2021.acl-long.507 We show that margin-based bitext mining in a *****multilingual sentence***** space can be successfully scaled to operate on monolingual corpora of billions of sentences. | ||
| 2021.emnlp-main.612 We propose a method to distill a language-agnostic meaning embedding from a *****multilingual sentence***** encoder. | ||
| Natural Language Generation | 6 | |
| R19-1024 In *****Natural Language Generation***** systems, personalization strategies - i.e., the use of information about a target author to generate text that (more) closely resembles human-produced language - have long been applied to improve results. | ||
| 2021.inlg-1.14 We observe a severe under-reporting of the different kinds of errors that *****Natural Language Generation***** systems make. | ||
| W17-1613 We discuss the ethical implications of *****Natural Language Generation***** systems. | ||
| P19-1210 Transcripts of natural, multi-person meetings differ significantly from documents like news articles, which can make *****Natural Language Generation***** models for generating summaries unfocused. | ||
| W17-3505 We present a flexible *****Natural Language Generation***** approach for Spanish, focused on the surface realisation stage, which integrates an inflection module in order to improve the naturalness and expressivity of the generated language. | ||
| where | 6 | |
| 2020.emnlp-main.387 Computational methods that identify portrayals of risk behaviors from audio - visual cues are limited in their applicability to films in post - production , *****where***** modifications might be prohibitively expensive . | ||
| 2020.conll-1.4 Since NLI examples encompass a variety of linguistic , logical , and reasoning phenomena , it remains unclear as to which specific concepts are learnt by the trained systems and *****where***** they can achieve strong generalization . | ||
| L12-1627 Comparing the semantic frames needed to annotate BCCWJ with those that the FrameNet ( FN ) project ( Fillmore and Baker 2009 , Fillmore 2006 ) already has defined revealed that : 1 ) differences in the Japanese and English semantic frames often concern different perspectives and different lexical aspects exhibited by the two lexicons ; and 2 ) in most of the cases *****where***** JFN defined new semantic frame for a word , the frame did not involve culture - specific scenes . | ||
| 2005.mtsummit-posters.6 In one of its uses , ki functions as a clause complementizer and is mapped usually by that in declarative clauses and by various wh - words ( such as what , why , *****where***** , how , etc . ) | ||
| 2020.emnlp-main.336 Although much attention has been paid to summarizing structured text like news reports or encyclopedia articles , summarizing conversations , an essential part of human - human / machine interaction *****where***** most important pieces of information are scattered across various utterances of different speakers , remains relatively under - investigated . | ||
| Task - oriented dialog | 6 | |
| 2020.nlp4convai-1.6 *****Task - oriented dialog***** models typically leverage complex neural architectures and large - scale , pre - trained Transformers to achieve state - of - the - art performance on popular natural language understanding benchmarks . | ||
| D18-1038 *****Task - oriented dialog***** systems are becoming pervasive , and many companies heavily rely on them to complement human agents for customer service in call centers . | ||
| 2020.sigdial-1.4 *****Task - oriented dialog***** systems rely on dialog state tracking ( DST ) to monitor the user 's goal during the course of an interaction . | ||
| 2021.dravidianlangtech-1.11 *****Task - oriented dialog***** systems help a user achieve a particular goal by parsing user requests to execute a particular action . | ||
| D19-1131 *****Task - oriented dialog***** systems need to know when a query falls outside their range of supported intents , but current text classification corpora only define label sets that cover every example . | ||
| Automatic post - editing ( APE | 6 | |
| P19-1292 *****Automatic post - editing ( APE***** ) seeks to automatically refine the output of a black - box machine translation ( MT ) system through human post - edits . | ||
| 2020.emnlp-main.217 *****Automatic post - editing ( APE***** ) aims to improve machine translations , thereby reducing human post - editing effort . | ||
| E17-1050 *****Automatic post - editing ( APE***** ) for machine translation ( MT ) aims to fix recurrent errors made by the MT decoder by learning from correction examples . | ||
| D19-1634 *****Automatic post - editing ( APE***** ) , which aims to correct errors in the output of machine translation systems in a post - processing step , is an important task in natural language processing . | ||
| 2021.alta-1.18 *****Automatic post - editing ( APE***** ) is an important remedy for reducing errors of raw translated texts that are produced by machine translation ( MT ) systems or software - aided translation . | ||
| Question - | 6 | |
| N19-1242 *****Question -***** answering plays an important role in e - commerce as it allows potential customers to actively seek crucial information about products or services to help their purchase decision making . | ||
| W19-5059 In this paper , we present three approaches for Natural Language Inference , Question Entailment Recognition and *****Question -***** Answering to improve domain - specific Information Retrieval . | ||
| L16-1734 We present a corpus and a knowledge database aiming at developing *****Question -***** Answering in a new context , the open world of a video game . | ||
| D19-5803 Machine reading comprehension is a task related to *****Question -***** Answering where questions are not generic in scope but are related to a particular document . | ||
| 2021.naacl-main.153 Although *****Question -***** Answering has long been of research interest , its accessibility to users through a speech interface and its support to multiple languages have not been addressed in prior studies . | ||
| Virtual | 6 | |
| N18-1163 *****Virtual***** agents are becoming a prominent channel of interaction in customer service . | ||
| 2020.findings-emnlp.15 *****Virtual***** Assistants can be quite literal at times . | ||
| 2000.amta-workshop.5 Our project Wired for Peace : *****Virtual***** Diplomacy in Northeast Asia ( http://www-neacd.ucsd.edu/ ) has as its main aim to provide policymakers and researchers of the U.S. , China , Russia , Japan , and Korea with Internet based tools to allow for continuous communication on issues of the regional security and cooperation . | ||
| 2020.sltu-1.50 *****Virtual***** agents are increasingly used for delivering health information in general , and mental health assistance in particular . | ||
| 2021.ecnlp-1.6 The growing popularity of *****Virtual***** Assistants poses new challenges for Entity Resolution , the task of linking mentions in text to their referent entities in a knowledge base . | ||
| task - oriented conversational | 6 | |
| W18-5036 We present a domain portable zero - shot learning approach for entity recognition in *****task - oriented conversational***** agents , which does not assume any annotated sentences at training time . | ||
| W19-8611 Generating fluent natural language responses from structured semantic representations is a critical step in *****task - oriented conversational***** systems . | ||
| 2021.conll-1.4 Previous research has found that *****task - oriented conversational***** agents are perceived more positively by users when they provide information in an empathetic manner compared to a plain , emotionless information exchange . | ||
| 2021.sigdial-1.25 This paper aims at providing a comprehensive overview of recent developments in dialogue state tracking ( DST ) for *****task - oriented conversational***** systems . | ||
| P19-1080 Generating fluent natural language responses from structured semantic representations is a critical step in *****task - oriented conversational***** systems . | ||
| distributed representations of | 6 | |
| W17-2615 Recently Le & Mikolov described two log - linear models , called Paragraph Vector , that can be used to learn state - of - the - art *****distributed representations of***** documents . | ||
| C16-1110 This paper presents an approach combining lexico - semantic resources and *****distributed representations of***** words applied to the evaluation in machine translation ( MT ) . | ||
| Q17-1028 We describe a neural network model that jointly learns *****distributed representations of***** texts and knowledge base ( KB ) entities . | ||
| E17-2072 Recently , there has been a lot of activity in learning *****distributed representations of***** words in vector spaces . | ||
| W18-6120 We propose a new word embedding method called word - like character n - gram embedding , which learns *****distributed representations of***** words by embedding word - like character n - grams . | ||
| open - domain dialog | 6 | |
| W19-5944 The aim of this paper is to mitigate the shortcomings of automatic evaluation of *****open - domain dialog***** systems through multi - reference evaluation . | ||
| P17-2080 Deep latent variable models have been shown to facilitate the response generation for *****open - domain dialog***** systems . | ||
| 2020.sigdial-1.28 It is important to define meaningful and interpretable automatic evaluation metrics for *****open - domain dialog***** research . | ||
| 2020.emnlp-main.28 Existing *****open - domain dialog***** models are generally trained to minimize the perplexity of target human responses . | ||
| 2020.acl-main.64 The lack of meaningful automatic evaluation metrics for dialog has impeded *****open - domain dialog***** research . | ||
| Neural conversation | 6 | |
| 2021.emnlp-main.173 *****Neural conversation***** models have shown great potentials towards generating fluent and informative responses by introducing external background knowledge . | ||
| 2020.acl-main.61 *****Neural conversation***** models are known to generate appropriate but non - informative responses in general . | ||
| D19-1188 *****Neural conversation***** systems generate responses based on the sequence - to - sequence ( SEQ2SEQ ) paradigm . | ||
| D18-1431 *****Neural conversation***** models tend to generate safe , generic responses for most inputs . | ||
| I17-2029 *****Neural conversation***** systems , typically using sequence - to - sequence ( seq2seq ) models , are showing promising progress recently . | ||
| use | 6 | |
| W19-5346 We submit constrained systems , i.e. , we rely on the data provided for this language pair and do not *****use***** any external data . | ||
| N19-1044 Existing methods can be classified into two main categories , namely the *****use***** of placeholder tags for lexicon words and the use of hard constraints during decoding . | ||
| L16-1656 Although their role is relevant in the context of written text , there is no approach or dataset that makes *****use***** of contextuality of classic semantic relations beyond the boundary of one sentence . | ||
| K19-1084 We introduce a class of seq2seq models , GAMs ( Global Autoregressive Models ) , which combine an autoregressive component with a log - linear component , allowing the *****use***** of global a priori features to compensate for lack of data . | ||
| L10-1595 Most text processing research has focused on locating mission - relevant text ( information retrieval ) and on techniques for enriching text by transforming it to other forms of text ( translation , summarization ) always for *****use***** by humans . | ||
| large - scale parallel | 6 | |
| C18-1111 Neural machine translation ( NMT ) is a deep learning based approach for machine translation , which yields the state - of - the - art translation performance in scenarios where *****large - scale parallel***** corpora are available . | ||
| I17-1039 While neural machine translation ( NMT ) has become the new paradigm , the parameter optimization requires *****large - scale parallel***** data which is scarce in many domains and language pairs . | ||
| W17-4809 Although parallel coreference corpora can to a high degree support the development of SMT systems , there are no *****large - scale parallel***** datasets available due to the complexity of the annotation task and the variability in annotation schemes . | ||
| W18-2707 A *****large - scale parallel***** corpus is required to train encoder - decoder neural machine translation . | ||
| N18-2084 The performance of Neural Machine Translation ( NMT ) systems often suffers in low - resource scenarios where sufficiently *****large - scale parallel***** corpora can not be obtained . | ||
| Recognizing Textual Entailment ( RTE | 6 | |
| I17-1100 We propose to unify a variety of existing semantic classification tasks , such as semantic role labeling , anaphora resolution , and paraphrase detection , under the heading of *****Recognizing Textual Entailment ( RTE***** ) . | ||
| C16-1270 *****Recognizing Textual Entailment ( RTE***** ) is a fundamentally important task in natural language processing that has many applications . | ||
| N18-1069 How to identify , extract , and use phrasal knowledge is a crucial problem for the task of *****Recognizing Textual Entailment ( RTE***** ) . | ||
| P18-1091 Natural Language Inference ( NLI ) , also known as *****Recognizing Textual Entailment ( RTE***** ) , is one of the most important problems in natural language processing . | ||
| 2020.eval4nlp-1.10 *****Recognizing Textual Entailment ( RTE***** ) was proposed as a unified evaluation framework to compare semantic understanding of different NLP systems . | ||
| cross - lingual dependency | 6 | |
| D19-1574 This paper explores the task of leveraging typology in the context of *****cross - lingual dependency***** parsing . | ||
| E17-1023 This paper presents a new approach to the problem of *****cross - lingual dependency***** parsing , aiming at leveraging training data from different source languages to learn a parser in a target language . | ||
| W17-1216 This paper describes the submission from the University of Helsinki to the shared task on *****cross - lingual dependency***** parsing at VarDial 2017 . | ||
| 2020.findings-emnlp.265 We propose a novel approach to *****cross - lingual dependency***** parsing based on word reordering . | ||
| W18-6017 This paper describes a method of creating synthetic treebanks for *****cross - lingual dependency***** parsing using a combination of machine translation ( including pivot translation ) , annotation projection and the spanning tree algorithm . | ||
| applied | 6 | |
| 2020.lrec-1.773 However , standard NLP tools were often designed with standard texts in mind , and their performance decreases heavily when *****applied***** to social media data . | ||
| 2020.coling-main.271 As the main strength of our method , it can identify the coordination boundaries without training on labeled data , and can be *****applied***** even if coordination structure annotations are not available . | ||
| 2020.coling-main.171 Following the intuition that texts and images are complementary in advertising , we introduce a multimodal ensemble of a state of the art image - based classifier , a classifier based on an object detection architecture , and a fine - tuned language model *****applied***** to texts extracted from ads by OCR . | ||
| 2021.naacl-main.325 Natural language processing ( NLP ) research combines the study of universal principles , through basic science , with *****applied***** science targeting specific use cases and settings . | ||
| L08-1026 Constraint - based precision grammars , like the HPSG grammar we are using for the experiments reported in this paper , typically lack robustness , especially when *****applied***** to real world texts . | ||
| multi - hop | 6 | |
| 2020.lrec-1.671 Explainable question answering for complex questions often requires combining large numbers of facts to answer a question while providing a human - readable explanation for the answer , a process known as *****multi - hop***** inference . | ||
| 2020.textgraphs-1.14 Explainable question answering for science questions is a challenging task that requires *****multi - hop***** inference over a large set of fact sentences . | ||
| P19-2030 In this paper , we propose a *****multi - hop***** attention for the Transformer . | ||
| 2020.emnlp-main.99 Existing work on augmenting question answering ( QA ) models with external knowledge ( e.g. , knowledge graphs ) either struggle to model *****multi - hop***** relations efficiently , or lack transparency into the model 's prediction rationale . | ||
| 2020.findings-emnlp.351 Multi - hop reasoning approaches over knowledge graphs infer a missing relationship between entities with a *****multi - hop***** rule , which corresponds to a chain of relationships . | ||
| Abusive language | 6 | |
| 2020.alw-1.17 *****Abusive language***** classifiers have been shown to exhibit bias against women and racial minorities . | ||
| R19-1132 *****Abusive language***** detection has received much attention in the last years , and recent approaches perform the task in a number of different languages . | ||
| 2021.ranlp-1.99 *****Abusive language***** detection has become an important tool for the cultivation of safe online platforms . | ||
| 2021.naacl-main.48 *****Abusive language***** detection is an emerging field in natural language processing which has received a large amount of attention recently . | ||
| 2020.lrec-1.760 *****Abusive language***** detection is an unsolved and challenging problem for the NLP community . | ||
| open - domain conversational | 6 | |
| 2020.acl-main.19 Question answering ( QA ) is an important aspect of *****open - domain conversational***** agents , garnering specific research focus in the conversational QA ( ConvQA ) subtask . | ||
| 2021.naacl-demos.4 Having engaging and informative conversations with users is the utmost goal for *****open - domain conversational***** systems . | ||
| P18-1204 Asking good questions in *****open - domain conversational***** systems is quite significant but rather untouched . | ||
| D18-1432 We present three enhancements to existing encoder - decoder models for *****open - domain conversational***** agents , aimed at effectively modeling coherence and promoting output diversity : ( 1 ) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response , ( 2 ) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context - response pairs , ( 3 ) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity . | ||
| L16-1433 This paper presents an automatic corpus - based process to author an *****open - domain conversational***** strategy usable both in chatterbot systems and as a fallback strategy for out - of - domain human utterances . | ||
| Chinese Grammatical Error Diagnosis ( CGED | 6 | |
| W18-3706 This paper presents the NLPTEA 2018 shared task for *****Chinese Grammatical Error Diagnosis ( CGED***** ) which seeks to identify grammatical error types , their range of occurrence and recommended corrections within sentences written by learners of Chinese as foreign language . | ||
| 2020.nlptea-1.13 *****Chinese Grammatical Error Diagnosis ( CGED***** ) is a natural language processing task for the NLPTEA6 workshop . | ||
| W18-3708 This paper introduces the DM_NLP team 's system for NLPTEA 2018 shared task of *****Chinese Grammatical Error Diagnosis ( CGED***** ) , which can be used to detect and correct grammatical errors in texts written by Chinese as a Foreign Language ( CFL ) learners . | ||
| W18-3710 *****Chinese Grammatical Error Diagnosis ( CGED***** ) is a natural language processing task for the NLPTEA2018 workshop held during ACL2018 . | ||
| 2020.nlptea-1.4 This paper presents the NLPTEA 2020 shared task for *****Chinese Grammatical Error Diagnosis ( CGED***** ) which seeks to identify grammatical error types , their range of occurrence and recommended corrections within sentences written by learners of Chinese as a foreign language . | ||
| Event detection ( ED | 6 | |
| 2020.findings-emnlp.229 *****Event detection ( ED***** ) aims to identify and classify event triggers in texts , which is a crucial subtask of event extraction ( EE ) . | ||
| D18-1517 *****Event detection ( ED***** ) and word sense disambiguation ( WSD ) are two similar tasks in that they both involve identifying the classes ( i.e. | ||
| 2021.emnlp-main.206 *****Event detection ( ED***** ) aims at identifying event instances of specified types in given texts , which has been formalized as a sequence labeling task . | ||
| 2020.findings-emnlp.211 *****Event detection ( ED***** ) , a key subtask of information extraction , aims to recognize instances of specific event types in text . | ||
| 2020.emnlp-main.129 *****Event detection ( ED***** ) , which means identifying event trigger words and classifying event types , is the first and most fundamental step for extracting event knowledge from plain text . | ||
| Fine - | 6 | |
| 2021.adaptnlp-1.22 *****Fine -***** tuning is known to improve NLP models by adapting an initial model trained on more plentiful but less domain - salient examples to data in a target domain . | ||
| 2020.acl-main.357 *****Fine -***** tuning of pre - trained transformer models has become the standard approach for solving common NLP tasks . | ||
| 2020.semeval-1.213 *****Fine -***** tuning of pre - trained transformer networks such as BERT yield state - of - the - art results for text classification tasks . | ||
| 2020.coling-main.482 *****Fine -***** tuning with pre - trained language models ( e.g. | ||
| W18-3408 *****Fine -***** tuning is a popular method to achieve better performance when only a small target corpus is available . | ||
| Relation Extraction ( RE | 6 | |
| R19-1076 *****Relation Extraction ( RE***** ) consists in detecting and classifying semantic relations between entities in a sentence . | ||
| 2020.acl-main.715 This paper studies the task of *****Relation Extraction ( RE***** ) that aims to identify the semantic relations between two entity mentions in text . | ||
| D19-5720 Named Entity Recognition ( NER ) and *****Relation Extraction ( RE***** ) are essential tools in distilling knowledge from biomedical literature . | ||
| L14-1595 The increasing availability and maturity of both scalable computing architectures and deep syntactic parsers is opening up new possibilities for *****Relation Extraction ( RE***** ) on large corpora of natural language text . | ||
| 2020.acl-main.142 TACRED is one of the largest , most widely used crowdsourced datasets in *****Relation Extraction ( RE***** ) . | ||
| Natural Language Inference ( NLI ) | 6 | |
| 2021.acl-short.109 *****Natural Language Inference ( NLI )***** datasets contain examples with highly ambiguous labels . | ||
| 2021.emnlp-main.467 Many recent successes in sentence representation learning have been achieved by simply fine - tuning on the *****Natural Language Inference ( NLI )***** datasets with triplet loss or siamese loss . | ||
| 2020.emnlp-main.665 *****Natural Language Inference ( NLI )***** datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes . | ||
| S19-1028 Popular *****Natural Language Inference ( NLI )***** datasets have been shown to be tainted by hypothesis - only biases . | ||
| 2020.conll-1.4 Pre - trained Transformer - based neural architectures have consistently achieved state - of - the - art performance in the *****Natural Language Inference ( NLI )***** task . | ||
| knowledge bases ( KB | 6 | |
| C16-2059 Words to express relations in natural language ( NL ) statements may be different from those to represent properties in *****knowledge bases ( KB***** ) . | ||
| I17-2038 This paper investigates the problem of answering compositional factoid questions over *****knowledge bases ( KB***** ) under efficiency constraints . | ||
| 2021.emnlp-main.353 Incorporating *****knowledge bases ( KB***** ) into end - to - end task - oriented dialogue systems is challenging , since it requires to properly represent the entity of KB , which is associated with its KB context and dialogue context . | ||
| 2020.acl-main.624 Entity linking ( EL ) is concerned with disambiguating entity mentions in a text against *****knowledge bases ( KB***** ) . | ||
| E17-1057 Building *****knowledge bases ( KB***** ) automatically from text corpora is crucial for many applications such as question answering and web search . | ||
| oral | 6 | |
| 2021.mmsr-1.2 Speaker gestures are semantically co - expressive with speech and serve different pragmatic functions to accompany *****oral***** modality . | ||
| L12-1325 This article summarizes the evaluation process of an interface under development to consult an *****oral***** corpus of learners of Spanish as a Foreign Language . | ||
| L12-1188 In criminal proceedings , sometimes it is not easy to evaluate the sincerity of *****oral***** testimonies . | ||
| 2020.emnlp-main.473 The growth of social media has encouraged the written use of African American Vernacular English ( AAVE ) , which has traditionally been used only in *****oral***** contexts . | ||
| L16-1049 This project assesses the resources necessary to make *****oral***** history searchable by means of automatic speech recognition ( ASR ) . | ||
| question - answering | 6 | |
| C16-1224 Answer selection is a core component in any *****question - answering***** systems . | ||
| D18-1134 Previous work on *****question - answering***** systems mainly focuses on answering individual questions , assuming they are independent and devoid of context . | ||
| P19-1621 Although current evaluation of *****question - answering***** systems treats predictions in isolation , we need to consider the relationship between predictions to measure true understanding . | ||
| L06-1483 This paper describes the utility of semantic resources such as the Web , WordNet and gazetteers in the answer selection process for a *****question - answering***** system . | ||
| L06-1104 Search engines on the web and most existing *****question - answering***** systems provide the user with a set of hyperlinks and/or web page extracts containing answer(s ) to a question . | ||
| non - native | 6 | |
| 2021.acl-short.16 State - of - the - art machine translation ( MT ) systems are typically trained to generate standard target language ; however , many languages have multiple varieties ( regional varieties , dialects , sociolects , *****non - native***** varieties ) that are different from the standard language . | ||
| 2020.nlpcss-1.18 I test two hypotheses that play an important role in modern sociolinguistics and language evolution studies : first , that *****non - native***** production is simpler than native ; second , that production addressed to non - native speakers is simpler than that addressed to natives . | ||
| W19-4410 Grammatical error detection ( GED ) in *****non - native***** writing requires systems to identify a wide range of errors in text written by language learners . | ||
| W19-2719 This study aims to model the discourse structure of spontaneous spoken responses within the context of an assessment of English speaking proficiency for *****non - native***** speakers . | ||
| W17-5902 We present a pilot study on parsing *****non - native***** texts written by learners of Czech . | ||
| Reinforcement learning ( RL | 6 | |
| D19-1014 *****Reinforcement learning ( RL***** ) is an effective approach to learn an optimal dialog policy for task - oriented visual dialog systems . | ||
| N18-2112 *****Reinforcement learning ( RL***** ) is a promising approach to solve dialogue policy optimisation . | ||
| 2020.coling-main.41 *****Reinforcement learning ( RL***** ) can enable task - oriented dialogue systems to steer the conversation towards successful task completion . | ||
| D18-1415 *****Reinforcement learning ( RL***** ) is an attractive solution for task - oriented dialog systems . | ||
| 2020.ccl-1.91 *****Reinforcement learning ( RL***** ) has made remarkable progress in neural machine translation ( NMT ) . | ||
| Zero - shot | 6 | |
| 2021.emnlp-main.664 *****Zero - shot***** translations is a fascinating feature of Multilingual Neural Machine Translation ( MNMT ) systems . | ||
| 2020.acl-main.272 *****Zero - shot***** learning has been a tough problem since no labeled data is available for unseen classes during training , especially for classes with low similarity . | ||
| 2021.naacl-main.250 *****Zero - shot***** learning aims to recognize unseen objects using their semantic representations . | ||
| P19-1121 *****Zero - shot***** translation , translating between language pairs on which a Neural Machine Translation ( NMT ) system has never been trained , is an emergent property when training the system in multilingual settings . | ||
| 2020.wmt-1.18 This paper describes the University of Edinburgh 's submission of German - English systems to the WMT2020 Shared Tasks on News Translation and *****Zero - shot***** Robustness . | ||
| Story Cloze | 6 | |
| W17-0906 The LSDSem'17 shared task is the *****Story Cloze***** Test , a new evaluation for story understanding and script learning . | ||
| W17-0912 This paper describes an ensemble system submitted as part of the LSDSem Shared Task 2017 - the *****Story Cloze***** Test . | ||
| W17-0908 The *****Story Cloze***** test is a recent effort in providing a common test scenario for text understanding systems . | ||
| W17-0911 The *****Story Cloze***** Test consists of choosing a sentence that best completes a story given two choices . | ||
| N18-2015 In the *****Story Cloze***** Test , a system is presented with a 4 - sentence prompt to a story , and must determine which one of two potential endings is the ` right ' ending to the story . | ||
| Word Sense Disambiguation ( WSD ) | 6 | |
| L10-1128 We propose a *****Word Sense Disambiguation ( WSD )***** method that accurately classifies ambiguous words to concepts in the Associative Concept Dictionary ( ACD ) even when the test corpus and the training corpus for WSD are acquired from different domains . | ||
| 2021.acl-long.406 Lately proposed *****Word Sense Disambiguation ( WSD )***** systems have approached the estimated upper bound of the task on standard evaluation benchmarks . | ||
| 2020.acl-main.369 Knowing the Most Frequent Sense ( MFS ) of a word has been proved to help *****Word Sense Disambiguation ( WSD )***** models significantly . | ||
| L12-1049 *****Word Sense Disambiguation ( WSD )***** systems require large sense - tagged corpora along with lexical databases to reach satisfactory results . | ||
| L16-1268 *****Word Sense Disambiguation ( WSD )***** systems tend to have a strong bias towards assigning the Most Frequent Sense ( MFS ) , which results in high performance on the MFS but in a very low performance on the less frequent senses . | ||
| machine reading comprehension ( MRC ) | 6 | |
| 2020.acl-main.211 Existing *****machine reading comprehension ( MRC )***** models do not scale effectively to real - world applications like web - level information retrieval and question answering ( QA ) . | ||
| D19-1251 Numerical reasoning , such as addition , subtraction , sorting and counting is a critical skill in human 's reading comprehension , which has not been well considered in existing *****machine reading comprehension ( MRC )***** systems . | ||
| K19-1065 Remarkable success has been achieved in the last few years on some limited *****machine reading comprehension ( MRC )***** tasks . | ||
| D19-5828 In this paper , we introduce a simple system Baidu submitted for MRQA ( Machine Reading for Question Answering ) 2019 Shared Task that focused on generalization of *****machine reading comprehension ( MRC )***** models . | ||
| D19-5802 Most *****machine reading comprehension ( MRC )***** models separately handle encoding and matching with different network architectures . | ||
| Turkish | 6 | |
| L14-1279 We report two tools to conduct psycholinguistic experiments on *****Turkish***** words . | ||
| W19-3110 We present a broad coverage model of *****Turkish***** morphology and an open - source morphological analyzer that implements it . | ||
| W18-6419 We participated in the eight translation directions of four language pairs : Estonian - English , Finnish - English , *****Turkish***** - English and Chinese - English . | ||
| 2021.law-1.12 This paper presents several challenges faced when annotating *****Turkish***** treebanks in accordance with the Universal Dependencies ( UD ) guidelines and proposes solutions to address them . | ||
| 2021.gwc-1.18 However , many other languages lack polarity dictionaries , or the existing ones are small in size as in the case of SentiTurkNet , the first and only polarity dictionary in *****Turkish***** . | ||
| SemEval 2018 Task | 6 | |
| S18-1083 This paper describes the NeuroSent system that participated in *****SemEval 2018 Task***** 3 . | ||
| S18-1135 We describe two systems for semantic relation classification with which we participated in the *****SemEval 2018 Task***** 7 , subtask 1 on semantic relation classification : an SVM model and a CNN model . | ||
| S18-1190 This paper describes a warrant classification system for *****SemEval 2018 Task***** 12 , that attempts to learn semantic representations of reasons , claims and warrants . | ||
| S18-1013 This paper describes the NeuroSent system that participated in *****SemEval 2018 Task***** 1 . | ||
| S18-1118 We present SUNNYNLP , our system for solving *****SemEval 2018 Task***** 10 : Capturing Discriminative Attributes . | ||
| Knowledge Graph (KG) | 6 | |
| D19-1264 *****Knowledge Graph (KG)***** reasoning aims at finding reasoning paths for relations, in order to solve the problem of incompleteness in KG. | ||
| 2021.acl-long.82 *****Knowledge Graph (KG)***** completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. | ||
| D18-1225 *****Knowledge Graph (KG)***** embedding has emerged as an active area of research resulting in the development of several KG embedding methods. | ||
| W18-3017 *****Knowledge Graph (KG)***** embedding projects entities and relations into low dimensional vector space, which has been successfully applied in KG completion task. | ||
| P18-1012 *****Knowledge Graph (KG)***** embedding has emerged as a very active area of research over the last few years, resulting in the development of several embedding methods. | ||
| arc | 5 | |
| W17-2203 Our main findings are: (a), the global emotion model is competitive with a large-vocabulary bag-of-words genre classifier (80% F1); (b), the emotion ***** arc ***** model shows a lower performance (59% F1) but shows complementary behavior to the global model, as indicated by a very good performance of an oracle model (94% F1) and an improved performance of an ensemble model (84% F1); (c), genres differ in the extent to which stories follow the same emotional ***** arc *****s, with particularly uniform behavior for anger (mystery) and fear (adventures, romance, humor, science fiction). | ||
| 1963.earlymt-1.25 The terms ***** arc ***** all noun-phrases, and several different types of such phrases have been included in the program. | ||
| D17-1003 (1) it separates the construction for noncrossing edges and crossing edges; (2) in a single construction step, whether to create a new ***** arc ***** is deterministic. | ||
| K19-1023 From this representation we are able to derive an efficient parsing algorithm and design a neural network that learns vertex representations and ***** arc ***** scores. | ||
| 2020.emnlp-main.254 These results lead us to argue, second, that common simplistic probe tasks such as POS labeling and dependency ***** arc ***** labeling, are inadequate to evaluate the properties encoded in contextual word representations | ||
| 5k | 5 | |
| I17-1102 We evaluate the proposed models on multilingual document classification with disjoint label sets, on a large dataset which we provide, with 600k news documents in 8 languages, and ***** 5k ***** labels. | ||
| R19-1006 The approach yields sufficient data even in the case of relatively small Wikipedias, such as the Bulgarian one, where 62k articles produced ***** 5k ***** biased sentences. | ||
| N19-1380 In this paper, we present a new data set of 57k annotated utterances in English (43k), Spanish (8.6k) and Thai (***** 5k *****) across the domains weather, alarm, and reminder. | ||
| N19-1354 When trained and tested on 6 languages with less than ***** 5k ***** training instances, our parser consistently outperforms the strong bilstm baseline (Kiperwasser and Goldberg, 2016). | ||
| 2020.emnlp-main.43 We also collect ***** 5k ***** Cherokee monolingual data to enable semi-supervised learning | ||
| judgments | 5 | |
| W17-5210 We propose a framework for representing claims as microstructures, which express the beliefs, ***** judgments *****, and policies about the relations between domain-specific concepts. | ||
| 2012.amta-papers.12 We create a corpus with relevance ***** judgments ***** for both human and machine translated results, and use it to quantify the effect that MT quality has on end-to-end relevance. | ||
| 2021.cl-1.4 There are limitations with these approaches that either do not provide any context for ***** judgments *****, and thereby ignore ambiguity, or provide very specific sentential contexts that cannot then be used to generate a larger lexical resource. | ||
| 2021.ranlp-1.28 We build a new Portuguese benchmark corpus with 785 pairs between premise questions and archived answered questions marked with relevance ***** judgments ***** by medical experts. | ||
| L12-1380 In an online survey, we present a sense of a target word from one dictionary with senses from the other dictionary, asking for ***** judgments ***** of relatedness | ||
| PoS tagger | 5 | |
| W19-6133 Evaluation shows that for correctly tagged text, Nefnir obtains an accuracy of 99.55%, and for text tagged with a ***** PoS tagger *****, the accuracy obtained is 96.88%. | ||
| L12-1422 PoS tagging is now considered a “solved problem”; yet, because of the differences in the tagsets, interchange of the various ***** PoS tagger ***** available is still hampered. | ||
| L08-1611 With the aid of the same large-scale Arabic morphological analyzer and ***** PoS tagger ***** in the runtime, the possible senses of virtually any given Arabic word are retrievable. | ||
| L16-1241 Because of the small size of Romanian corpora, the performance of a ***** PoS tagger ***** or a dependency parser trained with the standard supervised methods fall far short from the performance achieved in most languages. | ||
| L12-1309 It consists of various components: a robust crawler (Heritrix), a user friendly web interface, several conversion and cleaning tools, an anti-duplicate filter, a language guesser, and a ***** PoS tagger ***** | ||
| Conventionally | 5 | |
| W18-5410 ***** Conventionally *****, many studies have used an attention matrix to interpret how Encoder-Decoder-based models translate a given source sentence to the corresponding target sentence. | ||
| D18-1384 ***** Conventionally *****, the prominent review aspects of a product type are determined manually. | ||
| L16-1040 ***** Conventionally *****, one may use per-token complementarity to describe these differences, but it is not very useful when the set is heavily skewed. | ||
| D18-1150 ***** Conventionally *****, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. | ||
| W19-0404 ***** Conventionally *****, topics are represented by their n most probable words, however, these representations are often difficult for humans to interpret | ||
| crucially | 5 | |
| 2021.eacl-main.249 We also evaluate a multitask dual encoder trained on both image-caption and caption-caption pairs that ***** crucially ***** demonstrates CxC's value for measuring the influence of intra- and inter-modality learning. | ||
| L08-1436 In such a language infrastructure, an access function to a lexical resource, embodied as an atomic Web service, plays a ***** crucially ***** important role in composing a composite Web service tailored to a user's specific requirement. | ||
| 2021.spnlp-1.6 The analysis of public debates ***** crucially ***** requires the classification of political demands according to hierarchical claim ontologies (e.g. for immigration, a supercategory “Controlling Migration” might have subcategories “Asylum limit” or “Border installations”). | ||
| L06-1420 Present-day machine translation technologies ***** crucially ***** depend on the size and quality of lexical resources. | ||
| L08-1213 We present an exhaustive evaluation of two German treebanks with ***** crucially ***** different encoding schemes | ||
| bidirectional RNN | 5 | |
| D18-1140 Using the SARC dataset of Reddit comments, we show that augmenting a ***** bidirectional RNN ***** with these representations improves performance; the Bayesian approach suffices in homogeneous contexts, whereas the added power of the dense embeddings proves valuable in more diverse ones. | ||
| P18-2066 It then uses the learned document embedding to enhance another ***** bidirectional RNN ***** model to identify event triggers and their types in sentences. | ||
| S19-2075 Our best submission was a special ***** bidirectional RNN *****, which was ranked at the 11th position out of 68 submissions. | ||
| 2020.iwslt-1.11 All systems were neural based, including a fully-connected neural network for speech activity detection, a Kaldi factorized time delay neural network with recurrent neural network (RNN) language model rescoring for speech recognition, a ***** bidirectional RNN ***** with attention mechanism for sentence segmentation, and transformer networks trained with OpenNMT and Marian for machine translation. | ||
| W17-4104 We use a simple ***** bidirectional RNN ***** with LSTM nodes and achieve accuracy of 90% or higher | ||
| nonparametric | 5 | |
| 2008.amta-papers.17 This paper applies ***** nonparametric ***** statistical techniques to Machine Translation (MT) Evaluation using data from a large scale task-based study. | ||
| Q14-1036 Adaptor grammars are a flexible, powerful formalism for defining ***** nonparametric *****, unsupervised models of grammar productions. | ||
| 2021.acl-long.182 In this study, we develop a tree-structured topic model by leveraging ***** nonparametric ***** neural variational inference. | ||
| 2021.emnlp-main.755 CBR-KBQA consists of a ***** nonparametric ***** memory that stores cases (question and logical forms) and a parametric model that can generate a logical form for a new question by retrieving cases that are relevant to it. | ||
| J17-3003 Our approach builds on Reproducing Kernel Hilbert Space (RKHS) representations for ***** nonparametric ***** statistics, and takes the form of a test statistic that is computed from pairs of individual geotagged observations without aggregation into predefined geographical bins | ||
| behaviors | 5 | |
| L16-1592 If until now researchers have primarily focused on leveraging personalized content to identify latent information such as gender, nationality, location, or age of the author, this study seeks to establish a structured way of extracting possessions, or items that people own or are entitled to, as a way to ultimately provide insights into people's ***** behaviors ***** and characteristics. | ||
| 2021.eacl-main.256 Multiple studies have demonstrated that ***** behaviors ***** expressed on online social media platforms can indicate the mental health state of an individual. | ||
| 2020.emnlp-main.99 We also empirically show its effectiveness and scalability on CommonsenseQA and OpenbookQA datasets, and interpret its ***** behaviors ***** with case studies, with the code for experiments released. | ||
| L06-1503 Our analyses show that the co-occurrence of certain ***** behaviors ***** and valence classes significantly deviates from what is to be expected by chance; in isolated cases, ***** behaviors ***** are predictive of valence. | ||
| 2020.aacl-srw.5 This improved student identification tool has the potential to facilitate research on topics ranging from professional networking to the impact of education on Twitter ***** behaviors ***** | ||
| assuming | 5 | |
| W16-5411 Our approach can also be used in preparing training sentences for binary classification (domain-related vs. noise, subjectivity vs. objectivity, etc.), ***** assuming ***** that sentence-type annotation can be predicted by annotation of the most relevant sub-sentences. | ||
| D18-1134 Previous work on question-answering systems mainly focuses on answering individual questions, ***** assuming ***** they are independent and devoid of context. | ||
| N18-1059 In this paper, we present a novel framework for answering broad and complex questions, ***** assuming ***** answering simple questions is possible using a search engine and a reading comprehension model. | ||
| C18-1021 Most previous research in text simplification has aimed to develop generic solutions, ***** assuming ***** very homogeneous target audiences with consistent intra-group simplification needs. | ||
| 2021.emnlp-main.616 In this way, user interests learned from the past can be customized to match future hashtags, which is beyond the capability of existing methods ***** assuming ***** unchanged hashtag semantics | ||
| Semantic parsers | 5 | |
| C18-1076 ***** Semantic parsers ***** critically rely on accurate and high-coverage lexicons. | ||
| D19-6111 ***** Semantic parsers ***** are used to convert user's natural language commands to executable logical form in intelligent personal agents. | ||
| 2021.starsem-1.16 ***** Semantic parsers ***** map natural language utterances to meaning representations. | ||
| Q15-1039 ***** Semantic parsers ***** conventionally construct logical forms bottom-up in a fixed order, resulting in the generation of many extraneous partial logical forms. | ||
| 2021.eacl-main.292 ***** Semantic parsers ***** gain performance boosts with deep neural networks, but inherit vulnerabilities against adversarial examples | ||
| crawling | 5 | |
| P19-1249 For its construction the Twitter feeds of 71,706 verified accounts have been carefully linked with their respective Wikidata items, ***** crawling ***** both. | ||
| L12-1461 To achieve processing times that are insignificant compared to the time data collection (***** crawling *****) takes, we reimplemented the standard sentence- and word-level tokenizers and created new boilerplate and near-duplicate detection algorithms. | ||
| 2021.mtsummit-research.8 In this period we completed the full pipeline of development of a neural machine translation system: data ***** crawling ***** and cleaning, aligning, creating test sets, developing and testing models, and delivering them to the user partners. | ||
| 2020.lrec-1.767 It specifically consists in identifying Twitter users from the three cities, ***** crawling ***** their entire timelines, filtering the collected data in terms of user location and tweet language, and automatically excluding near-duplicate content. | ||
| L14-1072 One way to gather documents related to a specific topic of interest is to traverse a portion of the web graph in a targeted way, using focused ***** crawling ***** algorithms | ||
| Discourse segmentation | 5 | |
| D18-1116 ***** Discourse segmentation *****, which segments texts into Elementary Discourse Units, is a fundamental step in discourse analysis. | ||
| 2021.emnlp-main.188 ***** Discourse segmentation ***** and sentence-level discourse parsing play important roles for various NLP tasks to consider textual coherence. | ||
| P17-2037 ***** Discourse segmentation ***** is a crucial step in building end-to-end discourse parsers. | ||
| 2021.emnlp-main.104 ***** Discourse segmentation *****, the first step of discourse analysis, has been shown to improve results for text summarization, translation and other NLP tasks. | ||
| D17-1258 ***** Discourse segmentation ***** is the first step in building discourse parsers | ||
| Flair | 5 | |
| 2020.coling-demos.12 It combines state-of-the-art entity recognition and linking architectures, such as ***** Flair ***** and fine-tuned Bi-Encoders based on BERT, with an easy-to-use interface for healthcare professionals. | ||
| 2021.eacl-srw.21 We identify subtle differences between the performance of BERT and ***** Flair ***** on two English NER corpora and identify a weak spot in the performance of current models in Spanish. | ||
| 2020.clinicalnlp-1.6 We evaluate several biomedical contextual embeddings (based on BERT, ELMo, and ***** Flair *****) for the detection of medication entities such as Drugs and Adverse Drug Events (ADE) from Electronic Health Records (EHR) using the 2018 ADE and Medication Extraction (Track 2) n2c2 data-set. | ||
| 2020.semeval-1.136 The Bidirectional Encoder Representations from Transformers (BERT) regressor is considered the primary pre-trained model in our approach, whereas ***** Flair ***** is the main NLP library. | ||
| P19-1527 We also enrich our architectures with the recently published contextual embeddings: ELMo, BERT and ***** Flair *****, reaching further improvements for the four nested entity corpora | ||
| SLs | 5 | |
| 1999.mtsummit-1.47 The model has been developed for English and Russian as both ***** SLs ***** and TLs but is readily extensible to other languages. | ||
| 2021.bucc-1.4 Despite the significant advances of MT for spoken languages in the recent couple of decades, MT is in its infancy when it comes to ***** SLs *****. | ||
| L10-1381 Research on ***** SLs ***** requires to build, to analyse and to use corpora. | ||
| 2020.signlang-1.2 However, as ***** SLs ***** are structured through the use of space and iconicity, focusing on lexicon only prevents the field of Continuous Sign Language Recognition (CSLR) from extending to Sign Language Understanding and Translation. | ||
| 2020.lrec-1.738 The paper gives a brief overview of sign languages (***** SLs *****) and some peculiarities of SL fables such as the use of space, the strategy of Role Shift and classifiers | ||
| modifications | 5 | |
| L12-1080 The baseline decision tree classifier is compared against an eight-member voting scheme obtained from variants of the training data generated by ***** modifications ***** on the class representation and from two different classification algorithms, namely decision trees and k-nearest neighbors. | ||
| W18-1502 The application tracks users' ***** modifications ***** to generated sentences, which can be used to quantify their “helpfulness” in advancing the story. | ||
| W17-4422 We describe two ***** modifications ***** of a basic neural network architecture for sequence tagging. | ||
| 2014.iwslt-evaluation.16 In addition, ***** modifications ***** and improvements on automatic acoustic segmentation and deep neural network speaker adaptation were applied. | ||
| 2020.sigmorphon-1.13 Inspired by the high performance of a standard transformer model (Vaswani et al., 2017) on the task, we improve over this approach by adding two ***** modifications *****: (i) Instead of training exclusively on G2P, we additionally create examples for the opposite direction, phoneme-to-grapheme conversion (P2G) | ||
| estimating | 5 | |
| 2020.emnlp-main.396 We analyze span ID tasks via performance prediction, ***** estimating ***** how well neural architectures do on different tasks. | ||
| W18-4007 Instead of directly ***** estimating ***** the relation distribution of individual entities, it is generalized to the “class signature” of each entity. | ||
| 2020.semeval-1.109 Assessing the funniness of edited news headlines task deals with ***** estimating ***** the humorness in the headlines edited with micro-edits. | ||
| 2020.coling-main.316 This paper develops and implements a scalable methodology for (a) ***** estimating ***** the noisiness of labels produced by a typical crowdsourcing semantic annotation task, and (b) reducing the resulting error of the labeling process by as much as 20-30% in comparison to other common labeling strategies. | ||
| L08-1208 This question is tightly related to ***** estimating ***** the classifier performance after a certain amount of data has already been annotated | ||
| Notable | 5 | |
| 2020.lrec-1.390 ***** Notable ***** extensions include: confidence, corpus frequency, orthographic variants, lexicalized and non-lexicalized synsets and lemmas, new parts of speech, and more. | ||
| 2014.iwslt-evaluation.3 ***** Notable ***** features of the English system include deep neural network acoustic models in both tandem and hybrid configuration with the use of multi-level adaptive networks, LHUC adaptation and Maxout units. | ||
| 2020.cl-1.1 ***** Notable ***** findings include the following observations: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) | ||
| W18-6316 ***** Notable ***** cases are standard national varieties such as Brazilian and European Portuguese, and Canadian and European French, which popular online machine translation services are not keeping distinct. | ||
| 2013.iwslt-evaluation.22 ***** Notable ***** features of the system include deep neural network acoustic models in both tandem and hybrid configuration, cross-domain adaptation with multi-level adaptive networks, and the use of a recurrent neural network language model | ||
| aiming | 5 | |
| W17-7703 We think that our results are a smooth entry for users ***** aiming ***** to receive the first impression about what is discussed within a debate topic containing a vast number of argumentations. | ||
| 2021.emnlp-main.300 In this work, we focus on answering deep questions over financial data, ***** aiming ***** to automate the analysis of a large corpus of financial documents. | ||
| L08-1117 We address the problem of evaluation metrics of such information, ***** aiming ***** at fair comparisons between systems, by proposing some measures taking into account the globality of a text. | ||
| 2020.lrec-1.144 We present a work ***** aiming ***** to generate adapted content for dyslexic children for French, in the context of the ALECTOR project. | ||
| P18-5007 Finally, we discuss why they succeed, and when they may fail, ***** aiming ***** at providing some practical advice about deep reinforcement learning for solving real-world NLP problems | ||
| 20K | 5 | |
| 2021.emnlp-main.594 In support of this goal, we release MS2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and ***** 20K ***** summaries derived from the scientific literature. | ||
| 2020.vardial-1.13 We observe that parsing models trained on Occitan dialects achieve better results than a delexicalized model trained on other Romance languages despite the latter training corpus being much larger (***** 20K ***** vs 900K tokens). | ||
| D18-1166 It comprises of approximately ***** 20K ***** instructional recipes with multiple modalities such as titles, descriptions and aligned set of images. | ||
| 2021.acl-long.3 For experiments, we collect a large-scale Chinese dataset from Sina Weibo containing over ***** 20K ***** polls. | ||
| 2020.semeval-1.100 These corpora are comprised of ***** 20K ***** and 19K examples, respectively | ||
| VQA dataset | 5 | |
| C18-1163 To deal with this, we have created a Japanese ***** VQA dataset ***** by using crowdsourced annotation with images from the Visual Genome dataset. | ||
| 2021.naacl-main.192 Crucially, only information that is readily available in any ***** VQA dataset ***** is used to compute its scores. | ||
| D18-1164 We conduct extensive experiments on a popular ***** VQA dataset ***** and our system achieves comparable performance with the baselines, yet with added benefits of explanability and the inherent ability to further improve with higher quality explanations. | ||
| 2021.acl-short.90 To build such a framework, we create PathVQA, a ***** VQA dataset ***** with 32,795 questions asked from 4,998 pathology images. | ||
| 2020.lrec-1.678 Unfortunately, no ***** VQA dataset ***** exists that includes verb semantic information | ||
| intersection | 5 | |
| C16-1272 We leverage these insights to design an algorithm that decomposes the sentence ***** intersection ***** task into several simpler annotation tasks, facilitating the construction of a high quality dataset via crowdsourcing. | ||
| 2021.emnlp-main.694 In this paper, we propose a model that explicitly handles multiple-entity questions by implementing a new ***** intersection ***** operation, which identifies the shared elements between two sets of entities. | ||
| 2020.repl4nlp-1.4 Indeed, this representation enables the use of fuzzy set theoretic operations, such as union, ***** intersection ***** and difference. | ||
| 2000.iwpt-1.8 Range Concatenation Languages are closed both under ***** intersection ***** and complementation and these closure properties may allow to consider novel ways to describe some linguistic processings. | ||
| R17-1061 Another type of conventionalized phrases can be revealed using two factors: low entropy of phrase associations and low ***** intersection ***** of component word and phrase associations | ||
| Practically | 5 | |
| P17-1095 ***** Practically *****, this means that we may treat the lexical resources as observations under the proposed generative model. | ||
| 2021.newsum-1.15 ***** Practically *****, more data is better at generalizing the training patterns to unseen data. | ||
| L12-1354 ***** Practically ***** all approaches solved this problem with machine learning methods. | ||
| 2020.aacl-main.25 ***** Practically *****, we conduct extensive experiments by varying which tokens to smooth, tuning the probability mass to be deducted from the true targets and considering different prior distributions. | ||
| 2021.naacl-main.266 ***** Practically *****, some combinations of slot values can be invalid according to external knowledge | ||
| conventionally | 5 | |
| W18-4301 What is often overlooked in the construction of a global interpretation of a narrative is the role contributed by the objects participating in these structures, and the latent events and activities ***** conventionally ***** associated with them. | ||
| 2021.mtsummit-research.11 Our proposed solution achieves significant performance improvement over UNMT models that train ***** conventionally *****. | ||
| 2021.acl-long.248 These strategies ***** conventionally ***** aim to recognize coarse-grained entity types with few examples, while in practice, most unseen entity types are fine-grained. | ||
| 2021.emnlp-main.595 Image captioning has ***** conventionally ***** relied on reference-based automatic evaluations, where machine captions are compared against captions written by humans. | ||
| 2020.emnlp-main.676 Stock movements are influenced by varied factors beyond the ***** conventionally ***** studied historical prices, such as social media and correlations among stocks | ||
| 20k | 5 | |
| D18-1117 Our VideoStory captions dataset is complementary to prior work and contains ***** 20k ***** videos posted publicly on a social media platform amounting to 396 hours of video with 123k sentences, temporally aligned to the video. | ||
| 2020.emnlp-main.253 To address this we introduce a new corpus called COMETA, consisting of ***** 20k ***** English biomedical entity mentions from Reddit expert-annotated with links to SNOMED CT, a widely-used medical knowledge graph. | ||
| 2020.emnlp-main.408 We release a new session-based, compositional task-oriented parsing dataset of ***** 20k ***** sessions consisting of 60k utterances. | ||
| 2020.law-1.9 This paper reports on the harvesting, analysis, and enrichment of ***** 20k ***** documents from 4 different endangered language archives in 300 different low-resource languages. | ||
| 2021.emnlp-main.340 Experimental results show that, trained with only ***** 20k ***** English Wikipedia-based synthetic QA pairs, the QA model substantially outperforms previous unsupervised models on three in-domain datasets (SQuAD1.1, Natural Questions, TriviaQA) and three out-of-domain datasets (NewsQA, BioASQ, DuoRC), demonstrating the transferability of the approach | ||
| endowing | 5 | |
| 2020.acl-tutorials.7 Yet, ***** endowing ***** machines with such human-like commonsense reasoning capabilities has remained an elusive goal of artificial intelligence research for decades. | ||
| 2021.acl-long.553 Our experiments also show that neither the adjective itself nor its taxonomic class suffice in determining the correct plausibility judgement, emphasizing the importance of ***** endowing ***** automatic natural language understanding systems with more context sensitivity and common-sense reasoning. | ||
| P19-1369 In this paper, we take a radical step towards building a human-like conversational agent: ***** endowing ***** it with the ability of proactively leading the conversation (introducing a new topic or maintaining the current topic). | ||
| 2020.findings-emnlp.238 In the context of chit-chat dialogues it has been shown that ***** endowing ***** systems with a persona profile is important to produce more coherent and meaningful conversations. | ||
| 2021.naacl-main.68 The geometry of the boxes allows for efficient calculation of intersections and volumes, ***** endowing ***** the model with calibrated probabilistic semantics and facilitating the incorporation of relational constraints | ||
| verifying | 5 | |
| 2020.privatenlp-1.3 Given the size of app marketplaces, ***** verifying ***** compliance with such regulations is a tedious task. | ||
| 2016.gwc-1.44 This paper presents our first attempt at ***** verifying ***** integrity constraints of our openWordnet-PT against the ontology for Wordnets encoding. | ||
| W17-4114 The contributions of this work are (1) ***** verifying ***** the effectiveness of the state-of-the-art NER model for Japanese, (2) proposing a neural model for predicting a tag for each character using word and character information. | ||
| 2020.findings-emnlp.262 In this paper, we propose to enhance neural content planning by (1) understanding data values with contextual numerical value representations that bring the sense of value comparison into content planning; (2) ***** verifying ***** the importance and ordering of the selected sequence of records with policy gradient. | ||
| 2020.acl-main.549 Fact checking is a challenging task because ***** verifying ***** the truthfulness of a claim requires reasoning about multiple retrievable evidence | ||
| Alternatively | 5 | |
| W19-5027 ***** Alternatively *****, one can require a large amount of domain-specific QA data, but such data are rare, especially for the medical domain. | ||
| 2020.findings-emnlp.203 ***** Alternatively *****, one can try to avoid hallucinations by verifying that any specific entities in the summary appear in the original text in a similar context. | ||
| W17-1505 ***** Alternatively *****, to address the lack of diversity of mentions in the MT hypotheses, we focus on mention pairs and integrate their coreference scores with MT ones, resulting in post-editing decisions for mentions. | ||
| 2021.ranlp-1.47 ***** Alternatively *****, less aggressive online learning setups may preserve model stability, at the cost of reduced adaptation to user-generated corrections. | ||
| 2021.naacl-srw.9 ***** Alternatively *****, knowledge graphs can be constructed and queried to answer users' questions | ||
| establishing | 5 | |
| 2001.mtsummit-road.2 We propose a program of research which has as its goal ***** establishing ***** a framework and methodology for investigating the pragmatic aspects of the translation process and implementing a computational platform for carrying out systematic experiments on the pragmatics of translation. | ||
| W17-2410 We provide three graph construction methods ***** establishing ***** an edge from a given vertex to a preceding adjacent vertex, to a single similar vertex, or to multiple similar vertices. | ||
| 2021.eacl-main.64 By utilizing all annotated data, our model can boost the performance of a standard sequence-to-sequence model by over 5 BLEU points, ***** establishing ***** a new state-of-the-art on both datasets. | ||
| 2020.emnlp-main.357 Extensive results show that SSCR improves the correctness of ILBIE in terms of both object identity and position, ***** establishing ***** a new state of the art (SOTA) on two ILBIE datasets (i-CLEVR and CoDraw). | ||
| 2021.eacl-main.25 We demonstrate the utility of our framework by: (a) ***** establishing ***** best practices for eliciting diversity judgments from humans, (b) showing that humans substantially outperform automatic metrics in estimating content diversity, and (c) demonstrating that existing methods for controlling diversity by tuning a “decoding parameter” mostly affect form but not meaning | ||
| scenario | 5 | |
| W19-3410 We introduce the task of ***** scenario ***** detection, in which we identify references to scripts. | ||
| L06-1176 Surveys of real life terminology work have been conducted and these surveys have resulted in identification of ***** scenario ***** specific best practice descriptions of terminology work. | ||
| 2020.alta-1.20 In today's world of technology and automation, Natural language processing tools have benefited from growing access to data in order to analyze the context and ***** scenario *****. | ||
| 2005.mtsummit-invited.2 However, by appropriately choosing domain and ***** scenario *****, current MT technologies are successfully integrated in the pilot system. | ||
| 2020.lrec-1.66 We present a crowdsourcing method for eliciting natural-language commands containing temporal expressions for an AI voice assistant, by using pictures and ***** scenario ***** descriptions | ||
| effectiveness | 5 | |
| 2020.findings-emnlp.151 Recent studies on domain-specific BERT models show that ***** effectiveness ***** on downstream tasks can be improved when models are pretrained on in-domain data. | ||
| 2021.emnlp-main.668 To show our method's ***** effectiveness *****, we conduct extensive experiments on cross-lingual inference and review classification tasks across various languages. | ||
| 2021.eacl-main.135 Experiments performed on the SemEval2018 multi-label emotion data over three language sets (i.e., English, Arabic and Spanish) demonstrate our method's ***** effectiveness *****. | ||
| 2020.acl-main.491 Clear evidence of method ***** effectiveness ***** is found in very few cases: LIME improves simulatability in tabular classification, and our Prototype method is effective in counterfactual simulation tests. | ||
| 2020.ecomnlp-1.9 To generate a slogan, we apply an encoder–decoder model which has shown ***** effectiveness ***** in many kinds of natural language generation tasks, such as abstractive summarization | ||
| Correctly | 5 | |
| 2020.acl-main.418 ***** Correctly ***** resolving textual mentions of people fundamentally entails making inferences about those people. | ||
| 2021.emnlp-main.683 ***** Correctly ***** ordering the sentences requires an understanding of coherence with respect to the chronological sequence of events described in the text. | ||
| 2020.law-1.13 ***** Correctly ***** identifying these relations is therefore a crucial step in automatically answering MSQs. | ||
| 2021.cl-3.19 ***** Correctly ***** resolving textual mentions of people fundamentally entails making inferences about those people. | ||
| 2020.rdsm-1.4 ***** Correctly ***** classifying stances of replies can be significantly helpful for the automatic detection and classification of online rumours | ||
| 5K | 5 | |
| 2021.emnlp-main.463 On small datasets with less than ***** 5K ***** training examples, we get a gain of 1.82% in performance with additional pre-training for only 5% steps compared to the originally pre-trained models. | ||
| 2020.emnlp-main.376 Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for ***** 5K ***** distinct sentence pairs in the English dative alternation. | ||
| 2021.emnlp-main.843 Moreover, structurally-diverse sampling achieves these improvements with as few as ***** 5K ***** examples, compared to 1M examples when sampling uniformly at random – a 200x improvement in data efficiency. | ||
| 2020.acl-main.142 To answer these questions, we first validate the most challenging ***** 5K ***** examples in the development and test sets using trained annotators. | ||
| 2020.acl-main.653 MLQA has over 12K instances in English and ***** 5K ***** in each other language, with each instance parallel between 4 languages on average | ||
| Theoretically | 5 | |
| 2014.amta-researchers.6 ***** Theoretically *****, using this kind of data to train SMT systems is likely to reinforce the errors committed by other systems, or even by earlier versions of the same system. | ||
| 2020.acl-main.524 ***** Theoretically *****, the resulting distinct feature distributions for each entity type make it more powerful for cross-domain transfer. | ||
| 2020.aacl-main.25 ***** Theoretically *****, we derive and explain exactly what label smoothing is optimizing for. | ||
| 2021.naacl-main.85 ***** Theoretically *****, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. | ||
| 2021.naacl-main.302 ***** Theoretically *****, we demonstrate that these three dropouts play different roles from regularization perspectives | ||
| CASE | 5 | |
| 2021.case-1.25 This particular method is responsive to all three Subtasks of Task 2, Fine-Grained Classification of Socio-Political Events, introduced at the ***** CASE ***** workshop of ACL-IJCNLP 2021. | ||
| 2021.case-1.19 We participated in the ***** CASE ***** shared task at ACL-IJCNLP 2021. | ||
| 2021.case-1.26 We present our submission to Task 2 of the Socio-political and Crisis Events Detection Shared Task at the ***** CASE ***** @ | ||
| 2021.case-1.11 Benchmarking state-of-the-art text classification and information extraction systems in multilingual, cross-lingual, few-shot, and zero-shot settings for socio-political event information collection is achieved in the scope of the shared task Socio-political and Crisis Events Detection at the workshop ***** CASE ***** @ | ||
| 2021.case-1.20 In this paper, we present our submission to the Shared Tasks on Socio-Political and Crisis Events Detection, Task 1, Multilingual Protest News Detection, Subtask 2, Event Sentence Classification, of ***** CASE ***** | ||
| goal-oriented | 5 | |
| N19-2027 Neural approaches to Natural Language Generation (NLG) have been promising for *****goal-oriented***** dialogue. | ||
| P19-1646 The ability to engage in *****goal-oriented***** conversations has allowed humans to gain knowledge, reduce uncertainty, and perform tasks more efficiently. | ||
| 2021.emnlp-main.498 We propose MultiDoc2Dial, a new task and dataset on modeling *****goal-oriented***** dialogues grounded in multiple documents. | ||
| 2020.emnlp-main.652 We introduce doc2dial, a new dataset of *****goal-oriented***** dialogues that are grounded in the associated documents. | ||
| W19-5918 Learning an efficient manager of dialogue agent from data with little manual intervention is important, especially for *****goal-oriented***** dialogues. | ||
| automatic post-editing (APE | 5 | |
| 2020.coling-main.524 In *****automatic post-editing (APE*****) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input. | ||
| E17-2056 We present a second-stage machine translation (MT) system based on a neural machine translation (NMT) approach to *****automatic post-editing (APE*****) that improves the translation quality provided by a first-stage MT system. | ||
| C16-1241 In this paper we combine two strands of machine translation (MT) research: *****automatic post-editing (APE*****) and multi-engine (system combination) MT. | ||
| W17-5705 Aiming at facilitating the research on quality estimation (QE) and *****automatic post-editing (APE*****) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations. | ||
| 2020.wmt-1.82 This paper describes POSTECH-ETRI's submission to WMT2020 for the shared task on *****automatic post-editing (APE*****) for 2 language pairs: English-German (En-De) and English-Chinese (En-Zh). | ||
| semantic text | 5 | |
| C16-1272 Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and *****semantic text***** similarity. | ||
| K18-1050 Entity Linking (EL) is an essential task for *****semantic text***** understanding and information extraction. | ||
| D19-1272 In this paper, we present a novel method for measurably adjusting the semantics of text while preserving its sentiment and fluency, a task we call *****semantic text***** exchange. | ||
| L12-1233 The aim of this paper is to present a system for *****semantic text***** annotation called Inforex. | ||
| 2021.ranlp-1.120 Unsupervised representation learning of words from large multilingual corpora is useful for downstream tasks such as word sense disambiguation, *****semantic text***** similarity, and information retrieval. | ||
| Abstractive text | 5 | |
| K19-1078 *****Abstractive text***** summarization aims at generating human-like summaries by understanding and paraphrasing the given input content. | ||
| 2021.eacl-main.220 *****Abstractive text***** summarization aims at compressing the information of a long source document into a rephrased, condensed summary. | ||
| 2021.emnlp-main.741 *****Abstractive text***** summarization is one of the areas influenced by the emergence of pre-trained language models. | ||
| D18-1207 *****Abstractive text***** summarization aims to shorten long text documents into a human readable form that contains the most important facts from the original document. | ||
| N18-2102 *****Abstractive text***** summarization is the task of compressing and rewriting a long document into a short summary while maintaining saliency, directed logical entailment, and non-redundancy. | ||
| Mental health | 5 | |
| 2021.acl-long.322 *****Mental health***** conditions remain underdiagnosed even in countries with common access to advanced medical care. | ||
| W18-0607 *****Mental health***** problems represent a major public health challenge. | ||
| P19-2003 *****Mental health***** research can benefit increasingly fruitfully from computational linguistics methods, given the abundant availability of language data in the internet and advances of computational tools. | ||
| P19-1089 *****Mental health***** counseling is an enterprise with profound societal importance where conversations play a primary role. | ||
| W18-0606 *****Mental health***** forums are online spaces where people can share their experiences anonymously and get peer support. | ||
| automatic term | 5 | |
| R19-1117 Traditional approaches to *****automatic term***** extraction do not rely on machine learning (ML) and select the top n ranked candidate terms or candidate terms above a certain predefined cut-off point, based on a limited number of linguistic and statistical clues. | ||
| 2020.acl-main.258 While *****automatic term***** extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied. | ||
| L14-1703 In this paper, we propose a method that combines the principles of *****automatic term***** recognition and the distributional hypothesis to identify technology terms from a corpus of scientific publications. | ||
| 2020.lrec-1.540 We perform a comparative study for *****automatic term***** extraction from domain-specific language using a PageRank model with different edge-weighting methods. | ||
| L06-1238 Algorithms for *****automatic term***** extraction in a specific domain should consider at least two issues, namely Unithood and Termhood (Kageura, 1996). | ||
| Code-mixed | 5 | |
| 2021.nlp4convai-1.26 *****Code-mixed***** language plays a crucial role in communication in multilingual societies. | ||
| 2020.calcs-1.4 *****Code-mixed***** texts are abundant, especially in social media, and pose a problem for NLP tools, which are typically trained on monolingual corpora. | ||
| 2021.ranlp-srw.2 *****Code-mixed***** language plays a crucial role in communication in multilingual societies. | ||
| 2019.icon-1.17 *****Code-mixed***** texts are widespread nowadays due to the advent of social media. | ||
| 2021.calcs-1.7 *****Code-mixed***** languages are very popular in multilingual societies around the world, yet the resources lag behind to enable robust systems on such languages. | ||
| Multilingual BERT (mBERT | 5 | |
| 2020.repl4nlp-1.16 *****Multilingual BERT (mBERT*****) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks, even without explicit cross-lingual signals. | ||
| 2021.eacl-main.215 We investigate how *****Multilingual BERT (mBERT*****) encodes grammar by examining how the high-order grammatical feature of morphosyntactic alignment (how different languages define what counts as a subject) is manifested across the embedding spaces of different languages. | ||
| 2020.findings-emnlp.83 *****Multilingual BERT (mBERT*****) has shown reasonable capability for zero-shot cross-lingual transfer when fine-tuned on downstream tasks. | ||
| 2020.emnlp-main.362 *****Multilingual BERT (mBERT*****), XLM-RoBERTa (XLMR) and other unsupervised multilingual encoders can effectively learn cross-lingual representation. | ||
| 2020.acl-main.493 Recent work has found evidence that *****Multilingual BERT (mBERT*****), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. | ||
| syntax-based | 5 | |
| R19-1063 This paper describes a novel, *****syntax-based***** system for automatic detection and resolution of Noun Phrase Ellipsis (NPE) in English. | ||
| W17-1724 This paper aims at assessing to what extent a *****syntax-based***** method (Recurring Lexico-syntactic Trees (RLT) extraction) allows us to extract large phraseological units such as prefabricated routines, e.g. | ||
| D18-1189 Recent research proposes *****syntax-based***** approaches to address the problem of generating programs from natural language specifications. | ||
| W19-3648 We describe work in progress for evaluating performance of sequence-to-sequence neural networks on the task of *****syntax-based***** reordering for rules applicable to simultaneous machine translation. | ||
| 2008.jeptalnrecital-court.14 We consider the value of replacing and/or combining string-based methods with *****syntax-based***** methods for phrase-based statistical machine translation (PBSMT), and we also consider the relative merits of using constituency-annotated vs. dependency-annotated training data. | ||
| bag-of-words | 5 | |
| S19-2157 In the effort to tackle the challenge of Hyperpartisan News Detection, i.e., the task of deciding whether a news article is biased towards one party, faction, cause, or person, we experimented with two systems: i) a standard supervised learning approach using superficial text and *****bag-of-words***** features from the article title and body, and ii) a deep learning system comprising a four-layer convolutional neural network and max-pooling layers after the embedding layer, feeding the consolidated features to a bi-directional recurrent neural network. | ||
| C18-1226 We evaluated various compositional models, from *****bag-of-words***** representations to compositional RNN-based models, on several extrinsic supervised and unsupervised evaluation benchmarks. | ||
| 2020.bea-1.15 Most natural language processing research now recommends large Transformer-based models with fine-tuning for supervised classification tasks; older strategies like *****bag-of-words***** features and linear models have fallen out of favor. | ||
| 2021.naacl-main.243 Neural topic models can augment or replace *****bag-of-words***** inputs with the learned representations of deep pre-trained transformer-based word prediction models. | ||
| 2021.ranlp-1.93 We present a neural-network-driven model for annotating frustration intensity in customer support tweets, based on representing tweet texts using a *****bag-of-words***** encoding after processing with subword segmentation together with non-lexical features. | ||
| Multi-task | 5 | |
| D18-1484 *****Multi-task***** learning in text classification leverages implicit correlations among related tasks to extract common features and yield performance gains. | ||
| W17-4110 *****Multi-task***** training is an effective method to mitigate the data sparsity problem. | ||
| D18-1486 *****Multi-task***** learning has an ability to share the knowledge among related tasks and implicitly increase the training data. | ||
| 2020.socialnlp-1.8 Two prevalent transfer learning approaches are used in recent works to improve neural networks performance for domains with small amounts of annotated data: *****Multi-task***** learning, which involves training the task of interest with related auxiliary tasks to exploit their underlying similarities, and Mono-task fine-tuning, where the weights of the model are initialized with the pretrained weights of a large-scale labeled source domain and then fine-tuned with labeled data of the target domain (domain of interest). | ||
| 2021.emnlp-main.451 *****Multi-task***** learning with transformer encoders (MTL) has emerged as a powerful technique to improve performance on closely-related tasks for both accuracy and efficiency, while a question still remains whether or not it would perform as well on tasks that are distinct in nature. | ||
| Probabilistic topic | 5 | |
| Q15-1022 *****Probabilistic topic***** models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. | ||
| 2020.cl-1.3 *****Probabilistic topic***** modeling is a common first step in crosslingual tasks to enable knowledge transfer and extract multilingual features. | ||
| 2021.eacl-main.209 *****Probabilistic topic***** models in low data resource scenarios are faced with less reliable estimates due to sparsity of discrete word co-occurrence counts, and do not have the luxury of retraining word or topic embeddings using neural methods. | ||
| Q17-1001 *****Probabilistic topic***** models are important tools for indexing, summarizing, and analyzing large document collections by their themes. | ||
| D19-1349 *****Probabilistic topic***** models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. | ||
| Information Retrieval (IR | 5 | |
| L16-1113 Parsing Web information, namely parsing content to find relevant documents on the basis of a user's query, represents a crucial step to guarantee fast and accurate *****Information Retrieval (IR*****). | ||
| W19-4605 Word Embeddings (WE) are getting increasingly popular and widely applied in many Natural Language Processing (NLP) applications due to their effectiveness in capturing semantic properties of words; Machine Translation (MT), *****Information Retrieval (IR*****) and Information Extraction (IE) are among such areas. | ||
| 2020.sltu-1.49 Dense word vectors or 'word embeddings', which encode semantic properties of words, have now become integral to NLP tasks like Machine Translation (MT), Question Answering (QA), Word Sense Disambiguation (WSD), and *****Information Retrieval (IR*****). | ||
| L08-1252 Discovering relations among Named Entities (NEs) from large corpora is both a challenging, as well as useful task in the domain of Natural Language Processing, with applications in *****Information Retrieval (IR*****), Summarization (SUM), Question Answering (QA) and Textual Entailment (TE). | ||
| P17-2079 We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of *****Information Retrieval (IR*****) and Sequence to Sequence (Seq2Seq) based generation models. | ||
| greedy | 5 | |
| P17-2058 We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to *****greedy***** decoding in sequence-to-sequence (seq2seq) models. | ||
| N18-1157 Submodular maximization with the *****greedy***** algorithm has been studied as an effective approach to extractive summarization. | ||
| L16-1116 Corpus design for speech synthesis is a well-researched topic in languages such as English compared to Modern Standard Arabic, and there is a tendency to focus on methods to automatically generate the orthographic transcript to be recorded (usually *****greedy***** methods). | ||
| D18-1342 Beam search is widely used in neural machine translation, and usually improves translation quality compared to *****greedy***** search. | ||
| D18-1035 Beam search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple *****greedy***** decoding on tasks like machine translation. | ||
| Latent Dirichlet Allocation (LDA | 5 | |
| 2020.acl-main.32 Recent years have witnessed a surge of interest in using neural topic models for automatic topic extraction from text, since they avoid the complicated mathematical derivations for model inference as in traditional topic models such as *****Latent Dirichlet Allocation (LDA*****). | ||
| P17-2084 Topical PageRank (TPR) uses latent topic distribution inferred by *****Latent Dirichlet Allocation (LDA*****) to perform ranking of noun phrases extracted from documents. | ||
| Q17-1037 While generative models such as *****Latent Dirichlet Allocation (LDA*****) have proven fruitful in topic modeling, they often require detailed assumptions and careful specification of hyperparameters. | ||
| C16-1211 *****Latent Dirichlet Allocation (LDA*****) and its variants have been widely used to discover latent topics in textual documents. | ||
| C16-1166 The exchangeability assumption in topic models like *****Latent Dirichlet Allocation (LDA*****) often results in inferring inconsistent topics for the words of text spans like noun-phrases, which are usually expected to be topically coherent. | ||
| text-to-speech | 5 | |
| 2020.lrec-1.801 In this paper we present a multidialectal corpus approach for building a *****text-to-speech***** voice for a new dialect in a language with existing resources, focusing on various South American dialects of Spanish. | ||
| W18-2414 Grapheme-to-phoneme models are key components in automatic speech recognition and *****text-to-speech***** systems. | ||
| L10-1063 This presentation and accompanying demonstration focuses on the development of a mobile platform for e-learning purposes with enhanced *****text-to-speech***** capabilities. | ||
| E17-3009 Voice enabled human computer interfaces (HCI) that integrate automatic speech recognition, *****text-to-speech***** synthesis and natural language understanding have become a commodity, introduced by the immersion of smart phones and other gadgets in our daily lives. | ||
| 1997.iwpt-1.4 Intonational information is frequently discarded in speech recognition, and assigned by default heuristics in *****text-to-speech***** generation. | ||
| PARSEME shared | 5 | |
| 2020.mwe-1.14 We present edition 1.2 of the *****PARSEME shared***** task on identification of verbal multiword expressions (VMWEs). | ||
| W19-5121 Recent initiatives such as the *****PARSEME shared***** task allowed the rapid development of MWE identification systems. | ||
| 2020.mwe-1.20 In this paper, we present MultiVitaminBooster, a system implemented for the *****PARSEME shared***** task on semi-supervised identification of verbal multiword expressions - edition 1.2. | ||
| 2020.mwe-1.16 We describe the Seen2Unseen system that participated in edition 1.2 of the *****PARSEME shared***** task on automatic identification of verbal multiword expressions (VMWEs). | ||
| W18-4932 We describe the VarIDE system (standing for Variant IDEntification) which participated in the edition 1.1 of the *****PARSEME shared***** task on automatic identification of verbal multiword expressions (VMWEs). | ||
| Mental | 5 | |
| Q16-1033 *****Mental***** illness is one of the most pressing public health issues of our time. | ||
| R19-1077 *****Mental***** health is one of the main concerns of today's society. | ||
| 2021.ranlp-1.41 *****Mental***** health is getting more and more attention recently, depression being a very common illness nowadays, but also other disorders like anxiety, obsessive-compulsive disorders, feeding disorders, autism, or attention-deficit/hyperactivity disorders. | ||
| C18-1126 *****Mental***** health is a significant and growing public health concern. | ||
| D19-5542 *****Mental***** health poses a significant challenge for an individual's well-being. | ||
| Entity Linking (EL | 5 | |
| 2020.emnlp-main.253 Whilst there has been growing progress in *****Entity Linking (EL*****) for general language, existing datasets fail to address the complex nature of health terminology in layman's language. | ||
| K18-1050 *****Entity Linking (EL*****) is an essential task for semantic text understanding and information extraction. | ||
| 2021.naacl-industry.25 Named Entity Recognition (NER) and *****Entity Linking (EL*****) play an essential role in voice assistant interaction, but are challenging due to the special difficulties associated with spoken user queries. | ||
| Q14-1019 *****Entity Linking (EL*****) and Word Sense Disambiguation (WSD) both address the lexical ambiguity of language. | ||
| L16-1528 In this paper we present a gold standard dataset for *****Entity Linking (EL*****) in the Music Domain. | ||
| Style | 5 | |
| P18-1080 *****Style***** transfer is the task of rephrasing the text to contain specific stylistic properties without changing the intent or affect within the context. | ||
| 2021.naacl-main.275 *****Style***** transfer has been widely explored in natural language generation with non-parallel corpus by directly or indirectly extracting a notion of style from source and target domain corpus. | ||
| N18-1012 *****Style***** transfer is the task of automatically transforming a piece of text in one particular style into another. | ||
| 2021.emnlp-main.349 *****Style***** transfer aims to rewrite a source text in a different target style while preserving its content. | ||
| L04-1244 *****Style***** guides or writing recommendations play an important role in the field of technical documentation production, e.g. | ||
| Cross-domain sentiment | 5 | |
| D19-1558 *****Cross-domain sentiment***** classification has drawn much attention in recent years. | ||
| N19-1258 *****Cross-domain sentiment***** classification aims to predict sentiment polarity on a target domain utilizing a classifier learned from a source domain. | ||
| 2020.acl-main.370 *****Cross-domain sentiment***** classification aims to address the lack of massive amounts of labeled data. | ||
| 2020.coling-main.22 *****Cross-domain sentiment***** analysis is currently a hot topic in both the research and industrial areas. | ||
| 2020.acl-main.292 *****Cross-domain sentiment***** analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis. | ||
| Variational Autoencoder (VAE | 5 | |
| W19-8673 *****Variational Autoencoder (VAE*****) is a powerful method for learning representations of high-dimensional data. | ||
| 2020.emnlp-main.378 When trained effectively, the *****Variational Autoencoder (VAE*****) can be both a powerful generative model and an effective representation learning framework for natural language. | ||
| D19-1370 When trained effectively, the *****Variational Autoencoder (VAE*****) is both a powerful language model and an effective representation learning framework. | ||
| 2020.coling-main.216 The *****Variational Autoencoder (VAE*****) is a popular and powerful model applied to text modelling to generate diverse sentences. | ||
| 2020.acl-main.235 *****Variational Autoencoder (VAE*****) is widely used as a generative model to approximate a model's posterior on latent variables by combining the amortized variational inference and deep neural networks. | ||
| Natural language processing (NLP) | 5 | |
| W17-1604 *****Natural language processing (NLP)***** systems analyze and/or generate human language, typically on users' behalf. | ||
| 2021.naacl-main.161 *****Natural language processing (NLP)***** tasks, ranging from text classification to text generation, have been revolutionised by the pretrained language models, such as BERT. | ||
| 2021.naacl-main.49 *****Natural language processing (NLP)***** applications are now more powerful and ubiquitous than ever before. | ||
| 2021.naacl-main.325 *****Natural language processing (NLP)***** research combines the study of universal principles, through basic science, with applied science targeting specific use cases and settings. | ||
| 2021.eacl-main.314 *****Natural language processing (NLP)***** tasks (e.g. | ||
| demographic | 5 | |
| 2021.eacl-main.268 Affect preferences vary with user demographics, and tapping into *****demographic***** information provides important cues about the users' language preferences. | ||
| 2021.gebnlp-1.8 In this work we explore the effect of incorporating *****demographic***** metadata in a text classifier trained on top of a pre-trained transformer language model. | ||
| P17-2075 In social media, *****demographic***** inference is a critical task in order to gain a better understanding of a cohort and to facilitate interacting with one's audience. | ||
| 2020.vardial-1.27 Identifying a user's location can be useful for recommendation systems, *****demographic***** analyses, and disaster outbreak monitoring. | ||
| S19-1015 Language use varies across different *****demographic***** factors, such as gender, age, and geographic location. | ||
| Dialogue state | 5 | |
| D18-1299 *****Dialogue state***** tracker is the core part of a spoken dialogue system. | ||
| W19-5905 *****Dialogue state***** tracking is an important component in task-oriented dialogue systems to identify users' goals and requests as a dialogue proceeds. | ||
| 2020.acl-main.567 *****Dialogue state***** tracker is responsible for inferring user intentions through dialogue history. | ||
| P18-1135 *****Dialogue state***** tracking, which estimates user goals and requests given the dialogue context, is an essential part of task-oriented dialogue systems. | ||
| W19-5910 *****Dialogue state***** tracking requires the population and maintenance of a multi-slot frame representation of the dialogue state. | ||
| visual question answering (VQA | 5 | |
| 2021.ccl-1.92 The predominant approach of *****visual question answering (VQA*****) relies on encoding the image and question with a black box neural encoder and decoding a single token into answers such as yes or no. | ||
| 2020.acl-main.683 This work deals with the challenge of learning and reasoning over language and vision data for the related downstream tasks such as *****visual question answering (VQA*****) and natural language for visual reasoning (NLVR). | ||
| 2021.conll-1.3 We present VQA-MHUG, a novel 49-participant dataset of multimodal human gaze on both images and questions during *****visual question answering (VQA*****), collected using a high-speed eye tracker. | ||
| 2021.naacl-main.289 Most existing research on *****visual question answering (VQA*****) is limited to information explicitly present in an image or a video. | ||
| 2020.findings-emnlp.34 We study the problem of *****visual question answering (VQA*****) in images by exploiting supervised domain adaptation, where there is a large amount of labeled data in the source domain but only limited labeled data in the target domain, with the goal to train a good target model. | ||
| Community Question Answering | 5 | |
| S19-2149 We present SemEval-2019 Task 8 on Fact Checking in *****Community Question Answering***** Forums , which features two subtasks . | ||
| R19-1070 *****Community Question Answering***** forums are popular among Internet users , and a basic problem they encounter is trying to find out if their question has already been posed before . | ||
| W16-3909 Name Variation in ***** Community Question Answering ***** Systems: Community question answering systems are forums where users can ask and answer questions in various categories. | ||
| S19-2200 *****Community Question Answering***** forums are very popular nowadays , as they represent effective means for communities to share information around particular topics . | ||
| S19-2150 Fact checking is an important task for maintaining high quality posts and improving user experience in *****Community Question Answering***** forums . | ||
| neural language models ( LMs | 5 | |
| 2020.acl-main.47 We examine a methodology using *****neural language models ( LMs***** ) for analyzing the word order of language . | ||
| N19-1417 In recent years *****neural language models ( LMs***** ) have set the state - of - the - art performance for several benchmarking datasets . | ||
| 2020.blackboxnlp-1.3 Recently , *****neural language models ( LMs***** ) have demonstrated impressive abilities in generating high - quality discourse . | ||
| 2020.cl-2.8 Recent developments in *****neural language models ( LMs***** ) have raised concerns about their potential misuse for automatically spreading misinformation . | ||
| 2020.deelio-1.5 Following the major success of *****neural language models ( LMs***** ) such as BERT or GPT-2 on a variety of language understanding tasks , recent work focused on injecting ( structured ) knowledge from external resources into these models . | ||
| Translation Memory ( TM ) | 5 | |
| 2010.amta-papers.19 With the steadily increasing demand for high - quality translation , the localisation industry is constantly searching for technologies that would increase translator throughput , in particular focusing on the use of high - quality Statistical Machine Translation ( SMT ) supplementing the established *****Translation Memory ( TM )***** technology . | ||
| 2021.ranlp-1.78 *****Translation Memory ( TM )***** system , a major component of computer - assisted translation ( CAT ) , is widely used to improve human translators ' productivity by making effective use of previously translated resource . | ||
| 2010.amta-papers.27 We report findings from a user study with professional post - editors using a translation recommendation framework ( He et al . , 2010 ) to integrate Statistical Machine Translation ( SMT ) output with *****Translation Memory ( TM )***** systems . | ||
| 2008.amta-srw.4 More and more *****Translation Memory ( TM )***** systems nowadays are fortified with machine translation ( MT ) techniques to enable them to propose a translation to the translator when no match is found in his TM resources . | ||
| 2001.mtsummit-eval.1 Following the guidelines for MT evaluation proposed in the ISLE taxonomy , this paper presents considerations and procedures for evaluating the integration of machine - translated segments into a larger translation workflow with *****Translation Memory ( TM )***** systems . | ||
| Self - | 5 | |
| 2021.naacl-main.347 *****Self -***** disclosure in online health conversations may offer a host of benefits , including earlier detection and treatment of medical issues that may have otherwise gone unaddressed . | ||
| W18-5030 *****Self -***** disclosure is a key social strategy employed in conversation to build relations and increase conversational depth . | ||
| 2021.acl-long.221 *****Self -***** training has proven effective for improving NMT performance by augmenting model training with synthetic parallel data . | ||
| N19-1003 *****Self -***** training is a semi - supervised learning approach for utilizing unlabeled data to create better learners . | ||
| 2021.tacl-1.4 *****Self -***** attention has recently been adopted for a wide range of sequence modeling problems . | ||
| Distributed word | 5 | |
| C16-1227 *****Distributed word***** representation is an efficient method for capturing semantic and syntactic word relations . | ||
| C18-1140 *****Distributed word***** embeddings have shown superior performances in numerous Natural Language Processing ( NLP ) tasks . | ||
| P17-2070 *****Distributed word***** representations are widely used for modeling words in NLP tasks . | ||
| 2020.sltu-1.13 *****Distributed word***** embeddings have become ubiquitous in natural language processing as they have been shown to improve performance in many semantic and syntactic tasks . | ||
| C18-1172 *****Distributed word***** representation plays a pivotal role in various natural language processing tasks . | ||
| Visual Question Answering ( VQA ) | 5 | |
| 2020.nlpbt-1.6 In the majority of the existing *****Visual Question Answering ( VQA )***** research , the answers consist of short , often single words , as per instructions given to the annotators during dataset construction . | ||
| 2021.acl-short.60 *****Visual Question Answering ( VQA )***** methods aim at leveraging visual input to answer questions that may require complex reasoning over entities . | ||
| D17-1097 In this paper , we make a simple observation that questions about images often contain premises objects and relationships implied by the question and that reasoning about premises can help *****Visual Question Answering ( VQA )***** models respond more intelligently to irrelevant or previously unseen questions . | ||
| 2020.acl-main.727 Existing *****Visual Question Answering ( VQA )***** methods tend to exploit dataset biases and spurious statistical correlations , instead of producing right answers for the right reasons . | ||
| 2021.reinact-1.5 *****Visual Question Answering ( VQA )***** systems are increasingly adept at a variety of tasks , and this technology can be used to assist blind and partially sighted people . | ||
| knowledge graph ( KG ) | 5 | |
| 2020.emnlp-main.688 Walk - based models have shown their advantages in *****knowledge graph ( KG )***** reasoning by achieving decent performance while providing interpretable decisions . | ||
| 2020.coling-main.153 As research on utilizing human knowledge in natural language processing has attracted considerable attention in recent years , *****knowledge graph ( KG )***** completion has come into the spotlight . | ||
| D19-1368 Bilinear models such as DistMult and ComplEx are effective methods for *****knowledge graph ( KG )***** completion . | ||
| 2020.emnlp-main.459 Multi - hop reasoning has been widely studied in recent years to seek an effective and interpretable method for *****knowledge graph ( KG )***** completion . | ||
| 2021.naacl-main.278 Prior work on Data - To - Text Generation , the task of converting *****knowledge graph ( KG )***** triples into natural text , focused on domain - specific benchmark datasets . | ||
| Named Entities ( NEs | 5 | |
| 2011.freeopmt-1.7 This paper proposes to enrich RBMT dictionaries with *****Named Entities ( NEs***** ) automatically acquired from Wikipedia . | ||
| L08-1052 This paper presents the automatic extension of Princeton WordNet with *****Named Entities ( NEs***** ) . | ||
| L08-1252 Discovering relations among *****Named Entities ( NEs***** ) from large corpora is both a challenging and useful task in the domain of Natural Language Processing , with applications in Information Retrieval ( IR ) , Summarization ( SUM ) , Question Answering ( QA ) and Textual Entailment ( TE ) . | ||
| L12-1139 *****Named Entities ( NEs***** ) that occur in natural language text are important especially due to the advent of social media , and they play a critical role in the development of many natural language technologies . | ||
| R19-1114 Many Natural Language Processing ( NLP ) tasks depend on using *****Named Entities ( NEs***** ) that are contained in texts and in external knowledge sources . | ||
| Sign Language | 5 | |
| 2020.signlang-1.10 Proform constructs such as classifier predicates and size and shape specifiers are essential elements of *****Sign Language***** communication , but have remained a challenge for synthesis due to their highly variable nature . | ||
| 2020.signlang-1.30 *****Sign Language***** Recognition is a challenging research domain . | ||
| 2021.emnlp-main.405 Coreference resolution is key to many natural language processing tasks and yet has been relatively unexplored in *****Sign Language***** Processing . | ||
| 2020.signlang-1.18 This article is about a *****Sign Language***** concordancer . | ||
| 2020.signlang-1.8 This paper presents LSE_UVIGO , a multi - source database designed to foster research on *****Sign Language***** Recognition . | ||
| SemEval-2020 task | 5 | |
| 2020.semeval-1.36 In this paper , we present our system for *****SemEval-2020 task***** 3 , Predicting the ( Graded ) Effect of Context in Word Similarity . | ||
| 2020.semeval-1.33 This paper describes the system we built for *****SemEval-2020 task***** 3 . | ||
| 2020.semeval-1.228 In this paper , we show our system for *****SemEval-2020 task***** 11 , where we tackle propaganda span identification ( SI ) and technique classification ( TC ) . | ||
| 2020.semeval-1.197 This paper summarizes our studies on propaganda detection techniques for news articles in the *****SemEval-2020 task***** 11 . | ||
| 2020.semeval-1.216 This paper shows our system for *****SemEval-2020 task***** 10 , Emphasis Selection for Written Text in Visual Media . | ||
| Natural language understanding ( NLU | 5 | |
| 2020.emnlp-main.410 *****Natural language understanding ( NLU***** ) in the context of goal - oriented dialog systems typically includes intent classification and slot labeling tasks . | ||
| 2020.acl-main.163 *****Natural language understanding ( NLU***** ) and natural language generation ( NLG ) are two fundamental and related tasks in building task - oriented dialogue systems with opposite objectives : NLU tackles the transformation from natural language to formal representations , whereas NLG does the reverse . | ||
| 2020.findings-emnlp.443 *****Natural language understanding ( NLU***** ) and Natural language generation ( NLG ) tasks hold a strong dual relationship , where NLU aims at predicting semantic labels based on natural language utterances and NLG does the opposite . | ||
| P19-1545 *****Natural language understanding ( NLU***** ) and natural language generation ( NLG ) are both critical research topics in the NLP and dialogue fields . | ||
| 2020.coling-main.430 *****Natural language understanding ( NLU***** ) aims at identifying user intent and extracting semantic slots . | ||
| Ancient | 5 | |
| 2020.lt4hala-1.9 This paper describes a first attempt to automatic semantic role labeling in *****Ancient***** Greek , using a supervised machine learning approach . | ||
| 2004.jeptalnrecital-poster.23 The existence of a Dictionary in electronic form for Modern Greek ( MG ) is mandatory if one is to process MG at the morphological and syntactic levels since MG is a highly inflectional language with marked stress and a spelling system with many characteristics carried over from *****Ancient***** Greek . | ||
| 2021.gwc-1.30 This paper presents the work in progress toward the creation of a family of WordNets for Sanskrit , *****Ancient***** Greek , and Latin . | ||
| 2019.gwc-1.21 With the increasing availability of wordnets for ancient languages , such as *****Ancient***** Greek and Latin , gaps remain in the coverage of less studied languages of antiquity . | ||
| D19-1668 *****Ancient***** History relies on disciplines such as Epigraphy , the study of ancient inscribed texts , for evidence of the recorded past . | ||
| Public | 5 | |
| L12-1397 This corpus contains manual transcriptions of spoken conversations recorded in the French call - center of the Paris *****Public***** Transport Authority ( RATP ) . | ||
| 2021.eacl-main.113 *****Public***** datasets are often used to evaluate the efficacy and generalizability of state - of - the - art methods for many tasks in natural language processing ( NLP ) . | ||
| W17-5009 *****Public***** speaking plays an important role in schools and workplaces , and properly using humor contributes to effective presentations . | ||
| L14-1018 *****Public***** opinion , as measured by media sentiment , can be an important indicator in the financial and economic context . | ||
| L14-1514 *****Public***** speaking is a widely requested professional skill , and at the same time an activity that causes one of the most common adult phobias ( Miller and Stone , 2009 ) . | ||
| User - generated | 5 | |
| D19-1468 *****User - generated***** reviews can be decomposed into fine - grained segments ( e.g. , sentences , clauses ) , each evaluating a different aspect of the principal entity ( e.g. , price , quality , appearance ) . | ||
| W16-3905 *****User - generated***** content presents many challenges for its automatic processing . | ||
| N18-1087 *****User - generated***** text tends to be noisy with many lexical and orthographic inconsistencies , making natural language processing ( NLP ) tasks more challenging . | ||
| 2021.wnut-1.45 *****User - generated***** texts include various types of stylistic properties , or noises . | ||
| C18-1148 *****User - generated***** content such as the questions on community question answering ( CQA ) forums does not always come with appropriate headlines , in contrast to the news articles used in various headline generation tasks . | ||
| evaluation of | 5 | |
| 2020.lrec-1.749 These annotated results reveal that ( 1 ) writer - labeled market sentiment may be a misleading label ; ( 2 ) writer 's sentiment and market sentiment of an investor may be different ; ( 3 ) most financial tweets provide unfounded analysis results ; and ( 4 ) almost no investors write down the gain / loss results for their positions , which would otherwise greatly facilitate detailed *****evaluation of***** their performance . | ||
| L14-1592 The recent popularity of machine translation has increased the demand for the *****evaluation of***** translations . | ||
| U18-1010 Accurate *****evaluation of***** translation has long been a difficult , yet important problem . | ||
| L14-1303 In this paper , we report on the construction of a resource of Swiss legislative texts that is automatically annotated with structural , morphosyntactic and content - related information , and we discuss the exploitation of this resource for the purposes of legislative drafting , legal linguistics and translation and for the *****evaluation of***** legislation . | ||
| W17-4510 The *****evaluation of***** summaries is a challenging but crucial task of the summarization field . | ||
| Relation extraction ( RE | 5 | |
| 2020.coling-main.564 *****Relation extraction ( RE***** ) has been extensively studied due to its importance in real - world applications such as knowledge base construction and question answering . | ||
| 2020.emnlp-main.303 *****Relation extraction ( RE***** ) aims to identify the semantic relations between named entities in text . | ||
| L12-1290 *****Relation extraction ( RE***** ) is an important text mining task which is the basis for further complex and advanced tasks . | ||
| D19-1038 *****Relation extraction ( RE***** ) seeks to detect and classify semantic relationships between entities , which provides useful information for many NLP applications . | ||
| N19-1147 *****Relation extraction ( RE***** ) aims to label relations between groups of marked entities in raw text . | ||
| attention - based | 5 | |
| S17-2157 We present a neural encoder - decoder AMR parser that extends an *****attention - based***** model by predicting the alignment between graph nodes and sentence tokens explicitly with a pointer mechanism . | ||
| P19-1009 We propose an *****attention - based***** model that treats AMR parsing as sequence - to - graph transduction . | ||
| W17-5716 In this paper , we describe our neural machine translation ( NMT ) system , which is based on the *****attention - based***** NMT and uses long short - term memories ( LSTM ) as RNN . | ||
| P19-3007 The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully *****attention - based***** approach . | ||
| S18-1019 In this paper , we propose an *****attention - based***** classifier that predicts multiple emotions of a given sentence . | ||
| fact - | 5 | |
| 2020.lrec-1.161 Fact - checking information before publication has long been a core task for journalists , but recent times have seen the emergence of dedicated news items specifically aimed at *****fact -***** checking after publication . | ||
| R19-1113 Fake news detection and closely - related *****fact -***** checking have recently attracted a lot of attention . | ||
| 2021.nlp4if-1.7 In this paper , we explore the construction of natural language explanations for news claims , with the goal of assisting *****fact -***** checking and news evaluation applications . | ||
| W17-4216 Previous work on the epistemology of *****fact -***** checking indicated the dilemma between the needs of binary answers for the public and ambiguity of political discussion . | ||
| R17-1037 In the context of investigative journalism , we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for *****fact -***** checking . | ||
| Character - level | 5 | |
| W19-4811 *****Character - level***** models have been used extensively in recent years in NLP tasks as both supplements and replacements for closed - vocabulary token - level word representations . | ||
| D18-1345 *****Character - level***** patterns have been widely used as features in English Named Entity Recognition ( NER ) systems . | ||
| D19-5202 *****Character - level***** translation has been proved to be able to achieve preferable translation quality without explicit segmentation , but training a character - level model needs a lot of hardware resources . | ||
| D18-1365 *****Character - level***** features are currently used in different neural network - based natural language processing algorithms . | ||
| P18-1036 *****Character - level***** models have become a popular approach specially for their accessibility and ability to handle unseen data . | ||
| Morphologically rich | 5 | |
| 2020.emnlp-main.388 *****Morphologically rich***** languages seem to benefit from joint processing of morphology and syntax , as compared to pipeline architectures . | ||
| 2010.amta-papers.4 *****Morphologically rich***** languages pose a challenge for statistical machine translation ( SMT ) . | ||
| L08-1536 *****Morphologically rich***** languages pose a challenge to the annotators of treebanks with respect to the status of orthographic ( space - delimited ) words in the syntactic parse trees . | ||
| W18-5806 *****Morphologically rich***** languages are challenging for natural language processing tasks due to data sparsity . | ||
| P17-1006 *****Morphologically rich***** languages accentuate two properties of distributional vector space models : 1 ) the difficulty of inducing accurate representations for low - frequency word forms ; and 2 ) insensitivity to distinct lexical relations that have similar distributional signatures . | ||
| Spoken language understanding ( SLU | 5 | |
| W18-5043 *****Spoken language understanding ( SLU***** ) by using recurrent neural networks ( RNN ) achieves good performances for large training data sets , but collecting large training datasets is a challenge , especially for new voice applications . | ||
| 2021.naacl-main.152 *****Spoken language understanding ( SLU***** ) requires a model to analyze input acoustic signal to understand its linguistic content and make predictions . | ||
| 2021.naacl-industry.9 *****Spoken language understanding ( SLU***** ) extracts the intended meaning from a user utterance and is a critical component of conversational virtual agents . | ||
| N18-1194 *****Spoken language understanding ( SLU***** ) is an essential component in conversational systems . | ||
| 2020.coling-main.234 *****Spoken language understanding ( SLU***** ) , which converts user requests in natural language to machine - interpretable expressions , is becoming an essential task . | ||
| creative | 5 | |
| 2020.acl-demos.28 Building datasets of *****creative***** text , such as humor , is quite challenging . | ||
| S18-1088 The paper describes our search for a universal algorithm of detecting intentional lexical ambiguity in different forms of *****creative***** language . | ||
| 2020.gebnlp-1.9 There is a growing collection of work analyzing and mitigating societal biases in language understanding , generation , and retrieval tasks , though examining biases in *****creative***** tasks remains underexplored . | ||
| W18-1502 We examine an emerging NLP application that supports *****creative***** writing by automatically suggesting continuing sentences in a story . | ||
| P19-1572 The inability to quantify key aspects of *****creative***** language is a frequent obstacle to natural language understanding . | ||
| Contextual word | 5 | |
| D19-1005 *****Contextual word***** representations , typically trained on unstructured , unlabeled text , do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities . | ||
| D18-1179 *****Contextual word***** representations derived from pre - trained bidirectional language models ( biLMs ) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks . | ||
| D19-1138 *****Contextual word***** embeddings ( e.g. | ||
| N19-1112 *****Contextual word***** representations derived from large - scale neural language models are successful across a diverse set of NLP tasks , suggesting that they encode useful and transferable features of language . | ||
| W19-3823 *****Contextual word***** embeddings such as BERT have achieved state of the art performance in numerous NLP tasks . | ||
| Bidirectional Encoder Representations from Transformers ( BERT | 5 | |
| 2021.nllp-1.22 *****Bidirectional Encoder Representations from Transformers ( BERT***** ) has achieved state - of - the - art performances on several text classification tasks , such as GLUE and sentiment analysis . | ||
| 2020.findings-emnlp.58 *****Bidirectional Encoder Representations from Transformers ( BERT***** ) has shown marvelous improvements across various NLP tasks , and consecutive variants have been proposed to further improve the performance of the pre - trained language models . | ||
| 2021.wanlp-1.19 *****Bidirectional Encoder Representations from Transformers ( BERT***** ) has gained popularity in recent years producing state - of - the - art performances across Natural Language Processing tasks . | ||
| D19-1387 *****Bidirectional Encoder Representations from Transformers ( BERT***** ) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks . | ||
| D19-6221 Since the introduction of context - aware token representation techniques such as Embeddings from Language Models ( ELMo ) and *****Bidirectional Encoder Representations from Transformers ( BERT***** ) , there has been numerous reports on improved performance on a variety of natural language tasks . | ||
| script | 5 | |
| W17-0901 We present a semi - supervised clustering approach to induce *****script***** structure from crowdsourced descriptions of event sequences by grouping event descriptions into paraphrase sets ( representing event types ) and inducing their temporal order . | ||
| W17-0906 The LSDSem'17 shared task is the Story Cloze Test , a new evaluation for story understanding and *****script***** learning . | ||
| S19-1012 We introduce MCScript2.0 , a machine comprehension corpus for the end - to - end evaluation of *****script***** knowledge . | ||
| 2020.coling-main.29 Given an incomplete event chain , *****script***** learning aims to predict the missing event , which can support a series of NLP applications . | ||
| P18-2119 The Story Cloze Test ( SCT ) is a recent framework for evaluating story comprehension and *****script***** learning . | ||
| Reading | 5 | |
| C16-1167 *****Reading***** comprehension has embraced a booming in recent NLP research . | ||
| N19-1270 *****Reading***** strategies have been shown to improve comprehension levels , especially for readers lacking adequate prior knowledge . | ||
| 2021.iwpt-1.3 The *****Reading***** Machine is a parsing framework that takes raw text as input and performs six standard NLP tasks : tokenization , POS tagging , morphological analysis , lemmatization , dependency parsing and sentence segmentation . | ||
| D19-5820 *****Reading***** comprehension is one of the crucial tasks for furthering research in natural language understanding . | ||
| N19-1246 *****Reading***** comprehension has recently seen rapid progress , with systems matching humans on the most popular datasets for the task . | ||
| speech translation ( ST | 5 | |
| 2021.eacl-main.57 Previous studies demonstrated that a dynamic phone - informed compression of the input audio is beneficial for *****speech translation ( ST***** ) . | ||
| 2021.acl-long.224 Five years after the first published proofs of concept , direct approaches to *****speech translation ( ST***** ) are now competing with traditional cascade solutions . | ||
| 2021.iwslt-1.24 Recent studies argue that knowledge distillation is promising for *****speech translation ( ST***** ) using end - to - end models . | ||
| 2021.eacl-main.216 Using end - to - end models for *****speech translation ( ST***** ) has increasingly been the focus of the ST community . | ||
| 2020.acl-main.217 End - to - end models for *****speech translation ( ST***** ) more tightly couple speech recognition ( ASR ) and machine translation ( MT ) than a traditional cascade of separate ASR and MT models , with simpler model architectures and the potential for reduced error propagation . | ||
| Chinese named entity recognition ( NER | 5 | |
| 2021.emnlp-demo.32 In this paper , we introduce CroAno , a web - based crowd annotation platform for the *****Chinese named entity recognition ( NER***** ) . | ||
| 2020.acl-main.611 Recently , the character - word lattice structure has been proved to be effective for *****Chinese named entity recognition ( NER***** ) by incorporating the word information . | ||
| 2020.acl-main.528 Recently , many works have tried to augment the performance of *****Chinese named entity recognition ( NER***** ) using word lexicons . | ||
| D19-1096 Recurrent neural networks ( RNN ) used for *****Chinese named entity recognition ( NER***** ) that sequentially track character and word information have achieved great success . | ||
| C18-1183 A bottleneck problem with *****Chinese named entity recognition ( NER***** ) in new domains is the lack of annotated data . | ||
| multilingual parallel | 5 | |
| 2020.lrec-1.502 We introduce GeBioToolkit , a tool for extracting *****multilingual parallel***** corpora at sentence level , with document and gender information from Wikipedia biographies . | ||
| L12-1103 This paper describes methods and results for the annotation of two discourse - level phenomena , connectives and pronouns , over a *****multilingual parallel***** corpus . | ||
| L12-1371 MultiUN is a *****multilingual parallel***** corpus extracted from the official documents of the United Nations . | ||
| 2020.mwe-1.6 In this work , we present the construction of *****multilingual parallel***** corpora with annotation of multiword expressions ( MWEs ) . | ||
| L12-1058 We present the architecture and the current state of InterCorp , a *****multilingual parallel***** corpus centered around Czech , intended primarily for human users and consisting of written texts with a focus on fiction . | ||
| Question Generation ( QG | 5 | |
| 2021.emnlp-main.340 *****Question Generation ( QG***** ) is the task of generating a plausible question for a given passage , answer pair . | ||
| 2010.jeptalnrecital-court.36 *****Question Generation ( QG***** ) and Question Answering ( QA ) are some of the many challenges for natural language understanding and interfaces . | ||
| 2020.acl-main.69 *****Question Generation ( QG***** ) is fundamentally a simple syntactic transformation ; however , many aspects of semantics influence what questions are good to form . | ||
| D19-5822 *****Question Generation ( QG***** ) is a Natural Language Processing ( NLP ) task that aids advances in Question Answering ( QA ) and conversational assistants . | ||
| W19-8613 *****Question Generation ( QG***** ) is the task of generating questions from a given passage . | ||
| Capturing Discriminative | 5 | |
| S18-1165 This paper describes the system we submitted to Task 10 ( *****Capturing Discriminative***** Attributes ) in SemEval 2018 . | ||
| S18-1117 This paper describes the SemEval 2018 Task 10 on *****Capturing Discriminative***** Attributes . | ||
| S18-1155 This paper describes our system submission to the SemEval 2018 Task 10 on *****Capturing Discriminative***** Attributes . | ||
| S18-1118 We present SUNNYNLP , our system for solving SemEval 2018 Task 10 : *****Capturing Discriminative***** Attributes . | ||
| S18-1162 Luminoso participated in the SemEval 2018 task on *****Capturing Discriminative***** Attributes with a system based on ConceptNet , an open knowledge graph focused on general knowledge . | ||
| Arabic Fine - Grained Dialect | 5 | |
| W19-4622 In this paper , we present the results and findings of the MADAR Shared Task on *****Arabic Fine - Grained Dialect***** Identification . | ||
| W19-4628 This paper presents the results of the experiments done as a part of MADAR Shared Task in WANLP 2019 on *****Arabic Fine - Grained Dialect***** Identification . | ||
| W19-4638 In this paper , we describe our team 's effort on the MADAR Shared Task on *****Arabic Fine - Grained Dialect***** Identification . | ||
| W19-4623 In this paper , we present our systems for the MADAR Shared Task : *****Arabic Fine - Grained Dialect***** Identification . | ||
| W19-4631 In this paper , we present two approaches for *****Arabic Fine - Grained Dialect***** Identification . | ||
| Large language | 5 | |
| 2021.emnlp-main.98 *****Large language***** models have led to remarkable progress on many NLP tasks , and researchers are turning to ever - larger text corpora to train them . | ||
| 2020.emnlp-main.496 *****Large language***** models have recently achieved state of the art performance across a wide variety of natural language tasks . | ||
| 2021.emnlp-main.564 *****Large language***** models have shown promising results in zero - shot settings . | ||
| 2021.emnlp-main.602 *****Large language***** models have become increasingly difficult to train because of the growing computation time and cost . | ||
| 2021.blackboxnlp-1.10 *****Large language***** models are known to suffer from the hallucination problem in that they are prone to output statements that are false or inconsistent , indicating a lack of knowledge . | ||
| Search | 5 | |
| D17-1121 *****Search***** systems are often focused on providing relevant results for the now , assuming both corpora and user needs that focus on the present . | ||
| W18-3721 *****Search***** engines are an important tool of modern academic study , but their results lack a measure of beginner friendliness . | ||
| L06-1104 *****Search***** engines on the web and most existing question - answering systems provide the user with a set of hyperlinks and/or web page extracts containing answer(s ) to a question . | ||
| D17-1061 *****Search***** engines play an important role in our everyday lives by assisting us in finding the information we need . | ||
| D19-1612 *****Search***** applications often display shortened sentences which must contain certain query terms and must fit within the space constraints of a user interface . | ||
| deep neural networks ( DNNs | 5 | |
| 2020.deelio-1.3 Studies have shown that *****deep neural networks ( DNNs***** ) are vulnerable to adversarial examples : perturbed inputs that cause DNN - based models to produce incorrect results . | ||
| 2021.emnlp-main.659 Backdoor attacks , which maliciously control a well - trained model 's outputs of the instances with specific triggers , are recently shown to be serious threats to the safety of reusing *****deep neural networks ( DNNs***** ) . | ||
| P18-1099 The success of *****deep neural networks ( DNNs***** ) is heavily dependent on the availability of labeled data . | ||
| 2021.emnlp-main.752 Backdoor attacks are a kind of emergent training - time threat to *****deep neural networks ( DNNs***** ) . | ||
| P18-1032 The behavior of *****deep neural networks ( DNNs***** ) is hard to understand . | ||
| string | 5 | |
| D18-1135 Recently , *****string***** kernels have obtained state - of - the - art results in various text classification tasks such as Arabic dialect identification or native language identification . | ||
| W17-1411 We focus on the task of supervised sentiment classification of short and informal texts in Croatian , using two simple yet effective methods : word embeddings and *****string***** kernels . | ||
| R19-1106 This paper compares how different machine learning classifiers can be used together with simple *****string***** matching and named entity recognition to detect locations in texts . | ||
| C18-1115 We show that the general problem of *****string***** transduction can be reduced to the problem of sequence labeling . | ||
| P18-2080 In this work , we present an approach based on combining *****string***** kernels and word embeddings for automatic essay scoring . | ||
| Machine Reading Comprehension ( MRC ) | 5 | |
| D19-1169 Though the community has made great progress on *****Machine Reading Comprehension ( MRC )***** task , most of the previous works are solving English - based MRC problems , and there are few efforts on other languages mainly due to the lack of large - scale training data . In this paper , we propose Cross - Lingual Machine Reading Comprehension ( CLMRC ) task for the languages other than English . | ||
| 2021.eacl-main.311 An in - depth analysis of the level of language understanding required by existing *****Machine Reading Comprehension ( MRC )***** benchmarks can provide insight into the reading capabilities of machines . | ||
| P19-1219 To bridge the gap between *****Machine Reading Comprehension ( MRC )***** models and human beings , which is mainly reflected in the hunger for data and the robustness to noise , in this paper , we explore how to integrate the neural networks of MRC models with the general knowledge of human beings . | ||
| 2021.acl-short.43 Pre - trained language models have achieved human - level performance on many *****Machine Reading Comprehension ( MRC )***** tasks , but it remains unclear whether these models truly understand language or answer questions by exploiting statistical biases in datasets . | ||
| 2021.naacl-main.100 State - of - the - art *****Machine Reading Comprehension ( MRC )***** models for Open - domain Question Answering ( QA ) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples . | ||
| Icelandic | 5 | |
| 2021.nodalida-main.50 We present Talrómur , a large high - quality Text - To - Speech ( TTS ) corpus for the *****Icelandic***** language . | ||
| W19-6116 The topic of this paper is The Database of Icelandic Morphology ( DIM ) , a multipurpose linguistic resource , created for use in language technology , as a reference for the general public in Iceland , and for use in research on the *****Icelandic***** language . | ||
| L14-1659 The new POS - tagged Icelandic corpus of the Leipzig Corpora Collection is an extensive resource for the analysis of the *****Icelandic***** language . | ||
| W19-6142 We report on work in progress which consists of annotating an *****Icelandic***** corpus for named entities ( NEs ) and using it for training a named entity recognizer based on a Bidirectional Long Short - Term Memory model . | ||
| L14-1533 In this paper , we describe the correction of PoS tags in a new *****Icelandic***** corpus , MIM - GOLD , consisting of about 1 million tokens sampled from the Tagged Icelandic Corpus , MÍM , released in 2013 . | ||
| information retrieval ( IR ) | 5 | |
| D18-1478 Pseudo relevance feedback ( PRF ) is commonly used to boost the performance of traditional *****information retrieval ( IR )***** models by using top - ranked documents to identify and weight new query terms , thereby reducing the effect of query - document vocabulary mismatches . | ||
| 2021.naacl-main.292 The embedding - based large - scale query - document retrieval problem is a hot topic in the *****information retrieval ( IR )***** field . | ||
| D19-5816 Multi - hop question answering ( QA ) requires an *****information retrieval ( IR )***** system that can find multiple supporting evidence needed to answer the question , making the retrieval process very challenging . | ||
| D19-3034 We introduce Seagle , a platform for comparative evaluation of semantic text encoding models on *****information retrieval ( IR )***** tasks . | ||
| R19-1049 We present a novel approach to automatic question answering that does not depend on the performance of an *****information retrieval ( IR )***** system and does not require that the training data come from the same source as the questions . | ||
| event detection ( ED | 5 | |
| N19-1105 Modern weakly supervised methods for *****event detection ( ED***** ) avoid time - consuming human annotation and achieve promising results by learning from auto - labeled data . | ||
| 2020.emnlp-main.435 Recent studies on *****event detection ( ED***** ) have shown that the syntactic dependency graph can be employed in graph convolution neural networks ( GCN ) to achieve state - of - the - art performance . | ||
| P17-1164 This paper tackles the task of *****event detection ( ED***** ) , which involves identifying and categorizing events . | ||
| D19-1068 The scarcity in annotated data poses a great challenge for *****event detection ( ED***** ) . | ||
| N19-1080 The goal of *****event detection ( ED***** ) is to detect the occurrences of events and categorize them . | ||
| Large - scale pretrained language | 5 | |
| 2020.acl-main.679 *****Large - scale pretrained language***** models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge , a widely employed test of commonsense reasoning ability . | ||
| 2020.findings-emnlp.81 *****Large - scale pretrained language***** models have achieved outstanding performance on natural language understanding tasks . | ||
| 2020.findings-emnlp.387 *****Large - scale pretrained language***** models have become ubiquitous in Natural Language Processing . | ||
| P19-1608 *****Large - scale pretrained language***** models define state of the art in natural language processing , achieving outstanding performance on a variety of tasks . | ||
| 2021.acl-short.2 *****Large - scale pretrained language***** models have led to dramatic improvements in text generation . | ||
| Few - shot | 5 | |
| 2020.coling-main.140 *****Few - shot***** classification requires classifiers to adapt to new classes with only a few training instances . | ||
| 2021.naacl-main.59 *****Few - shot***** learning arises in important practical scenarios , such as when a natural language understanding system needs to learn new semantic labels for an emerging , resource - scarce domain . | ||
| 2020.coling-main.92 *****Few - shot***** learning addresses the problem of learning based on a small amount of training data . | ||
| 2021.naacl-main.158 *****Few - shot***** learning has drawn researchers ' attention to overcome the problem of data scarcity . | ||
| 2021.alvr-1.6 *****Few - shot***** learning is to recognize novel classes with a few labeled samples per class . | ||
| code - switching | 5 | |
| 2020.emnlp-main.396 Span identification ( in short , span ID ) tasks such as chunking , NER , or *****code - switching***** detection , ask models to identify and classify relevant spans in a text . | ||
| W17-0804 We present a *****code - switching***** corpus of Turkish - German that is collected by recording conversations of bilinguals . | ||
| C16-1153 Emotions in *****code - switching***** text can be expressed in either monolingual or bilingual forms . | ||
| 2021.calcs-1.12 This paper addresses the problem of sentiment analysis for Jopara , a *****code - switching***** language between Guarani and Spanish . | ||
| 2020.acl-main.96 In this paper , we demonstrate how *****code - switching***** patterns can be utilised to improve various downstream NLP applications . | ||
| Translation Memory ( TM | 5 | |
| L04-1257 An innovative way of integrating *****Translation Memory ( TM***** ) and Machine Translation ( MT ) processing is presented which goes beyond the traditional cascade integration of Translation Memory and Machine Translation . | ||
| 2011.mtsummit-tutorials.2 In this tutorial , we cover techniques that facilitate the integration of Machine Translation ( MT ) and *****Translation Memory ( TM***** ) , which can help the adoption of MT technology in localisation industry . | ||
| 2014.amta-researchers.19 Combining *****Translation Memory ( TM***** ) with Statistical Machine Translation ( SMT ) together has been demonstrated to be beneficial . | ||
| P19-1175 We present a simple yet powerful data augmentation method for boosting Neural Machine Translation ( NMT ) performance by leveraging information retrieved from a *****Translation Memory ( TM***** ) . | ||
| 2021.acl-long.567 Prior work has proved that *****Translation Memory ( TM***** ) can boost the performance of Neural Machine Translation ( NMT ) . | ||
| expressing | 5 | |
| 2020.lrec-1.400 In this paper , we introduce a new set of lexicons for *****expressing***** subjectivity in text documents written in Brazilian Portuguese . | ||
| W16-4306 Automatic detection of five language components , which are all relevant for *****expressing***** opinions and for stance taking , was studied : positive sentiment , negative sentiment , speculation , contrast and condition . | ||
| D18-1471 Vulgar words are employed in language use for several different functions , ranging from *****expressing***** aggression to signaling group identity or the informality of the communication . | ||
| 2021.wanlp-1.52 Sarcasm is one of the main challenges for sentiment analysis systems due to using implicit indirect phrasing for *****expressing***** opinions , especially in Arabic . | ||
| 2020.sltu-1.22 The aim of this paper is to investigate the role of Luxembourgish adjectives in *****expressing***** sentiments in user comments written at the web presence of rtl.lu ( RTL is the abbreviation for Radio Télévision Lëtzebuerg ) . | ||
| Offensive language | 5 | |
| 2020.semeval-1.287 *****Offensive language***** identification is to detect the hurtful tweets , derogatory comments , swear words on social media . | ||
| 2020.semeval-1.296 *****Offensive language***** detection is one of the most challenging problems in the natural language processing field , driven by the rising presence of this phenomenon in online social media . | ||
| 2021.dravidianlangtech-1.46 *****Offensive language***** identification has been an active area of research in natural language processing . | ||
| 2021.dravidianlangtech-1.22 JudithJeyafreedaAndrew@DravidianLangTech - EACL2021 : *****Offensive language***** detection for Dravidian Code - mixed YouTube comments ( Judith Jeyafreeda Andrew ) . Messaging online has become one of the major ways of communication . | ||
| 2020.semeval-1.203 *****Offensive language***** detection is an important and challenging task in natural language processing . | ||
| bilingual lexicon induction ( BLI | 5 | |
| P19-1018 Recent work on *****bilingual lexicon induction ( BLI***** ) has frequently depended either on aligned bilingual lexicons or on distribution matching , often with an assumption about the isometry of the two spaces . | ||
| 2021.mrl-1.4 Cross - lingual word embeddings ( CLWEs ) have proven indispensable for various natural language processing tasks , e.g. , *****bilingual lexicon induction ( BLI***** ) . | ||
| 2020.emnlp-main.186 Performance in cross - lingual NLP tasks is impacted by the ( dis)similarity of languages at hand : e.g. , previous work has suggested there is a connection between the expected success of *****bilingual lexicon induction ( BLI***** ) and the assumption of ( approximate ) isomorphism between monolingual embedding spaces . | ||
| 2020.acl-main.201 Cross - lingual word embeddings ( CLWE ) are often evaluated on *****bilingual lexicon induction ( BLI***** ) . | ||
| E17-1102 We study the problem of *****bilingual lexicon induction ( BLI***** ) in a setting where some translation resources are available , but unknown translations are sought for certain , possibly domain - specific terminology . | ||
| Optical Character Recognition ( OCR | 5 | |
| 2020.lrec-1.436 Recent advances in *****Optical Character Recognition ( OCR***** ) and Handwritten Text Recognition ( HTR ) have led to more accurate text recognition of historical documents . | ||
| 2020.latechclfl-1.6 The quality of *****Optical Character Recognition ( OCR***** ) is a key factor in the digitisation of historical documents . | ||
| W19-3633 We present initial experiments to evaluate the performance of tasks such as Part of Speech Tagging on data corrupted by *****Optical Character Recognition ( OCR***** ) . | ||
| 2021.nllp-1.18 Older legal texts are often scanned and digitized via *****Optical Character Recognition ( OCR***** ) , which results in numerous errors . | ||
| 2020.eamt-1.62 The OCCAM project ( Optical Character recognition , ClassificAtion & Machine Translation ) aims at integrating the CEF ( Connecting Europe Facility ) Automated Translation service with image classification , Translation Memories ( TMs ) , *****Optical Character Recognition ( OCR***** ) , and Machine Translation ( MT ) . | ||
| Common | 5 | |
| 2021.humeval-1.1 *****Common***** sense is an integral part of human cognition which allows us to make sound decisions , communicate effectively with others and interpret situations and utterances . | ||
| L14-1162 This article introduces a new speech corpus , the Nijmegen Corpus of Casual Czech ( NCCCz ) , which contains more than 30 hours of high - quality recordings of casual conversations in *****Common***** Czech , among ten groups of three male and ten groups of three female friends . | ||
| 2020.semeval-1.75 *****Common***** sense for natural language processing methods has been attracting a wide research interest , recently . | ||
| 2020.acl-main.156 We use the multilingual OSCAR corpus , extracted from *****Common***** Crawl via language classification , filtering and cleaning , to train monolingual contextualized word embeddings ( ELMo ) for five mid - resource languages . | ||
| 2020.lrec-1.425 This contribution describes an ongoing project of speech data collection , using the web application Samrómur which is built upon *****Common***** Voice , Mozilla Foundation 's web platform for open - source voice collection . | ||
| Knowledge graphs ( KGs | 5 | |
| D19-5302 *****Knowledge graphs ( KGs***** ) are generally used for various NLP tasks . | ||
| 2020.emnlp-main.577 *****Knowledge graphs ( KGs***** ) can vary greatly from one domain to another . | ||
| 2020.textgraphs-1.1 *****Knowledge graphs ( KGs***** ) of real - world facts about entities and their relationships are useful resources for a variety of natural language processing tasks . | ||
| 2021.wanlp-1.24 *****Knowledge graphs ( KGs***** ) are widely used to store and access information about entities and their relationships . | ||
| D19-1266 *****Knowledge graphs ( KGs***** ) often suffer from sparseness and incompleteness . | ||
| machine - translated | 5 | |
| 2021.emnlp-main.267 Quality estimation ( QE ) of machine translation ( MT ) aims to evaluate the quality of *****machine - translated***** sentences without references and is important in practical applications of MT . | ||
| 2020.acl-main.155 Current advances in machine translation ( MT ) increase the need for translators to switch from traditional translation to post - editing ( PE ) of *****machine - translated***** text , a process that saves time and reduces errors . | ||
| 2020.acl-main.532 In this paper , we show that neural machine translation ( NMT ) systems trained on large back - translated data overfit some of the characteristics of *****machine - translated***** texts . | ||
| 2001.mtsummit-eval.1 Following the guidelines for MT evaluation proposed in the ISLE taxonomy , this paper presents considerations and procedures for evaluating the integration of *****machine - translated***** segments into a larger translation workflow with Translation Memory ( TM ) systems . | ||
| 2019.iwslt-1.23 These metrics are followed by DA , and then by metrics comparing the *****machine - translated***** version and independent references . | ||
| fine - grained entity | 5 | |
| P17-1128 Finding the correct hypernyms for entities is essential for taxonomy learning , *****fine - grained entity***** categorization , query understanding , etc . | ||
| 2021.acl-long.160 Neural entity typing models typically represent *****fine - grained entity***** types as vectors in a high - dimensional space , but such spaces are not well - suited to modeling these types ' complex interdependencies . | ||
| 2021.acl-long.141 Recently , there is an effort to extend *****fine - grained entity***** typing by using a richer and ultra - fine set of types , and labeling noun phrases including pronouns and nominal nouns instead of just named entity mentions . | ||
| E17-2119 We address *****fine - grained entity***** classification and propose a novel attention - based recurrent neural network ( RNN ) encoder - decoder that generates paths in the type hierarchy and can be trained end - to - end . | ||
| 2020.findings-emnlp.42 Label inventories for *****fine - grained entity***** typing have grown in size and complexity . | ||
| Norwegian | 5 | |
| 2009.freeopmt-1.7 We describe the development of a two - way shallow - transfer machine translation system between *****Norwegian***** Nynorsk and Norwegian Bokmål built on the Apertium platform , using the Free and Open Source resources Norsk Ordbank and the Oslo - Bergen Constraint Grammar tagger . | ||
| L14-1273 The Norwegian Dependency Treebank is a new syntactic treebank for *****Norwegian***** Bokmål and Nynorsk with manual syntactic and morphological annotation , developed at the National Library of Norway in collaboration with the University of Oslo . | ||
| L14-1033 In the data analysis , the acoustic - phonetic properties of words spoken with two different levels of accentuation ( de - accented and nuclear accented in non - contrastive narrow - focus ) are examined in question - answer elicited sentences and iterative imitations ( on the syllable da ) produced by Bulgarian , Russian , French , German and *****Norwegian***** speakers ( 3 male and 3 female per language ) . | ||
| L10-1530 This paper presents the Kachna corpus of spontaneous speech , in which ten Czech and ten *****Norwegian***** speakers were recorded both in their native language and in English . | ||
| L14-1622 This paper presents an open source part - of - speech tagger for the *****Norwegian***** language . | ||
| Multi - hop | 5 | |
| D18-1362 *****Multi - hop***** reasoning is an effective approach for query answering ( QA ) over incomplete knowledge graphs ( KGs ) . | ||
| D19-1455 *****Multi - hop***** QA requires a model to connect multiple pieces of evidence scattered in a long context to answer the question . | ||
| 2021.emnlp-main.700 *****Multi - hop***** reasoning has been widely studied in recent years to obtain more interpretable link prediction . | ||
| 2020.emnlp-main.459 *****Multi - hop***** reasoning has been widely studied in recent years to seek an effective and interpretable method for knowledge graph ( KG ) completion . | ||
| 2021.naacl-main.363 *****Multi - hop***** reasoning requires aggregation and inference from multiple facts . | ||
| Identifying and Categorizing Offensive | 5 | |
| S19-2066 This article describes Amobee 's participation in HatEval : Multilingual detection of hate speech against immigrants and women in Twitter ( task 5 ) and OffensEval : *****Identifying and Categorizing Offensive***** Language in Social Media ( task 6 ) . | ||
| S19-2122 In this paper we present our approach and the system description for Sub Task A and Sub Task B of SemEval 2019 Task 6 : *****Identifying and Categorizing Offensive***** Language in Social Media . | ||
| S19-2124 The paper describes the systems submitted to OffensEval ( SemEval 2019 , Task 6 ) on ` *****Identifying and Categorizing Offensive***** Language in Social Media ' by the ` NIT_Agartala_NLP_Team ' . | ||
| S19-2109 This paper describes our system ( Fermi ) for Task 6 : OffensEval : *****Identifying and Categorizing Offensive***** Language in Social Media of SemEval-2019 . | ||
| S19-2112 We present our results for OffensEval : *****Identifying and Categorizing Offensive***** Language in Social Media ( SemEval 2019 - Task 6 ) . | ||
| broadcast | 5 | |
| 2010.amta-government.11 Current state - of - the - art in speech recognition , machine translation , and natural language processing ( NLP ) technologies has allowed the development of powerful media monitoring systems that provide today 's analysts with automatic tools for ingesting and searching through different types of data , such as *****broadcast***** video , web pages , documents , and scanned images . | ||
| L10-1244 Typical *****broadcast***** material contains not only studio - recorded texts read by trained speakers , but also spontaneous and dialect speech , debates with cross - talk , voice - overs , and on - site reports with difficult acoustic environments . | ||
| W18-5402 This paper addresses a relatively new task : prediction of ASR performance on unseen *****broadcast***** programs . | ||
| 2021.acl-long.177 We introduce the largest transcribed Arabic speech corpus , QASR , collected from the *****broadcast***** domain . | ||
| W17-1307 Dialectal Arabic ( DA ) is significantly different from the Arabic language taught in schools and used in written communication and formal speech ( *****broadcast***** news , religion , politics , etc . ) . | ||
| Knowledge graph | 5 | |
| 2020.coling-main.44 *****Knowledge graph***** embedding maps entities and relations into low - dimensional vector space . | ||
| 2021.emnlp-main.769 *****Knowledge graph***** inference has been studied extensively due to its wide applications . | ||
| 2020.emnlp-main.541 *****Knowledge graph***** reasoning is a critical task in natural language processing . | ||
| 2020.coling-main.134 *****Knowledge graph***** embedding is an important task and it will benefit lots of downstream applications . | ||
| 2021.emnlp-main.625 *****Knowledge graph***** embedding , representing entities and relations in the knowledge graphs with high - dimensional vectors , has made significant progress in link prediction . |